st: Count models and fractional variables

From   Fabiana Visentin
Subject   st: Count models and fractional variables
Date   Mon, 12 Mar 2012 18:52:56 +0100

On behalf of a friend:


I have estimated a zero-inflated negative binomial model. To be clear, I
am studying the number of patents of an inventor during the period spent
in a company. Of course, on the right-hand side of the equation I
control for the number of years the inventor spent in the company. I
also include the square of the period, to see whether the number of
patents increases or decreases for inventors who stay in the company for
a longer period. Then I have other variables of interest. Using the total
number of patents and controlling for the period, everything works.
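
To fix notation for the specification just described, the conditional mean of the count part could be written as below. This is only a sketch: the symbols y_i (total patents), t_i (years in the company), x_i (the other variables of interest) and the coefficients are labels assumed here for illustration, and the zero-inflation equation is left out.

```latex
\[
E[\,y_i \mid t_i, x_i\,] = \exp\!\left(\beta_0 + \beta_1 t_i + \beta_2 t_i^{2} + x_i'\gamma\right)
\]
```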

Now, this should be equivalent, in principle, to studying the average
number of patents per year (instead of the total number). In this case
there would be no need to control for the period (and on the right-hand
side I only have the period and not its square).
The main issue is that in this way the dependent variable is no longer a
count variable but takes fractional values.
I therefore encountered two main problems:
-	I used a tobit model (left-censored at 0). Results are similar, but I
lose quite a bit of significance.
-	Initially I accidentally ran the zero-inflated negative binomial again
on the averaged variable. What is puzzling is that the model still
works, with good and coherent results. I thought Stata automatically
rounded the dependent variable, with either round() or int(), but I
tried changing the dependent variable in this way myself and the results
are not the same. Out of curiosity I also multiplied the
dependent variable by 10000, so that it no longer has decimals
but keeps this ‘information’. This leads to a fourth,
different set of results. Stata therefore seems to account for the
decimals in the dependent variable, but not fully (?).
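
One possible way to see why the four versions of the dependent variable can give four different sets of results is sketched below in Python. It assumes, without confirmation from the Stata documentation, that the negative binomial log likelihood is evaluated through log-gamma functions, which are defined for non-integer values. The function name `nb2_logpmf` and all numbers are hypothetical, and the zero-inflation part is ignored.

```python
import math


def nb2_logpmf(y, mu, alpha):
    """Log of the NB2 probability formula with mean mu and dispersion alpha.

    Written with log-gamma functions, so it can be evaluated at any
    y >= 0, although for non-integer y it is no longer a true probability.
    """
    r = 1.0 / alpha  # inverse dispersion
    return (math.lgamma(y + r) - math.lgamma(y + 1.0) - math.lgamma(r)
            + r * math.log(r / (r + mu))
            + y * math.log(mu / (r + mu)))


# Hypothetical values, for illustration only.
y_frac = 1.6           # e.g. 8 patents over 5 years
mu, alpha = 1.5, 0.5   # assumed fitted mean and dispersion

ll_frac = nb2_logpmf(y_frac, mu, alpha)           # fractional outcome as is
ll_round = nb2_logpmf(round(y_frac), mu, alpha)   # round() gives 2
ll_trunc = nb2_logpmf(int(y_frac), mu, alpha)     # int() gives 1
# Outcome multiplied by 10000; the mean is rescaled the same way (assumed).
ll_scaled = nb2_logpmf(y_frac * 10000, mu * 10000, alpha)

print(ll_frac, ll_round, ll_trunc, ll_scaled)
```

Under these assumptions each version of the outcome contributes a different value to the log likelihood, so the estimates need not coincide. This would be consistent with the decimals being used rather than rounded away.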

Do you have any explanation for this?
More generally, do you have any suggestions on how to address my problem
(i.e., the right model for studying the average number of patents)?

Thank you for your attention.

Best regards,

Stefano H. Baruffaldi

