

From: Fabiana Visentin <fabiana.visentin@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: st: Count models and fractional variables
Date: Mon, 12 Mar 2012 18:52:56 +0100

On behalf of a friend:

Hi,

I have estimated a zero-inflated negative binomial model. To be clear, I am studying the number of patents of an inventor during the period spent in a company. Of course, on the right-hand side of the equation I control for the number of years the inventor spent in the company. I also include the square of the period, to see whether the number of patents increases or decreases for inventors who have been in the company for a longer period. Then I have other variables of interest. Using the total number of patents and controlling for the period, everything works fine.

Now, this should be equivalent, in principle, to studying the average number of patents per year (instead of the total number). In this case there would be no need to control for the period (on the right-hand side I include only the period, not its square). The main issue is that the dependent variable is then no longer a count variable: it takes fractional values. I therefore encountered two main problems:

- I used a Tobit model (left-censored at 0). Results are similar, but I lose quite a bit of significance.
- Initially I accidentally ran the zero-inflated negative binomial again on the averaged variable. What is puzzling is that the model still works, with good and coherent results. I thought Stata automatically rounded the dependent variable, with either round() or int(), but when I transformed the dependent variable this way myself the results were not the same. For the sake of curiosity I also multiplied the dependent variable by 10000, so as to no longer have decimal numbers but keep this ‘information’. This leads to a fourth, different set of results. Stata therefore seems to account for the decimals in the dependent variable, but not fully (?).

Do you have any explanation for this? More generally, do you have any suggestion on how to address my problem (i.e., the right model to study the average number of patents)?

Thank you for your attention.
Best regards,
Stefano H. Baruffaldi
stefano.baruffaldi@epfl.ch

--
Fabiana
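
To make the comparison in the message concrete, the following is a minimal sketch of the different versions of the dependent variable it describes: the total count, the per-year average, the average after rounding, the average after truncation, and the average multiplied by 10000. It is written in Python rather than Stata purely for illustration; the function name and the inventor data are hypothetical, and the values are chosen to avoid exact .5 ties so that rounding conventions do not matter. Python's `round()` and `math.trunc()` stand in for Stata's round() and int() here.

```python
import math


def dependent_variable_versions(patents, years):
    """Return the versions of the dependent variable discussed in the post.

    patents: total number of patents (a count)
    years:   number of years spent in the company (must be > 0)
    """
    average = patents / years
    return {
        "total": patents,                    # original count outcome
        "average": average,                  # fractional outcome
        "rounded": round(average),           # nearest integer
        "truncated": math.trunc(average),    # fractional part dropped
        "scaled": average * 10000,           # average multiplied by 10000
    }


# Hypothetical inventors: (total patents, years in the company).
inventors = [(4, 3), (7, 4), (0, 5), (5, 3)]

for patents, years in inventors:
    print(dependent_variable_versions(patents, years))
```

For an inventor with 7 patents in 4 years, for example, the average is 1.75, which rounds to 2 and truncates to 1, so the transformed outcomes are different variables. The sketch also shows that multiplying by 10000 does not always remove the decimals: 4 patents in 3 years gives roughly 13333.33.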
