
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


Re: st: simulating random numbers from zero inflated negative binomial estimates

From   "E. Paul Wileyto"
Subject   Re: st: simulating random numbers from zero inflated negative binomial estimates
Date   Fri, 03 Jun 2011 10:27:57 -0400

I've never used predict with the ir option, but I assume it predicts a mean incidence rate GIVEN that class membership is not an inflated zero. I suspect that it does not include the natural variability of the outcome, let alone zero-inflation. What our simulation does is take the predicted linear model, add the natural variability of the negative binomial, and then add zero-inflation on top of it, all to reflect the variation you would see in real data.

To gauge whether the estimator is working well, you should simulate the data multiple times and compute the means of the point estimates and the coverage probabilities. What we did was take our original model, assume the estimated parameters are true, and use them to simulate only one new data set. Repeat that last step 200 times, and see how often your CI includes the true value.

Looking at the script again, this first part grabs the estimates from fitting your data:

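zinb cignums drug week, inf(drug week)
predict p1 , pr                 // probability of an inflated (degenerate) zero
predict p2 , xb                 // linear predictor of the count equation (log scale)
predict lnalpha , xb eq(#3)     // log of the overdispersion parameter
gen alph=exp(lnalpha)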

This next part simulates the data and should be repeated many times.

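gen xg=rgamma(1/alph, alph*exp(p2))   // gamma draw with mean exp(p2); p2 is on the log scale
gen pg=rpoisson(xg)                   // Poisson given the gamma draw, i.e. negative binomial
gen zi=runiform()>p1                  // zi=0 marks an inflated zero (probability p1)
gen newcigs=zi*pg

zinb newcigs drug week, inf(drug week)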
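
As a check on why the gamma and Poisson steps give negative binomial variation, the moments can be worked out as below. This is only a sketch in my own notation, not part of the original script: mu_i stands for the mean on the count scale (exp of the linear predictor), alpha for alph, and pi_i for p1.

```latex
% Gamma mixing rate, using the shape/scale parameterization of rgamma(a, b)
\lambda_i \sim \mathrm{Gamma}\!\left(\text{shape}=\tfrac{1}{\alpha},\ \text{scale}=\alpha\mu_i\right)
\;\Rightarrow\;
E[\lambda_i] = \mu_i, \qquad \mathrm{Var}[\lambda_i] = \alpha\mu_i^{2}

% Poisson draw given the rate: marginally negative binomial
N_i \mid \lambda_i \sim \mathrm{Poisson}(\lambda_i)
\;\Rightarrow\;
E[N_i] = \mu_i, \qquad \mathrm{Var}[N_i] = \mu_i + \alpha\mu_i^{2}

% Zero inflation: Y_i = Z_i N_i with Z_i \sim \mathrm{Bernoulli}(1-\pi_i), independent of N_i
E[Y_i] = (1-\pi_i)\,\mu_i, \qquad
\mathrm{Var}[Y_i] = (1-\pi_i)\left(\mu_i + \alpha\mu_i^{2}\right) + \pi_i(1-\pi_i)\,\mu_i^{2}
```

A predicted mean carries none of this variance, which is one reason a histogram of predicted means will look narrower than the observed counts.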

There are many ways to collect the parameter estimates and CIs from the simulations. I'll leave that to you.
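
For what it's worth, here is one possible way to do the collecting, using postfile. Treat it as a sketch under assumptions, not the only approach: it assumes p1, p2 and alph already exist from the first part of the script, it looks only at the drug coefficient in the count equation, and the names b_true, zinb_sims, b, lo, hi and covered are made up for illustration. Run it as one do-file, since it uses a temporary name.

```stata
* Sketch only: assumes p1, p2 and alph were created as in the first part.
* b_true, zinb_sims, b, lo, hi and covered are illustrative names.

quietly zinb cignums drug week, inf(drug week)
scalar b_true = _b[cignums:drug]          // estimate treated as the true value

set seed 20110603                         // arbitrary seed, for reproducibility
tempname sim
postfile `sim' b lo hi using zinb_sims, replace

forvalues r = 1/200 {
    capture drop xg pg zi newcigs
    gen xg = rgamma(1/alph, alph*exp(p2)) // exp(p2): mean on the count scale
    gen pg = rpoisson(xg)
    gen zi = runiform() > p1
    gen newcigs = zi*pg

    capture zinb newcigs drug week, inf(drug week)
    if _rc == 0 & e(converged) == 1 {     // keep only fits that ran and converged
        post `sim' (_b[newcigs:drug]) ///
            (_b[newcigs:drug] - invnormal(0.975)*_se[newcigs:drug]) ///
            (_b[newcigs:drug] + invnormal(0.975)*_se[newcigs:drug])
    }
}
postclose `sim'

use zinb_sims, clear                      // replaces the data in memory
gen covered = (lo <= scalar(b_true)) & (scalar(b_true) <= hi)
summarize b covered
display "true value used: " scalar(b_true)
```

The mean of covered is the estimated coverage of the nominal 95% interval, and the mean of b can be compared with b_true to look for bias. Save your original data first, because the last step loads the results file over it.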


On 6/3/2011 1:03 AM, Ari Samaranayaka wrote:
Dear Paul
Thank you very much for the great help. You are the first person to answer my question. Your answer works, and I understood the logic you used in your code. The simulated random variates agree quite closely with the observed data. I interpret this as a reasonable model fit. Great. Thank you.

I expected that whenever the ZINB model fit is reasonably good, if we use the ZINB postestimation predict command to produce predicted numbers, those predicted numbers should also agree closely with the observed data.
For example, if I use the command
predict expec, ir
then the resulting values in "expec" should have a distribution similar to that of the observed data (because we do not specify an "exposure" in our model). However, those 2 distributions are quite different. Did I misinterpret the result from the predict command?
Thank you again

E. Paul Wileyto, Ph.D.
Senior Research Investigator, Department of Biostatistics & Epidemiology
Director of Biostatistics, Tobacco Use Research Center
School of Medicine, U. of Pennsylvania
3535 Market Street, Suite 4100
Philadelphia, PA  19104-3309

Fax: 215-746-7140
