[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

From |
"James Shaw" <shawjw@gmail.com> |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: Simulating multilevel data in Stata |

Date |
Tue, 23 Dec 2008 16:10:13 -0600 |

Stas and Alan: After my last e-mail, I modified Alan's initial code to produce the following, which seems to work. The resulting variable that represents the unit-specific effects is time-invariant. *** drop _all set obs 500 gen n = _n gen u=invnorm(uniform()) expand 3 sort n gen n2 = _n gen t= (n2 - (n-1)*3) matrix V = I(3) matrix V[1,2]=.5 matrix V[2,1]=.5 matrix V[1,3]=.0 matrix V[3,1]=.0 drawnorm X1 X2 e,cov(V) matrix V = I(3) matrix V[1,2]=.3 matrix V[2,1]=.3 matrix V[1,3]=.0 matrix V[3,1]=.0 matrix V[2,3]=.5 matrix V[3,2]=.5 matrix accum A = u X1 X2 ,noc dev matrix A=(1/1499)*A matrix H=cholesky(V) matrix G=cholesky(A) matrix GI=inv(G) matrix HGI=H*GI gen y3=HGI[1,1]*u gen y1=HGI[2,1]*u+HGI[2,2]*X1 gen y2=HGI[3,1]*u+HGI[3,2]*X1+HGI[3,3]*X2 -- Jim On Tue, Dec 23, 2008 at 3:28 PM, Stas Kolenikov <skolenik@gmail.com> wrote: > Here's what I would suggest. > > X1, X2 and u have a multivariate normal distribution, say (unless you > want to study the effects of non-normality, as well). Let us write the > covariance matrix as Sigma, and block it as Sigma_11, Sigma_12 and > Sigma_22 so that the upper block corresponds to the time varying X > variables, and the lower block, to the panel effect u: > > mata: > Sigma = (1, 0.5, 0.3 \ 0.5, 1, 0 \ 0.3, 0, 1) > Sigma11 = Sigma[|(1,1) \ (2,2)|] > Sigma21 = Sigma[|(3,1) \ (3,2)|] > Sigma22 = Sigma[3,3] > > // 1. Sample u from say standard normal distribution: > u = rnormal(1,1,0,1) > > // 2. Figure out the conditional distribution X1, X2 | u: the mean is > mean_u = invsym(Sigma11) * Sigma21' * u > Sigma11_u = Sigma11 - Sigma21' * Sigma21 / Sigma22 > chS = cholesky(Sigma11_u) > > // 3. Draw normal X1 and X2 > X = rnormal(1,2,0,1) > X = X*chS > > // I hope I did not mess up with all the transposes and such :)) > > // Now, this will generate a triplet of u, X1, X2. > // You can repeat step 3 a bunch of times to get T values, > // and that's doable simply by replacing the lines to > X = rnormal(T,2,0,1) > X = X*chS > > // You will also need to propagate the values of u: > U = u*J(3,1,1) > > // Finally you can add (standard normal) e on top of everything: > E = rnormal(3,1,0,1) > > // Now, a simulated panel will look like > X, U, E > > end > // of Mata > > You now will have to repeat the process N times, and post everything > back to Stata using st_view(). Most of this stuff is of course doable > in strict Stata, as well. The critical steps are matrix blocking and > Cholesky decomposition. You can just do those for your fixed > covariance matrix of interest, and hard code the numbers into some > -generate- and -replace- statements. > > On Tue, Dec 23, 2008 at 10:55 AM, James Shaw <shawjw@gmail.com> wrote: >> Dear Statalist members, >> >> I want to perform a simulation to show the inconsistency of the OLS >> and random effects estimators when one of the regressors is correlated >> with the unit-specific error component. The specifics of the >> simulation are as follows; >> >> Y[i,t] (the outcome to be modeled) = b0 + b1*X1[i,t] + b2*X2[i,t] + >> u[i] + e[i,t] >> >> i = 1,...,500 indexes subjects >> t = 1,...,3 indexes time (repeated observations on subjects) >> >> X1 and X2 are normally distributed random variables with arbitrary >> means and variances >> u is a normally distributed subject-specific error component with mean >> of 0 and arbitrary variance >> e is a normally distributed random error component with mean of 0 and >> arbitrary variance >> >> corr(X1,X2) = 0.5 >> corr(X1,u) = 0.3 >> corr(X2,u) = 0.0 >> corr(X1,e) = corr(X2,e) = corr(u,e) = 0.0 >> >> b0, b1, and b2 are parameters to be specified in the simulation >> >> >> I have been unable to identify a method that will ensure that >> corr(X1,u) equals the desired value. I tried the following method in >> which u was generated separately from X1 and X2 and cholesky >> decomposition was applied to generate transformations of the three >> random variables that would exhibit the desired correlations. >> However, this yielded a non-zero correlation between X2 and u. >> >> Method 1 >> *** >> drop _all >> set obs 500 >> gen n = _n >> gen u=invnorm(uniform()) >> expand 3 >> sort n >> gen n2 = _n >> gen t= (n2 - (n-1)*3) >> drawnorm x1 x2 e >> sort n >> mkmat x1 x2 u e, matrix(X) >> mat c =(1, .5, .3, 0 \ .5, 1, 0, 0 \ .3, 0, 1, 0 \ 0, 0, 0, 1) >> mat X2 = X*cholesky(c) >> *** >> >> A method that yielded somewhat better results involved generating X1, >> X2, u, and e with a pre-specified correlation matrix and then >> collapsing u so that it varied by subject only. This provided the >> correct values for corr(X1,X2) and corr(X2,u) but attenuated the >> correlation between X1 and u. I presume that I could simply specify a >> higher value for corr(X1,u) when generating the variables so that the >> desired value would be achieved after u is collapsed. However, this >> would not be the most elegant solution. >> >> Method 2 >> *** >> drop _all >> set obs 500 >> mat c =(1, .5, .3, 0 \ .5, 1, 0, 0 \ .3, 0, 1, 0 \ 0, 0, 0, 1) >> gen n = _n >> expand 3 >> sort n >> gen n2 = _n >> gen t= (n2 - (n-1)*3) >> drawnorm x1 x2 u e, corr(c) >> sort n >> by n: egen u2 = mean(u) >> *** >> >> Any suggestions or references would be appreciated. >> >> Regards, >> >> Jim >> >> >> James W. Shaw, Ph.D., Pharm.D., M.P.H. >> Assistant Professor >> Department of Pharmacy Administration >> College of Pharmacy >> University of Illinois at Chicago >> 833 South Wood Street, M/C 871, Room 252 >> Chicago, IL 60612 >> Tel.: 312-355-5666 >> Fax: 312-996-0868 >> Mobile Tel.: 215-852-3045 >> * >> * For searches and help try: >> * http://www.stata.com/help.cgi?search >> * http://www.stata.com/support/statalist/faq >> * http://www.ats.ucla.edu/stat/stata/ >> > > > > -- > Stas Kolenikov, also found at http://stas.kolenikov.name > Small print: I use this email account for mailing lists only. > * > * For searches and help try: > * http://www.stata.com/help.cgi?search > * http://www.stata.com/support/statalist/faq > * http://www.ats.ucla.edu/stat/stata/ > -- James W. Shaw, Ph.D., Pharm.D., M.P.H. Assistant Professor Department of Pharmacy Administration College of Pharmacy University of Illinois at Chicago 833 South Wood Street, M/C 871, Room 252 Chicago, IL 60612 Tel.: 312-355-5666 Fax: 312-996-0868 Mobile Tel.: 215-852-3045 * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**References**:**st: Simulating multilevel data in Stata***From:*"James Shaw" <shawjw@gmail.com>

**Re: st: Simulating multilevel data in Stata***From:*"Stas Kolenikov" <skolenik@gmail.com>

- Prev by Date:
**Re: st: Simulating multilevel data in Stata** - Next by Date:
**Re: st: Confidence interval for a median with weighted, clustered data** - Previous by thread:
**Re: st: Simulating multilevel data in Stata** - Next by thread:
**st: Confidence interval for a median with weighted, clustered data** - Index(es):

© Copyright 1996–2014 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |