[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

From |
"James Shaw" <shawjw@gmail.com> |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: RE: Simulating multilevel data in Stata |

Date |
Tue, 23 Dec 2008 12:58:14 -0600 |

Thanks for your prompt response. The code you sent does a great job at producing the desired correlations. The only problem is that it produces a variable u that is time-varying. The variable u is meant to represent a random effect. As such, it should have a mean of 0 and vary by subject only (i.e., be time-invariant). The variables X1 and X2 should vary by subject (i) and time (t). The final product needs to be a time-invariant u that is correlated 0.5 with time-varying X1 and 0.0 with time-varying X2. Is there a way to modify the code to get the desired correlations? -- Jim On Tue, Dec 23, 2008 at 11:15 AM, Feiveson, Alan H. (JSC-SK311) <Alan.H.Feiveson@nasa.gov> wrote: > Jim - > > If I understand your question, you want to "fix" your randomly generated > X1, X2 and u so that their sample covariance matrix exactly equals the > one you want. Here's one way to do this (see below. > > Al Feiveson > > > . matrix V = I(3) > . matrix V[1,2]=.5 > . matrix V[2,1]=.5 > . matrix V[1,3]=.3 > . matrix V[3,1]=.3 > > . matrix list V > > symmetric V[3,3] > c1 c2 c3 > r1 1 > r2 .5 1 > r3 .3 0 1 > > . set obs 50 > obs was 0, now 50 > > . drawnorm X1 X2 u,cov(V) > > . corr X1 X2 u,cov > (obs=50) > > | X1 X2 u > -------------+--------------------------- > X1 | 1.01209 > X2 | .467516 .953286 > u | .571851 .073683 1.12376 > > . matrix accum A = X1 X2 u,dev noc > (obs=50) > > . matrix A=(1/49)*A > > . matrix list A > > symmetric A[3,3] > X1 X2 u > X1 1.0120898 > X2 .46751642 .95328608 > u .5718511 .07368264 1.123759 > > . matrix H=cholesky(V) > . matrix G=cholesky(A) > . matrix GI=inv(G) > > . matrix HGI=H*GI > > . matrix list HGI > > HGI[3,3] > X1 X2 u > r1 .99400935 0 0 > r2 .03111956 1.0085584 0 > r3 -.34919894 .07784363 1.082162 > > . des > > Contains data > obs: 50 > vars: 3 > size: 800 (99.9% of memory free) > ------------------------------------------------------------------------ > --------------------------- > storage display value > variable name type format label variable label > ------------------------------------------------------------------------ > --------------------------- > X1 float %9.0g > X2 float %9.0g > u float %9.0g > ------------------------------------------------------------------------ > --------------------------- > Sorted by: > Note: dataset has changed since last saved > > . gen y1=HGI[1,1]*X1 > . gen y2=HGI[2,1]*X1+HGI[2,2]*X2 > . gen y3=HGI[3,1]*X1+HGI[3,2]*X2+HGI[3,3]*u > > . corr y*,cov > (obs=50) > > | y1 y2 y3 > -------------+--------------------------- > y1 | 1 > y2 | .5 1 > y3 | .3 -9.1e-10 1 > > > > > -----Original Message----- > From: owner-statalist@hsphsun2.harvard.edu > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of James Shaw > Sent: Tuesday, December 23, 2008 10:55 AM > To: statalist@hsphsun2.harvard.edu > Subject: st: Simulating multilevel data in Stata > > Dear Statalist members, > > I want to perform a simulation to show the inconsistency of the OLS and > random effects estimators when one of the regressors is correlated with > the unit-specific error component. The specifics of the simulation are > as follows; > > Y[i,t] (the outcome to be modeled) = b0 + b1*X1[i,t] + b2*X2[i,t] + u[i] > + e[i,t] > > i = 1,...,500 indexes subjects > t = 1,...,3 indexes time (repeated observations on subjects) > > X1 and X2 are normally distributed random variables with arbitrary means > and variances u is a normally distributed subject-specific error > component with mean of 0 and arbitrary variance e is a normally > distributed random error component with mean of 0 and arbitrary variance > > corr(X1,X2) = 0.5 > corr(X1,u) = 0.3 > corr(X2,u) = 0.0 > corr(X1,e) = corr(X2,e) = corr(u,e) = 0.0 > > b0, b1, and b2 are parameters to be specified in the simulation > > > I have been unable to identify a method that will ensure that > corr(X1,u) equals the desired value. I tried the following method in > which u was generated separately from X1 and X2 and cholesky > decomposition was applied to generate transformations of the three > random variables that would exhibit the desired correlations. > However, this yielded a non-zero correlation between X2 and u. > > Method 1 > *** > drop _all > set obs 500 > gen n = _n > gen u=invnorm(uniform()) > expand 3 > sort n > gen n2 = _n > gen t= (n2 - (n-1)*3) > drawnorm x1 x2 e > sort n > mkmat x1 x2 u e, matrix(X) > mat c =(1, .5, .3, 0 \ .5, 1, 0, 0 \ .3, 0, 1, 0 \ 0, 0, 0, 1) mat X2 = > X*cholesky(c) > *** > > A method that yielded somewhat better results involved generating X1, > X2, u, and e with a pre-specified correlation matrix and then collapsing > u so that it varied by subject only. This provided the correct values > for corr(X1,X2) and corr(X2,u) but attenuated the correlation between X1 > and u. I presume that I could simply specify a higher value for > corr(X1,u) when generating the variables so that the desired value would > be achieved after u is collapsed. However, this would not be the most > elegant solution. > > Method 2 > *** > drop _all > set obs 500 > mat c =(1, .5, .3, 0 \ .5, 1, 0, 0 \ .3, 0, 1, 0 \ 0, 0, 0, 1) gen n = > _n expand 3 sort n gen n2 = _n gen t= (n2 - (n-1)*3) drawnorm x1 x2 u e, > corr(c) sort n by n: egen u2 = mean(u) > *** > > Any suggestions or references would be appreciated. > > Regards, > > Jim > > > James W. Shaw, Ph.D., Pharm.D., M.P.H. > Assistant Professor > Department of Pharmacy Administration > College of Pharmacy > University of Illinois at Chicago > 833 South Wood Street, M/C 871, Room 252 Chicago, IL 60612 > Tel.: 312-355-5666 > Fax: 312-996-0868 > Mobile Tel.: 215-852-3045 > * > * For searches and help try: > * http://www.stata.com/help.cgi?search > * http://www.stata.com/support/statalist/faq > * http://www.ats.ucla.edu/stat/stata/ > > * > * For searches and help try: > * http://www.stata.com/help.cgi?search > * http://www.stata.com/support/statalist/faq > * http://www.ats.ucla.edu/stat/stata/ > -- James W. Shaw, Ph.D., Pharm.D., M.P.H. Assistant Professor Department of Pharmacy Administration College of Pharmacy University of Illinois at Chicago 833 South Wood Street, M/C 871, Room 252 Chicago, IL 60612 Tel.: 312-355-5666 Fax: 312-996-0868 Mobile Tel.: 215-852-3045 * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**RE: st: RE: Simulating multilevel data in Stata***From:*"Feiveson, Alan H. (JSC-SK311)" <Alan.H.Feiveson@nasa.gov>

**References**:**st: Simulating multilevel data in Stata***From:*"James Shaw" <shawjw@gmail.com>

**st: RE: Simulating multilevel data in Stata***From:*"Feiveson, Alan H. (JSC-SK311)" <Alan.H.Feiveson@nasa.gov>

- Prev by Date:
**st: Confidence interval for a median with weighted, clustered data** - Next by Date:
**RE: st: RE: Simulating multilevel data in Stata** - Previous by thread:
**st: RE: Simulating multilevel data in Stata** - Next by thread:
**RE: st: RE: Simulating multilevel data in Stata** - Index(es):

© Copyright 1996–2014 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |