From: "Austin Nichols" <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Weights
Date: Wed, 30 Apr 2008 11:26:50 -0400

Martin Weiss <martin.weiss@uni-tuebingen.de>:

SPSS is using the wrong type of weight, and therefore will give you
incorrect standard errors. See -help weights- and -help svy- and the
manuals for more.

Perhaps the large size of the Stata file is due to all variables being
stored as doubles? Try -compress- on an extract and see -help datatypes-.

Note that -mean- restricts to obs where all vars are nonmissing, so
instead of e.g.

    ds, has(type numeric)
    loc num `r(varlist)'
    mean `num'

try

    ds, has(type numeric)
    loc num `r(varlist)'
    foreach v of loc num {
        mean `v'
    }

or just use -summarize- with aweights or pweights (pweights=aweights+_robust
so point estimates are identical, but variance estimates differ).

On Wed, Apr 30, 2008 at 10:57 AM, Martin Weiss
<martin.weiss@uni-tuebingen.de> wrote:
> Dear Statalisters,
>
> can anybody give me a clue as to the array of weighting options in Stata? I
> have an important project where I would really like to make headway...
>
> My dataset features a size of 2.4 GB as .csv. When I translate this into
> SPSS, it ends up with 2.7 GB while the equivalent Stata dataset has 5.5 GB
> (!). Anyway, I usually pick out the interesting variables beforehand because
> Stata is unable to open the entire dataset. The first column of the data
> contains sampling weights. The data provider ships a pdf with the descriptives
> for the marginal distributions of the variables in the population so I know
> the true values.
>
> Now here lies the rub: when I weight -summarize- with analytic weights, the
> approximately correct mean and standard deviation pop out. When I let Stata
> estimate the mean with the -mean- command, with analytic weights attached in
> the same fashion, I get widely differing results for the point estimate of
> the mean, far from the true values. In SPSS, I simply go to -weight cases-
> and everything comes out correct.
>
> Do I have to -svyset- the data? When I try to -frequency weight- the data,
> Stata complains that non-integers are not allowed while SPSS seems to not
> quarrel with them. Why is it that SPSS needs one command at the beginning of
> the session while Stata has a (differing) tab dedicated to weighting for
> every single command?
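
The casewise-deletion point in the reply can be seen on a small example. What follows is only a sketch under stated assumptions, not code from the thread: it uses the auto.dta dataset that ships with Stata, and the weight variable w is made up purely for illustration. Because rep78 has missing values, listing it together with price in one -mean- call should change the estimation sample for price, while the loop uses every nonmissing observation of each variable.

```stata
* Illustration only: auto.dta ships with Stata; the weight w is hypothetical.
sysuse auto, clear
generate double w = 1 + mod(_n, 3)

* rep78 has missing values, so this call drops those rows for price as well.
mean price rep78 [pweight = w]

* One variable at a time: each mean uses all nonmissing observations.
foreach v of varlist price rep78 {
    mean `v' [pweight = w]
}

* -summarize- treats each variable separately; with aweights its weighted
* means should match the loop's point estimates.
summarize price rep78 [aweight = w]
```

Comparing the number of observations reported by the first -mean- call with those from the loop is a quick way to check whether missing values in other variables are behind differing point estimates.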

