From
"Stas Kolenikov" <skolenik@gmail.com>

To
statalist@hsphsun2.harvard.edu

Subject
Re: st: svy: tabulate or proportions

Date
Tue, 29 Jul 2008 11:36:49 -0500

On 7/25/08, Chao Yawo <Yawo1964@yahoo.com> wrote:
> I am using svy commands to analyze a DHS dataset.
>
> As a usual prerequisite, I want to run some descriptive statistics on
> my sample. I can use the regular tabulate or fre command to produce
> frequency distributions.
>
> However, I realized that svy has a "tabulate" or "proportions" option
> that could produce frequency distributions/estimates per variable. I
> run both and realized slight differences between the two frequencies
> outputs.
>
> Which one should I use - I am leaning towards using the one with the
> svy: prefix.
>
> I would appreciate any thoughts and pointers.

Well, as Steven said, what is it exactly that you want to figure out? If you want to see whether you have cells with zero or low counts, then either -tab- or -svy: tab- will do. If you want to get any idea of the underlying population, you MUST use -svy-.

Let's think through a grocery shopping example. Suppose somebody looked at your fridge and counted how many gallons of milk you have there, how many eggs, the total weight of vegetables, etc. If they want to figure out the diet of a given person, then that's all the data they need. If they want to figure out what's available in your grocery store, or what the diet of an average person is, then there is more work to do: they need to figure out how often you buy any particular food. Maybe you are a vegetarian, and skip the meat aisles in your supermarket -- so your fridge will not provide any information about meat consumption, and estimates of protein intake based on your fridge only will be biased. The "how frequently" question is what you also know as sampling weights, based on inverse probabilities of selection.

So if you want something that's specific to your sample, you can have a go without -svy- options. Will that be interesting to anybody? Probably not.

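To see why the plain and the weighted frequencies can differ, here is a minimal sketch of the weighting arithmetic in Python. The five households and their selection probabilities are invented for illustration (nothing here comes from a DHS file), and the function names are hypothetical. It reproduces only the point estimate; -svy- additionally uses the design information to compute standard errors.

```python
# Hypothetical toy sample: (eats_meat, probability of selection).
# Meat-eating households were sampled with a lower probability,
# so each of them stands in for more households in the population.
sample = [
    (True, 0.10),
    (True, 0.10),
    (False, 0.50),
    (False, 0.50),
    (False, 0.50),
]


def unweighted_share(rows):
    """Share of meat eaters in the sample itself (what plain -tab- describes)."""
    return sum(1 for eats_meat, _ in rows if eats_meat) / len(rows)


def weighted_share(rows):
    """Share of meat eaters estimated for the population.

    Each sampling weight is the inverse of the probability of selection.
    """
    weights = [1.0 / prob for _, prob in rows]
    meat_weight = sum(w for (eats_meat, _), w in zip(rows, weights) if eats_meat)
    return meat_weight / sum(weights)


print(f"unweighted: {unweighted_share(sample):.2f}")  # unweighted: 0.40
print(f"weighted:   {weighted_share(sample):.2f}")    # weighted:   0.77
```

In this made-up sample the meat eaters are 2 of 5 households (40%), but with weights of 10 each against 2 each for the others they account for 20/26, or about 77%, of the estimated population.
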
Whichever summaries you want to produce from your data will only be interesting to the extent that they describe the population -- and then you need to use the survey design information.

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.

