# Re: st: svy: tabulate or proportions

From: "Stas Kolenikov"
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: svy: tabulate or proportions
Date: Tue, 29 Jul 2008 11:36:49 -0500

```
On 7/25/08, Chao Yawo <Yawo1964@yahoo.com> wrote:
>  I am using svy commands to analyze a DHS dataset.
>
>  As a usual prerequisite, I want to run some descriptive statistics on
>  my sample.  I can use the regular tabulate or fre command to produce
>  frequency distributions.
>
>  However, I realized that svy has a "tabulate" or "proportions" option
>  that could produce frequency distributions/estimates per variable.  I
>  run both and realized slight differences between the two frequencies
>  outputs.
>
>  Which one should I use - I am leaning towards using the one with the
>  svy: prefix.
>
>  I would appreciate any thoughts and pointers.

Well, as Steven said, what exactly is it that you want to figure out?
If you want to see whether you have cells with zero or low counts,
then either -tab- or -svy: tab- will do. If you want to learn anything
about the underlying population, you MUST use -svy-.

Let's think through a grocery shopping example. Suppose somebody
looked at your fridge and counted how many gallons of milk you have
there, how many eggs, the total weight of vegetables, etc. If they
want to figure out the diet of one given person (you), then that's
all the data they need. If they want to figure out what's available
in your grocery store, or what the diet of an average person is, then
there is more work to do: they need to figure out how often you buy
any particular food. Maybe you are a vegetarian and skip the meat
aisle in your supermarket -- so your fridge will not provide any
information about meat consumption, and estimates of protein intake
based on your fridge only will be biased. The "how frequently"
question is what you also know as sampling weights, based on inverse
probabilities of selection.

So if you want something that's specific to your sample, you can do
without the -svy- prefix. Will that be interesting to anybody?
Probably not. Whatever summaries you produce from your data will only
be interesting to the extent that they describe the population -- and
for that you need to use the survey design information.

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
```
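To make the point about sampling weights concrete, here is a minimal Python sketch of why a plain tabulation and a weighted one can disagree. The four-row `sample`, its category labels, and the selection probabilities are all made up for illustration; they do not come from any DHS file. The sketch covers point estimates only. Design-based standard errors would also need the strata and cluster information, which is what the -svy- prefix handles in Stata.

```python
# Hypothetical toy data: each tuple is (category, selection_probability).
# The probabilities are invented for illustration only.
sample = [
    ("urban", 0.10),
    ("urban", 0.10),
    ("urban", 0.10),
    ("rural", 0.02),
]


def unweighted_proportions(rows):
    """Share of sampled units in each category (describes the sample only)."""
    counts = {}
    for category, _ in rows:
        counts[category] = counts.get(category, 0) + 1
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


def weighted_proportions(rows):
    """Estimated population share of each category.

    Each unit gets weight 1 / selection probability, i.e. roughly the
    number of population units it stands for.
    """
    totals = {}
    for category, prob in rows:
        totals[category] = totals.get(category, 0.0) + 1.0 / prob
    grand_total = sum(totals.values())
    return {c: w / grand_total for c, w in totals.items()}


unweighted = unweighted_proportions(sample)
weighted = weighted_proportions(sample)
print("unweighted:", unweighted)
print("weighted:  ", weighted)
```

In this toy sample the rural unit was much less likely to be selected, so it stands for more of the population. Rural makes up a quarter of the sample rows but well over half of the weighted estimate.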