
Re: st: svy: tabulate or proportions

From   "Chao Yawo" <>
Subject   Re: st: svy: tabulate or proportions
Date   Tue, 29 Jul 2008 17:54:02 -0400

Thanks very much.  I probably wasn't too clear in my original post.

I am examining the relationship between HIV testing behaviors (whether
people plan to take the test, etc) and a number of socio-demographic
and epidemiological variables.  At a minimum, I wanted to understand
the distribution of all my predictors before running crosstabs.

I know svy: tab can produce both the estimated proportions (just like
the %s displayed in frequency tables) and the crosstab  estimates. And
both of these allow one to make assumptions about the underlying
population.

Hence the question -- whether it is appropriate to use the svy:
estimation commands once I am dealing with such a survey sample... or
to revert to non-survey commands.

thanks - CY

On Tue, Jul 29, 2008 at 12:36 PM, Stas Kolenikov <> wrote:
> On 7/25/08, Chao Yawo <> wrote:
>>  I am using svy commands to analyze a DHS dataset.
>>  As a usual prerequisite, I want to run some descriptive statistics on
>>  my sample.  I can use the regular tabulate or fre command to produce
>>  frequency distributions.
>>  However, I realized that svy has a "tabulate" or "proportions" option
>>  that could produce frequency distributions/estimates per variable.  I
>>  ran both and noticed slight differences between the two frequency
>>  outputs.
>>  Which one should I use - I am leaning towards using the one with the
>>  svy: prefix.
>>  I would appreciate any thoughts and pointers.
> Well, as Steven said, what is it exactly that you want to figure out?
> If you want to see whether you have cells with zero or low counts,
> then either -tab- or -svy : tab- will do. If you want to get any idea
> of the underlying population, you MUST use -svy-.
> Let's think through a grocery shopping example. Suppose somebody
> looked at your fridge and counted how many gallons of milk you have
> there, how many eggs, the total weight of vegetables, etc. If they
> want to figure out the diet of a given person, then that's all the data
> they need. If they want to figure out what's available in your
> grocery store, or the diet of an average person, then there is
> more work to do: they need to figure out how often you buy any
> particular food. Maybe you are a vegetarian and skip the meat aisles
> in your supermarket -- so your fridge will not provide any information
> about meat consumption, and estimates of protein intake based on your
> fridge only will be biased. The "how frequently" question is what you
> also know as sampling weights, based on inverse probabilities of
> selection.
> So if you want something that's specific to your sample, you can have
> a go without -svy- options. Will that be interesting to anybody?
> Probably not. Whichever summaries you want to produce out of your data
> will only be interesting to the extent that they describe the
> population -- and then you need to use the survey design information.
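
As a rough illustration of the point about sampling weights, here is a minimal Python sketch (not Stata, and not a description of what -svy: tab- does internally). It compares a plain sample proportion with one weighted by inverse probabilities of selection. The five respondents, the 0/1 outcome, and the selection probabilities are all made up for the example; none of it comes from a DHS file.

```python
# Hypothetical toy sample: (plans_to_test, selection_probability).
# All values are invented for illustration only.
sample = [
    (1, 0.10),
    (1, 0.10),
    (1, 0.10),
    (0, 0.02),
    (0, 0.02),
]


def unweighted_proportion(rows):
    """Share of respondents with outcome 1, ignoring the design."""
    return sum(y for y, _ in rows) / len(rows)


def weighted_proportion(rows):
    """Share with outcome 1, weighting each respondent by 1 / P(selection)."""
    weights = [1.0 / p for _, p in rows]
    weighted_ones = sum(w * y for w, (y, _) in zip(weights, rows))
    return weighted_ones / sum(weights)


if __name__ == "__main__":
    print("unweighted:", unweighted_proportion(sample))
    print("weighted:  ", weighted_proportion(sample))
```

In this toy data the unweighted share is 3/5 = 0.60, while the weighted share is 30/130, about 0.23, because the two respondents with outcome 0 each stand in for 50 people and the others for only 10. The sketch covers the point estimate only; the standard errors reported under -svy- also depend on the strata and clusters of the design, which are not modeled here.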
> --
> Stas Kolenikov, also found at
> Small print: I use this email account for mailing lists only.
> *
> *   For searches and help try:
> *
> *
> *
