
Re: st: svy: tabulate or proportions

From	Steven Samuels <[email protected]>
To	[email protected]
Subject	Re: st: svy: tabulate or proportions
Date	Wed, 30 Jul 2008 16:45:30 -0500

Ouch!  I meant "the responses that Stas, Sergiy, and I wrote"!  The  
last time I so mixed up names, I called Nick Cox, "Professor Cross",  
after an old colleague.  Apologies to all concerned.

-Steve


Chao,  As far as I can tell, you haven't asked a different question  
here. If you reread the responses that Austin and I wrote, I'm sure  
you will find your question answered.

-Steven
On Jul 29, 2008, at 4:54 PM, Chao Yawo wrote:

> Thanks very much.  I probably wasn't too clear in my original post.
>
> I am examining the relationship between HIV testing behaviors (whether
> people plan to take the test, etc.) and a number of socio-demographic
> and epidemiological variables.  At a minimum, I wanted to understand
> the distribution of all my predictors before running crosstab
> procedures.
>
> I know svy: tab can produce both the estimated proportions (just like
> the %s displayed in frequency tables) and the crosstab estimates. And
> both of these allow one to make inferences about the underlying
> population.
>
> Hence the question -- whether it is appropriate to use the svy:
> estimation commands once I am dealing with such a survey sample... or
> to revert to non-survey commands.
>
> thanks - CY
>
>
> ----------------------
> On Tue, Jul 29, 2008 at 12:36 PM, Stas Kolenikov  
> <[email protected]> wrote:
>> On 7/25/08, Chao Yawo <[email protected]> wrote:
>>>  I am using svy commands to analyze a DHS dataset.
>>>
>>>  As a usual prerequisite, I want to run some descriptive statistics on
>>>  my sample.  I can use the regular tabulate or fre command to produce
>>>  frequency distributions.
>>>
>>>  However, I realized that svy has a "tabulate" or "proportions" option
>>>  that could produce frequency distributions/estimates per variable.  I
>>>  ran both and noticed slight differences between the two frequency
>>>  outputs.
>>>
>>>  Which one should I use?  I am leaning towards using the one with the
>>>  svy: prefix.
>>>
>>>  I would appreciate any thoughts and pointers.
>>
>> Well, as Steven said, what is it exactly that you want to figure out?
>> If you want to see whether you have cells with zero or low counts,
>> then either -tab- or -svy : tab- will do. If you want to get any idea
>> of the underlying population, you MUST use -svy-.
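
A minimal sketch of that comparison, not taken from the thread. It assumes the nhanes2b example dataset from Stata's documentation (fetched with -webuse-, so it needs web access) and that dataset's design variables psuid, stratid, and finalwgt. With DHS data you would substitute your own survey's cluster, stratum, and weight variables.

```stata
* Assumption: nhanes2b is Stata's documentation example, not the DHS file
webuse nhanes2b, clear

* Declare the design: PSU, sampling weight, and strata
svyset psuid [pweight = finalwgt], strata(stratid)

* Unweighted frequencies: describes the sample only
tabulate race

* Design-based estimates of population proportions,
* with standard errors and confidence intervals
svy: tabulate race, se ci

* The same proportions through the estimation command
svy: proportion race
```

The first table is enough for spotting empty or sparse cells. The other two estimate population proportions, and their standard errors reflect the weights, strata, and clusters declared in -svyset-.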
>>
>> Let's think through a grocery shopping example. Suppose somebody
>> looked at your fridge and counted how many gallons of milk you have
>> there, how many eggs, the total weight of vegetables, etc. If they
>> want to figure out the diet of a given person, then that's all the data
>> they need. If they wanted to figure out what's available in your
>> grocery store, or what the diet of an average person is, then there is
>> more work to do: they need to figure out how often you buy any
>> particular food. Maybe you are a vegetarian, and skip the meat aisles
>> in your supermarket -- so your fridge will not provide any information
>> about meat consumption, and estimates of protein intake based on your
>> fridge only will be biased. The "how frequently" question is what you
>> also know as sampling weights, based on inverse probabilities of
>> selection.
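
To make the inverse-probability idea concrete, here is a toy sketch with made-up numbers. The variable names eats_meat and p_select and all six records are hypothetical: meat eaters are assumed to have been sampled with probability 0.10 and everyone else with probability 0.40.

```stata
* Hypothetical data: 6 sampled people with known selection probabilities
clear
input byte eats_meat double p_select
1 0.10
1 0.10
0 0.40
0 0.40
0 0.40
0 0.40
end

* Sampling weight = inverse probability of selection
generate double wt = 1/p_select

* Unweighted share of meat eaters in the sample: 2/6
mean eats_meat

* Weighted estimate for the population: (10 + 10)/(10 + 10 + 4*2.5) = 20/30
mean eats_meat [pweight = wt]
```

Each sampled meat eater stands for 10 people and each of the others for 2.5, so the weighted estimate (about 0.67) is double the raw sample share (about 0.33).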
>>
>> So if you want something that's specific to your sample, you can have
>> a go without -svy- options. Will that be interesting to anybody?
>> Probably not. Whichever summaries you want to produce out of your data
>> will only be interesting to the extent that they describe the
>> population -- and then you need to use the survey design information.
>>
>> --
>> Stas Kolenikov, also found at http://stas.kolenikov.name
>> Small print: I use this email account for mailing lists only.
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>


