
Re: st: Questions about svy commands

From	Ana Gabriela Guerrero Serdan <[email protected]>
To	[email protected]
Subject	Re: st: Questions about svy commands
Date	Sun, 10 Feb 2008 04:12:13 -0800 (PST)

Dear Steven, 

> 1. Are the existing weights appropriate for the children? To answer
> this I would need more information about the survey. How did children
> get into the sample? As part of selected households?

Children got into the sample as part of selected
households. PSUs were selected with linear systematic
pps sampling. Stratification was done at the regional
level and for urban/rural areas. 

> Or is the weight the same for all members of the household?
> Were the data post-stratified in any way? If there is just one weight
> for all members of the household, then use that.


I have only one weight per household (the same for
all members of a household), so I will use
this.
> 
> 2. Do you select just the children for an analysis data set, or do
> you analyze the entire set and use the -subpop- option?

I have selected children below 5 years old because my
analysis is only related to children <5. I have not
used subpop for my regressions. 
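
For comparison, here is a minimal sketch of the -subpop- approach on the full sample. The names expweigh, AI05 and AI06 come from the svyset line quoted further down. age_months, y and x are hypothetical placeholders, not actual variable names from the survey.

```stata
* Declare the design on the full sample (all household members).
svyset AI06 [pweight = expweigh], strata(AI05)

* Indicator for the subpopulation: children under five.
* age_months is a hypothetical name. Stata treats missing as larger than
* any number, so observations with missing age are coded 0 here.
generate byte under5 = age_months < 60

* Estimate for children only, keeping the full design for the variances.
svy, subpop(under5): regress y x
```

The point estimates should match those from a children-only file. Only the standard errors can differ, since the full design information is retained.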

> The second approach is the only one which will provide entirely
> correct standard errors, although often there will be little
> difference. Austin Nichols showed how to create a data set for use
> with the -subpop- option that will be only a little larger than one
> containing only children. See:
> http://www.stata.es/statalist/archive/2007-11/msg00810.html

Thanks, I will take a look at Austin's post. Since my
analysis is at the individual level, if I keep the whole
sample I will have more than 150,000 obs, of which I
really need only around 15,000. But I guess that this
won't be a major problem if I increase the memory.

Another issue is that I also run regressions for
different groups within my sub-sample of children. I
have split the children into 4 cohorts and by the areas
where they reside, so my regressions use a subset of a
subset of the whole sample. This sounds confusing, but
the idea is to use a difference-in-differences
estimator, so I compare cohorts in different areas.

> 
> 3. Although -svy- does not work with -areg-, you can use -areg- with
> a -weight- option and with the proper PSU as the cluster variable.
> You will be unable to use the -strata- option, and this could
> potentially lead to estimated standard errors that are larger than
> the true ones. It will also artificially increase the degrees of
> freedom for error. You can get around these by adding dummy
> variables for stratum into your model. If strata are defined by your
> "province" variable, then you have effectively done that.

They are defined by province but also by urban/rural.
So I would need to include dummies for urban areas as
well, I suppose?

I have not thought about using pweight with areg. I
assume I could also use:

xi: regress y x i.dummiesprovince i.cohortbirth urban [pweight=z]
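
A sketch of what point 3 might look like in practice, reusing the names from the commands quoted further down (Y, X, mdate, expweigh, AI05, AI06). It assumes that AI05 identifies the strata and AI06 the PSUs, as in the svyset line. Treat it as an illustration rather than a tested specification.

```stata
* Weighted -areg- with the PSU (AI06) as the cluster variable.
* xi: expands i.AI05 into one dummy per stratum. If the strata are
* province x urban/rural cells, separate province dummies would be
* collinear with these, so they are left out here.
xi: areg Y X i.AI05 [pweight = expweigh], absorb(mdate) vce(cluster AI06)
```

With many strata the dummy set becomes large, which is the situation point 4 below addresses.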


> 4. If there are too many strata to add as dummies (and strata are not
> defined by your provinces), ignore the strata in the analysis, but
> adjust the degrees of freedom by hand. The proper degrees of freedom
> for error will be the listed d.f. minus the number of strata. You can
> compute correct confidence intervals, say 95% intervals, as follows:
> 
> 4.1. Find the error degrees of freedom from the -areg- output WITH
> the -cluster- option. Suppose it is df1 = 180. If you had 80
> strata, the degrees of freedom should be df2 = 180 - 80 = 100.
> 
> 4.2. With 180 degrees of freedom, the t-multiplier for a standard
> error would be 1.973, but this is too small. Compute the t-multiplier
> for the correct degrees of freedom and 95% CI as
> invttail(100,.025), or 1.9840.
> 
> 4.3. You should INCREASE the nominal confidence level for -areg-, so
> that the t-multiplier with 180 d.f. is 1.9840. What should the level
> be? First find: ttail(180,1.9840), or 0.02439. The proper -level-
> is then: 1 - 2 x 0.02439 = 0.9512. So you should specify a -level-
> statement as "set level 95.12".
> 
> You can find the proper level in one step by:
> 
>   di 1-2*ttail(df1, invttail(df2,.025))
>   // finds the level, where df1 is the nominal degrees of freedom and
>   // df2 is the actual degrees of freedom = df1 - n. strata.

This is very useful to know. I need to study this
more closely.
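
Steps 4.1 to 4.3 can be put together roughly as follows, using the example numbers from the message (180 nominal d.f. and 80 strata). The scalar and macro names are only illustrative. The message reports the resulting level as about 95.12.

```stata
* Example numbers from steps 4.1-4.3.
scalar df1 = 180                // nominal error d.f. from -areg- with clustering
scalar nstrata = 80             // number of strata
scalar df2 = df1 - nstrata      // corrected error d.f.

* Nominal level (in percent) whose t-multiplier on df1 d.f. equals
* the 95% t-multiplier on df2 d.f.
scalar lev = 100*(1 - 2*ttail(df1, invttail(df2, 0.025)))
display %6.2f scalar(lev)

* -set level- accepts at most two decimal places, so format the value first.
local lev2 : display %5.2f scalar(lev)
set level `lev2'
```

After the weighted -areg- has been run, `set level 95` restores the default.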


Thanks for your very explicit and clear answers. 

rgds, 

Gaby 


> -Steven
> 
> On Feb 9, 2008, at 7:04 AM, Ana Gabriela Guerrero Serdan wrote:
> 
> > Dear all,
> >
> > Sorry for these probably obvious questions. I have looked into the
> > archives but I'm still confused on the following issues:
> >
> > 1) I am using survey data (two-stage with stratification). I am
> > looking at children less than five years old. Can I apply svyset as
> > usual to my sub-sample of children as follows?
> >
> > svyset [pweight=expweigh], strata(AI05) psu(AI06)
> >
> >
> > 2) I had initially done my analysis with linear regressions without
> > svyset, controlling for differences in provinces and cohorts, and
> > clustering at the district level. I used areg as follows:
> >
> > areg Y X DummiesProvinces, vce(cluster district) absorb(mdate)
> >
> > What command can I use if I first set my data for
> > svyset?
> >
> >
> > Gaby Guerrero Serdan
> >
> > Department of Economics
> > Royal Holloway, University of London
> > TW20 OEX
> > Egham, Surrey
> > England, UK
> >
> > http://www.rhul.ac.uk/economics/About-Us/postgrads.html
> > http://www.flickr.com/photos/49939890@N00/show/
> >
> > Tel: +44 7912657259
> >
> > *
> > *   For searches and help try:
> > *   http://www.stata.com/support/faqs/res/findit.html
> > *   http://www.stata.com/support/statalist/faq
> > *   http://www.ats.ucla.edu/stat/stata/



Gaby Guerrero Serdan 

Department of Economics
Royal Holloway, University of London
TW20 OEX
Egham, Surrey
England, UK
http://www.rhul.ac.uk/economics/About-Us/postgrads.html
http://www.flickr.com/photos/49939890@N00/show/

Tel: +44 7912657259



Follow-Ups:
- Re: st: Questions about svy commands
  - From: Steven Samuels <[email protected]>

References:
- Re: st: Questions about svy commands
  - From: Steven Samuels <[email protected]>
