The Stata listserver
Re: st: calculrating confidence Intervals in svyprop statements


From   jpitblado@stata.com (Jeff Pitblado, StataCorp LP)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: calculrating confidence Intervals in svyprop statements
Date   Fri, 27 Aug 2004 10:01:34 -0500

Hongsoo <hk489@nyu.edu> asks about producing confidence intervals for the
categories of some categorical variables:

> By following Stas' suggestion, I checked the common_options and tried it.
> Unfortunately, svyprop doesn't allow "ci " as a common_option. Below is the
> caution message popped up on the STATA Results window.
> 
> svyprop  cat1Rr cat2Rr cat3Rr,  ci
> option ci not allowed
> r(198);
> 
> Is there anyone who has any other suggestion? Thanks for considering of it.
> FYI, the all three combined variables -cat1Rr, catRr, and cat3Rr- are binary
> variables(0,1).

You can already get normal-based confidence intervals by using -svymean-.
Here is a simple example:

	. sysuse auto, clear
	. svyset, srs
	. svymean foreign

Here are the results from -svymean- above:

***** BEGIN:
Survey mean estimation
 
pweight:  <none>                                  Number of obs    =        74
Strata:   <one>                                   Number of strata =         1
PSU:      <observations>                          Number of PSUs   =        74
                                                  Population size  =        74
 
------------------------------------------------------------------------------
    Mean |   Estimate    Std. Err.   [95% Conf. Interval]        Deff
---------+--------------------------------------------------------------------
 foreign |   .2972973    .0534958    .1906803    .4039143           1
------------------------------------------------------------------------------
***** END:

-foreign- is an indicator variable for foreign-made cars, so its mean is the
estimated proportion of foreign cars.  If you want a point and interval
estimate for each category of a variable, first generate an indicator variable
for each category using -tabulate, generate()-:

	. sysuse auto, clear
	. svyset, srs
	. tabulate rep78, generate(repcat)
	. svymean repcat*

Here are the results from -svymean- above:

***** BEGIN:
. svymean repcat*
 
Survey mean estimation
 
pweight:  <none>                                  Number of obs(*) =        74
Strata:   <one>                                   Number of strata =         1
PSU:      <observations>                          Number of PSUs   =        74
                                                  Population size  =        74
 
------------------------------------------------------------------------------
    Mean |   Estimate    Std. Err.   [95% Conf. Interval]        Deff
---------+--------------------------------------------------------------------
 repcat1 |   .0289855    .0203446   -.0116115    .0695825           1
 repcat2 |    .115942    .0388245    .0384689    .1934152           1
 repcat3 |   .4347826    .0601159    .3148232     .554742           1
 repcat4 |   .2608696    .0532498    .1546113    .3671278           1
 repcat5 |   .1594203    .0443922     .070837    .2480036           1
------------------------------------------------------------------------------
(*) Some variables contain missing values.
***** END:

Note that you need to look at the variable label of each newly generated
variable to determine which category it identifies.  In the above case the
assignment is straightforward: repcat1 identifies rep78==1, ..., and repcat5
identifies rep78==5.
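
To check the category assignment rather than assume it, one option is to list
the new variables with -describe-, which shows the variable labels that
-tabulate- attached.  This is only a small sketch; the stub name repcat is the
one used in the example above:

```stata
* list the indicator variables created by -tabulate, generate()-
* together with their variable labels
sysuse auto, clear
tabulate rep78, generate(repcat)
describe repcat*
```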

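The CI for -repcat1- is not entirely contained in [0,1]: its lower limit is
negative.  In a previous email to Statalist, "Nichols, Austin"
<ANichols@ui.urban.org> suggested applying the inverse logit transform to the
confidence interval limits from -svylogit-.  Because the inverse logit maps
any real number into (0,1), the transformed limits always lie inside the unit
interval.  Here is an example of how this can be done: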

	. sysuse auto, clear
	. svyset, srs
	. tabulate rep78, generate(repcat)
	. svylogit repcat1
	. scalar lcb = invlogit(_b[_cons]-invttail(e(df_r),.025)*_se[_cons])
	. scalar ucb = invlogit(_b[_cons]+invttail(e(df_r),.025)*_se[_cons])
	. di "CI for repcat1 is (" scalar(lcb) ", " scalar(ucb) ")"

Here are the results from above:

***** BEGIN:
. svylogit repcat1
 
Survey logistic regression
 
pweight:  <none>                                  Number of obs    =        69
Strata:   <one>                                   Number of strata =         1
PSU:      <observations>                          Number of PSUs   =        69
                                                  Population size  =        69
                                                  F(   0,     69)  =         .
                                                  Prob > F         =         .
 
------------------------------------------------------------------------------
     repcat1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |  -3.511545   .7228401    -4.86   0.000     -4.95395   -2.069141
------------------------------------------------------------------------------
 
. scalar lcb = invlogit(_b[_cons]-invttail(e(df_r),.025)*_se[_cons])
 
. scalar ucb = invlogit(_b[_cons]+invttail(e(df_r),.025)*_se[_cons])
 
. di "CI for repcat1 is (" scalar(lcb) ", " scalar(ucb) ")"
CI for repcat1 is (.00700605, .11213258)
***** END:
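
The same steps can be repeated for every category of rep78.  The following is
only a sketch of one way to do that, reusing the commands shown above.  The
-foreach- loop and the local macro name v are illustrative additions, and the
sketch assumes that the constant-only -svylogit- fit succeeds for each
indicator, as it did for repcat1:

```stata
* logit-transformed 95% CIs for each category of rep78
sysuse auto, clear
svyset, srs
tabulate rep78, generate(repcat)

foreach v of varlist repcat1-repcat5 {
	* constant-only model: _b[_cons] is the logit of the proportion
	quietly svylogit `v'

	* transform the t-based limits for _cons back to the proportion scale
	scalar lcb = invlogit(_b[_cons] - invttail(e(df_r), .025)*_se[_cons])
	scalar ucb = invlogit(_b[_cons] + invttail(e(df_r), .025)*_se[_cons])

	display "CI for `v' is (" scalar(lcb) ", " scalar(ucb) ")"
}
```

For repcat1 the loop should reproduce the interval displayed above.  Unlike
the normal-based limits from -svymean-, these intervals are not symmetric
about the point estimate.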

--Jeff
jpitblado@stata.com