Statalist The Stata Listserver



Re: st: How to test for equality of variance in data with sampling weights


From   jmetzler@worldbank.org
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: How to test for equality of variance in data with sampling weights
Date   Wed, 30 May 2007 19:01:01 -0400

Dear Steven and Stas,
thank you very much for your extremely helpful advice - I will get back to you
once I have tried!
Kind regards,
Johannes





                                                                                
From   Steven Samuels <ssamuels@albany.edu>
Sent by   owner-statalist@hsphsun2.harvard.edu
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: How to test for equality of variance in data with sampling weights
Date   05/25/2007 06:10 PM
Please respond to   statalist@hsphsun2.harvard.edu
                                                                                
                                                                                




Johannes wrote to me privately that he has a household indicator, say
hh_id; that the household is the only PSU he can identify in the data
set; and that, in a single urban setting, there is no stratum
variable. In that case he would set up his analysis with:
  "svyset hh_id [pweight=finalwgt]"

As Stas indicated in his recent email, one can compute an SD as a
function of expectations and use either -testnl- or (my choice)
-nlcom-, because:

  SD(income) = sqrt( E(income^2) - (E(income))^2 )

However, -nlcom- will produce an erroneous standard error unless the
variable is first centered by subtracting off the mean,
inc = income - mean(income), and then squared, inc2 = inc*inc.
Then E(inc2) = var(income), and sqrt(E(inc2)) estimates the SD of income.
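
As a toy check that squaring the mean-subtracted variable recovers the
variance, take four made-up, unweighted values 1, 2, 3, 6 (mean 3; these
numbers are for illustration only). Both routes give the same answer:

  di (1^2 + 2^2 + 3^2 + 6^2)/4 - ((1 + 2 + 3 + 6)/4)^2    // E(x^2) - (E(x))^2 = 3.5
  di ((1-3)^2 + (2-3)^2 + (3-3)^2 + (6-3)^2)/4            // E((x - mean)^2)   = 3.5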

As the SD is apt to have an asymmetric distribution, I suggest that
Johannes estimate the SE for log(SD) and then convert back to
the SD scale.
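
To illustrate with made-up numbers (not from any data): suppose the
estimated variance is 400 with standard error 40, so the SD is 20. By the
delta method, SE(log SD) is approximately .5*SE(var)/var = .05. Using the
normal value 1.96 in place of the t quantile for simplicity:

  di exp(ln(20) - 1.96*.05)     // lower limit, about 18.1
  di exp(ln(20) + 1.96*.05)     // upper limit, about 22.1

The resulting interval is not symmetric about 20, which is the point of
working on the log scale.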

Johannes actually wants to compare SD's in two groups, assumed here to
be males and females for illustration. In that case, I
recommend that he compute a CI for the ratio, rather than for the
difference, and that he do this on the log scale and then convert
back to the ratio scale.



Below is code that should work. It uses the linearization method for
variance estimation. Johannes might also wish to try a jackknife
estimate of the variance-covariance matrix.
Steve

/*************************** CODE FOLLOWS ***************************/

capture program drop _all

/* First a little program to back transform calculations done on the
log scale after -nlcom- */
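program antilog
    /* Run right after -nlcom-: uses r(b) and r(V) left by -nlcom- and
       e(df_r) left by the preceding svy estimation command */
    local lparm  el(r(b),1,1)                   // estimate on the log scale
    local se     sqrt(el(r(V),1,1))             // its standard error
    local bound  invttail(e(df_r),.025)*`se'    // half-width for 95% CI's
    local parm   exp(`lparm')
    local ll     exp(`lparm' - `bound')
    local ul     exp(`lparm' + `bound')
    di  "parm = " `parm'  "    ll = " `ll'  "   ul = " `ul'
end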

/* Get Estimate of the Mean for each Group */

svy: mean income, over(gender)

/* Use ONE of the two alternatives below. If gender has value labels
(e.g. 1=Male 2=Female) use the following syntax */
gen     inc=income-[income]Male   if gender==1
replace inc=income-[income]Female if gender==2

/* Use this syntax instead if gender has no value label, but values 1 & 2
as above (commented out here so that the file runs as is) */
* gen     inc=income-[income]1 if gender==1
* replace inc=income-[income]2 if gender==2

/* Now compute the square term */

gen inc2=inc*inc

svy: mean inc2, over(gender)   // the estimate for inc2 is the estimated variance of income

/* Individual SD's. Log Scale */
nlcom  .5*log([inc2]Male)
antilog
nlcom  .5*log([inc2]Female)
antilog

/* CI for the ratio of SD's--No Log */
nlcom sqrt([inc2]Male/[inc2]Female)

/* CI for the ratio of SD's after log transformation. Because
log(A^.5)-log(B^.5) = .5*(log(A)-log(B)), the factor .5 is kept so that
-antilog- returns the ratio of SD's rather than the ratio of variances.
The t-statistic is apt to be very different from that of the no-log
version above */

nlcom .5*log([inc2]Male/[inc2]Female)
antilog
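
/* Optional sketch: the same ratio of SD's with a jackknife estimate of
the variance-covariance matrix instead of linearization. This assumes
that -svyset- has been run and that inc2 and -antilog- are defined as
above. The group means used to center inc are not re-estimated within
each replicate, so treat the result as an approximation. */

svy jackknife: mean inc2, over(gender)
nlcom .5*log([inc2]Male/[inc2]Female)
antilog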




*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/





