
st: A question about modelling heterogenous variances


From   Timothy.Mak@iop.kcl.ac.uk
To   statalist@hsphsun2.harvard.edu
Subject   st: A question about modelling heterogenous variances
Date   Fri, 21 Oct 2005 14:51:45 +0100

Hi, 

Consider the following problem: I want to regress age on sex, but my 
dataset was collected from four sites, so I'd like to control for site. I 
could do: 

xi: reg age i.sex i.site 

But graphical examination suggests that the variance of age differs 
between sites. What's the solution? 

I've done some research already, and it seems that if I use either -vwls- 
with the sd option, or -reg- with the aweight option, I'll be able to get 
round this to a certain extent. The problem is I'll first need to estimate 
the variance in another way, most likely by obtaining the residuals from 
an OLS regression first. 
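
The two-step procedure described above could be sketched as follows. This is a minimal sketch, not a definitive implementation: it assumes the variables age, sex and site from the command above, and the names ehat, ehat2, s2 and w are introduced here purely for illustration.

```stata
* Step 1: unweighted OLS fit; keep the residuals
xi: regress age i.sex i.site
predict double ehat, residuals

* Step 2: estimate one residual variance per site
* (mean squared residual, with no degrees-of-freedom correction)
generate double ehat2 = ehat^2
bysort site: egen double s2 = mean(ehat2)

* Step 3: refit with analytic weights inversely proportional
* to the estimated site variance (illustrative names throughout)
generate double w = 1/s2
xi: regress age i.sex i.site [aweight = w]
```

The same site-level estimates could instead be supplied to -vwls- by storing their square roots in a variable and naming it in the sd() option. Note that -vwls- treats those standard deviations as known, so its standard errors will generally differ from those of the aweighted -regress-.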

Besides being rather long-winded, this approach is theoretically 
suboptimal, because the variance estimates are not based on the weighted 
regression. Still, if this is the best way to go about the problem, I'll 
probably use it. One question: can the use of aweight be readily extended 
to more complicated models such as -glm- or -xtgee- to account for 
heterogeneity in variances? If so, how? 

One of the great features of Stata is the robust option available in many 
estimation commands. In normal linear regression it, in effect, replaces 
the assumed variance matrix of the errors (e) with an empirical one based 
on the squared residuals. I foresee that one solution to my problem would 
be a variance matrix halfway between the OLS one and this empirical one, 
that is, one in which the squared residuals are averaged within each group 
(site). One problem with the robust option is that when a subgroup is too 
small, it often gives rubbish estimates of the standard errors. I wonder 
whether this approach could be a solution to that too. Has anyone done a 
methodological investigation of this technique? 
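
To make the proposal concrete, here is a rough sketch of such a "halfway" variance estimate: a sandwich estimator in which each observation's squared residual is replaced by the average squared residual of its site. It assumes the variables age, sex and site from the command above. The names r, r2, v_site, XX, M and V are hypothetical, no small-sample correction is applied, and this is an illustration of the idea rather than an established or validated method.

```stata
* OLS fit; xi leaves the dummies _Isex_* and _Isite_* in the dataset
xi: regress age i.sex i.site
predict double r, residuals

* Site-level average of the squared residuals
generate double r2 = r^2
bysort site: egen double v_site = mean(r2)

* "Bread": X'X over the estimation sample
* (matrix accum adds the constant as the last column)
matrix accum XX = _Isex_* _Isite_* if e(sample)

* "Meat": X' diag(v_site) X, with the site variances as importance weights
matrix accum M = _Isex_* _Isite_* if e(sample) [iweight = v_site]

* Sandwich variance; the square roots of its diagonal
* are the implied standard errors
matrix V = syminv(XX) * M * syminv(XX)
matrix list V
```

If every site had the same average squared residual, V would reduce to the usual OLS variance matrix up to the degrees-of-freedom factor. Replacing v_site with r2 would give the unadjusted robust (White) form, so the estimate above does sit between the two.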

Yours,
Tim Mak

PS: This query has also been posted to allstat.