RE: st: RE: Re: errors in outcome variables regression

From	"Mike Hollis" <[email protected]>
To	<[email protected]>
Subject	RE: st: RE: Re: errors in outcome variables regression
Date	Sat, 5 Jul 2003 11:28:59 -0700
Assuming "well behaved" measurement error in the dependent variable, the OLS
regression coefficient(s) will be unbiased.  But the standard errors for the
coefficients (and other statistics involving the variance of the dependent
variable) will be wrong.

Consider an extension of Mark's model with a single explanatory variable,
keeping his split of the error term into a "true" component u and a
"measurement" component u_m:

  y = b0 + b1*X1 + u + u_m.

Under standard assumptions (u and u_m uncorrelated), the residual variance
of y is V(u + u_m) = V(u) + V(u_m).  If y is measured without error
(u_m = 0), the standard error of b1 is sqrt( V(u) / SSx ), where SSx is the
usual mean-corrected sum of squares for X1.  With measurement error, the
standard error of b1 is sqrt( (V(u) + V(u_m)) / SSx ).  Accordingly, in this
case, measurement error causes the standard error of b1 to be overstated,
relative to the error-free case, by a factor of sqrt( 1 + V(u_m)/V(u) ).
Other statistics involving either the variance of y or the residual variance
of y|x (e.g., the simple correlation coefficient, R**2 for the equation,
standardized regression coefficients) will likewise be incorrect.
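
As a quick numerical check of the two standard-error formulas above, here is
a minimal Python sketch.  The variances and SSx are made-up illustrative
values, and the function name se_slope is hypothetical; nothing here comes
from real data.

```python
import math

def se_slope(error_variance, ss_x):
    """Standard error of the slope in a one-regressor OLS model."""
    return math.sqrt(error_variance / ss_x)

# Hypothetical values chosen only to make the arithmetic easy to follow.
var_u = 4.0     # variance of the "true" regression error u
var_um = 5.0    # variance of the measurement error u_m in y
ss_x = 100.0    # mean-corrected sum of squares of X1

se_clean = se_slope(var_u, ss_x)            # y measured without error
se_noisy = se_slope(var_u + var_um, ss_x)   # y measured with error
ratio = se_noisy / se_clean                 # equals sqrt(1 + var_um / var_u)

print(se_clean, se_noisy, ratio)
```

With these values the error-free standard error is about 0.2, the standard
error with measurement error is about 0.3, and their ratio is about 1.5,
which matches sqrt(1 + 5/4).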

-----Original Message-----
From: [email protected]
[mailto:[email protected]]On Behalf Of Mark Schaffer
Sent: Saturday, July 05, 2003 9:26 AM
To: [email protected]; Mike Hollis
Subject: Re: st: RE: Re: errors in outcome variables regression


Mike et al.,

Quoting Mike Hollis <[email protected]>:

> Measurement error in the endogenous variable will, however, cause the
> residual variance for the equation to be overstated, meaning, in general,
> that the standard errors for the regression coefficients will be too large
> and the estimated t- and F-statistics will be too small.

Scott re-replied to Margaret's original post, so I'll re-reply to Mike's.

I'm pretty sure Mike's point above isn't correct.   So long as the
measurement error satisfies the usual distributional assumptions that make
OLS kosher (homoskedasticity, orthogonality etc.), and so long as the
regression error (the "non-measurement-error" error) also satisfies these
assumptions, then OLS is fine.

Intuitively, the reason is the following.  Say the measurement error is u_m
and the regression error is u.  Define a new combined error term
u_c = u + u_m.  Now rewrite the regression equation with this single
combined error term.  It's not hard to see that so long as u_c satisfies
the usual distributional assumptions (and it should if both u and u_m do
so) then OLS is fine.
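
The combined-error argument above can be illustrated with a small Monte
Carlo sketch in Python.  Everything in it is an assumption made for the
illustration: the parameter values, the sample size, normal errors, and the
function names.  It draws y with both u and u_m, fits OLS on each replication,
and compares the average conventional standard error of the slope with the
actual spread of the slope estimates across replications.

```python
import math
import random
import statistics

def ols_slope_and_se(x, y):
    """Return the OLS slope and its conventional standard error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)            # residual variance estimates V(u) + V(u_m)
    return b1, math.sqrt(s2 / ss_x)

def simulate(reps=2000, n=50, b0=1.0, b1=2.0, sd_u=1.0, sd_um=1.0, seed=12345):
    """Simulate y = b0 + b1*x + u + u_m with homoskedastic normal errors."""
    rng = random.Random(seed)
    x = [i / 10 for i in range(n)]    # regressor held fixed across replications
    slopes, ses = [], []
    for _ in range(reps):
        y = [b0 + b1 * xi + rng.gauss(0, sd_u) + rng.gauss(0, sd_um)
             for xi in x]
        b, se = ols_slope_and_se(x, y)
        slopes.append(b)
        ses.append(se)
    return statistics.mean(slopes), statistics.stdev(slopes), statistics.mean(ses)

mean_b1, sd_b1, mean_se = simulate()
print("mean slope:", mean_b1)
print("empirical SD of slope:", sd_b1)
print("mean reported SE:", mean_se)
```

Under these assumptions the mean slope should sit close to the true value of
2, and the mean reported SE should be close to the empirical SD of the
slope, which is what "OLS is fine" amounts to here.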

For more details, see Scott's cite of Greene.

That said, there will often be times that measurement error in the
endogenous variable will not satisfy the usual assumptions and OLS will not
be kosher.  In particular, if the measurement error is heteroskedastic,
then the SEs and the F-stat will not be consistent.  But this is a
heteroskedasticity problem, not a measurement error problem per se.
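
To make the heteroskedastic case concrete, here is a hedged Python sketch.
The thread does not name a remedy; the sketch swaps in White's
heteroskedasticity-robust (HC0) standard error as one common choice, and the
form of the heteroskedasticity (measurement-error SD proportional to the
distance of x from its mean), the parameter values, and the function names
are all assumptions made for the illustration.

```python
import math
import random
import statistics

def slope_with_two_ses(x, y):
    """Return the OLS slope, its conventional SE, and its White (HC0) SE."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dev = [xi - x_bar for xi in x]
    ss_x = sum(d * d for d in dev)
    b1 = sum(d * (yi - y_bar) for d, yi in zip(dev, y)) / ss_x
    b0 = y_bar - b1 * x_bar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # Conventional SE: assumes one common error variance.
    se_conv = math.sqrt(sum(e * e for e in resid) / (n - 2) / ss_x)
    # HC0 SE: weights each squared residual by its squared x-deviation.
    se_hc0 = math.sqrt(sum((d * e) ** 2 for d, e in zip(dev, resid))) / ss_x
    return b1, se_conv, se_hc0

def simulate(reps=2000, n=100, b0=1.0, b1=2.0, seed=2003):
    """Measurement error in y whose SD grows with |x - mean(x)| (assumed)."""
    rng = random.Random(seed)
    x = [i / 20 for i in range(n)]
    x_bar = sum(x) / n
    slopes, conv, robust = [], [], []
    for _ in range(reps):
        y = [b0 + b1 * xi
             + rng.gauss(0, 1.0)                     # regression error u
             + rng.gauss(0, 2.0 * abs(xi - x_bar))   # heteroskedastic u_m
             for xi in x]
        b, se_c, se_r = slope_with_two_ses(x, y)
        slopes.append(b)
        conv.append(se_c)
        robust.append(se_r)
    return (statistics.mean(slopes), statistics.stdev(slopes),
            statistics.mean(conv), statistics.mean(robust))

mean_b1, sd_b1, mean_conv, mean_hc0 = simulate()
print("mean slope:", mean_b1)
print("empirical SD of slope:", sd_b1)
print("mean conventional SE:", mean_conv)
print("mean HC0 SE:", mean_hc0)
```

In this setup the slope estimate stays centered on the true value, the
conventional SE falls noticeably short of the empirical SD of the slope, and
the HC0 SE tracks it much more closely.  A different pattern of
heteroskedasticity could push the conventional SE in the other direction.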

Hope this helps.

--Mark


>
> If you have an estimate of the reliability of the outcome variable, you
> could conceivably use this to adjust the standard errors and associated
> statistics, although the quality of this adjustment obviously depends on
> the quality of your reliability estimate.  (Note, however, that the
> intra-class correlation coefficient is a measure of non-independence.
> Correcting for measurement error in your case requires something like
> Cronbach's alpha or, if you're lucky enough to have them, multiple
> indicators for the outcome variable.  See Ken Bollen's _Structural
> Equations with Latent Variables_ for a discussion of different
> strategies.)
>
> If the regression coefficients in your current model are statistically
> significant (i.e., you're not in a situation where you're trying to
> correct for measurement error to reduce standard errors in an attempt to
> cause statistically non-significant coefficients to become significant),
> you might simply note the fact that you suspect your outcome variable is
> affected by measurement error and that this will cause the significance
> level of the regression coefficients in your model to be underestimated.
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of Scott
> Merryman
> Sent: Friday, July 04, 2003 5:58 AM
> To: [email protected]
> Subject: st: Re: errors in outcome variables regression
>
>
> ----- Original Message -----
> From: "Margaret May" <[email protected]>
> To: <[email protected]>
> Sent: Friday, July 04, 2003 5:32 AM
> Subject: st: errors in outcome variables regression
>
>
> > I have been looking at the command eivreg (errors in variables
> > regression) which corrects the effect estimate when independent
> > variables are measured with error. The problem I have is looking at
> > differences in a continuous outcome between exposure groups where the
> > outcome variable is measured with error. I can estimate the reliability
> > of the outcome measure as I have data from a validity study so can
> > estimate the intra-class correlation coefficient. Is there a method for
> > correcting for measurement error in outcome variables?
> >
> > Margaret May
> >
>
>
> A question concerning errors in the dependent variable was raised on
> March 6th by Charlie Trevor, with replies by myself and Mark Schaffer on
> March 6th and 7th.
>
> My reply was:
>
> Is this necessary?
> From Greene (4th ed. page 376):
> "...assuming for the moment that only y* is measured with error... this
> result conforms completely to the assumption of the classical regression
> model.  As long as the regressor is measured properly, measurement error
> on the dependent variable can be absorbed in the disturbance of the
> regression and ignored."
>
> Hope this helps,
> Scott
>
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



Prof. Mark Schaffer
Director, CERT
Department of Economics
School of Management & Languages
Heriot-Watt University, Edinburgh EH14 4AS
tel +44-131-451-3494 / fax +44-131-451-3008
email: [email protected]
web: http://www.sml.hw.ac.uk/ecomes