

From: "George_Huang" <[email protected]>
To: <[email protected]>
Subject: st: Re: How to delete studentized residuals with absolute values greater than or equal to two after conducting areg procedure?
Date: Fri, 28 Jun 2013 17:20:31 +0800

Dear Steve,

Many thanks,
George

Sent: Friday, June 28, 2013 5:34 AM To: [email protected]

I highly recommend the very robust mmregress package, by Verardi and Croux (net describe st0173_1, from(http://www.stata-journal.com/software/sj10-2)) as the best, indeed, the only way in Stata to reliably identify outliers and high leverage points and to simultaneously fit models that down-weight or eliminate the influence of such points. Neither -qreg- nor -rreg- can down-weight or identify high leverage points.

Note that diagnostics based on OLS, including studentized residuals, are very sensitive to outliers. They consider changes related to the deletion of one observation at a time. Extreme points pull the fitted regression surface towards themselves. If there are two outlying/high-leverage observations in the same location, each will "mask" the other. -mmregress- is not subject to such masking.
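A minimal sketch of how the package mentioned above might be installed and called. The auto dataset and the variables price, weight, and mpg are illustrative assumptions, not the poster's data, and only the basic depvar/indepvars call is shown; the options for flagging outliers should be checked in the package's own help file.

```stata
* Install the package named in the post (requires internet access)
net install st0173_1, from(http://www.stata-journal.com/software/sj10-2)

* Illustrative data and model only (an assumption, not the poster's data)
sysuse auto, clear
mmregress price weight mpg

* Consult the help file for the available options before relying on any of them
help mmregress
```

The estimator relies on random subsampling, so setting a seed beforehand (for example, -set seed 12345-) should make repeated runs comparable.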

References: Verardi, V., and C. Croux. 2009. Robust regression in Stata. Stata Journal 9, no. 3: 439-453.

Steve
[email protected]

On Jun 27, 2013, at 10:36 AM, George_Huang wrote:

Dear David,

Thanks and Best,
George

-----Original Message-----
From: David Hoaglin
Sent: Thursday, June 27, 2013 8:31 PM
To: [email protected]

Dear George,

Assessing "the robustness of the analysis results" usually involves much more than rerunning the model after removing observations that the model does not fit well. Your coauthor should explain the justification for removing those "outliers." Whenever possible, one should investigate observations that have large residuals.

The definition of "studentized residual" is important here. Much of the literature on regression diagnostics defines the studentized residual for observation i as the difference between the observed value of y for observation i and the value of y predicted for observation i by the regression model fitted without observation i, divided by a suitable estimate of the standard deviation of that difference. Some people use the term "jackknife residual." The reasoning is that an observation that is influential may not have a large ordinary residual, because it has distorted the fit.

Sometimes two or more observations are jointly influential, so that their individual studentized residuals are not large. If one can detect such behavior (not always an easy task), one then removes the whole group of observations (and tries to understand what is responsible for their behavior). All this is part of careful analysis; nothing is automatic.

Earlier you mentioned -reg-, from which you can get the information you need (in postestimation). I have seldom used -areg-, but I am not surprised that it does not give the same detailed information about individual observations. It appears that you have some type of panel data, so the diagnostic process may be more complicated. You may want to tell us more about your data.

I hope this discussion helps.

David Hoaglin

On Thu, Jun 27, 2013 at 2:27 AM, George_Huang <[email protected]> wrote:

Dear David and Peter,

Thanks for both of your suggestions. I want to delete observations whose studentized residuals have an absolute value greater than or equal to two, in order to remove outliers, because I want to test the robustness of the analysis results. This was suggested by my coauthor. However, I am more comfortable deleting outliers using a cutoff of 3 for the absolute value of the studentized residuals, as you mentioned. I cannot find a postestimation option for studentized residuals after running the areg procedure. If you have further suggestions, please let me know.

Thanks a lot,
George
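A minimal sketch of the -reg- route David mentions, assuming the default (non-robust) standard errors and a group variable with few enough levels to enter as explicit indicators. -regress- with i.group reproduces the -areg- coefficients and residuals, and -predict, rstudent- is available after -regress-. The auto dataset, with rep78 standing in for the absorbed variable, is purely illustrative.

```stata
* Illustrative data only: rep78 plays the role of the absorbed group variable
sysuse auto, clear

* Same model as -areg price weight mpg, absorb(rep78)-, with explicit dummies
regress price weight mpg i.rep78

* Studentized (jackknifed) residuals for the estimation sample
predict double rstud if e(sample), rstudent

* Look at the flagged observations before dropping anything
list make price rep78 rstud if abs(rstud) >= 2 & !missing(rstud)

* Refit with -areg-, keeping |rstud| < 2; observations with missing rstud
* fail the condition and are excluded as well
areg price weight mpg if abs(rstud) < 2, absorb(rep78)
```

Replacing 2 with 3 in the two conditions gives the more conservative cutoff. As the replies in this thread stress, one-at-a-time deletion diagnostics can miss jointly influential observations, so this is a robustness check rather than a reliable outlier detector.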

* * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/faqs/resources/statalist-faq/ * http://www.ats.ucla.edu/stat/stata/

