


Re: st: diagnostic plots with svy:reg

From   Steve Samuels <>
Subject   Re: st: diagnostic plots with svy:reg
Date   Tue, 24 Apr 2012 20:47:27 -0400

You misunderstand the documentation's point: it's not that diagnostics are
unimportant for survey regression, but that the standard measures are based on formulas that don't apply to weighted, clustered data.  

Prof. Richard Valliant and his students have done a lot of work in this area. See

1.  Stata Conference presentation:




In correspondence with a student last year, Professor Valliant wrote:

"The standard packages will compute certain things that are still informative,
 although not exactly right:

Cook's D, DFBETAS, DFFITS from wtd least squares (WLS): these are off because
the wrong variance estimates are used. But if you have any really extreme
points, the standard diagnostics should identify them anyway.

Leverages from WLS: these are ok from the standard pkgs.

Collinearity diagnostics: VIFs from WLS can be too large or small, but if you
have extreme collinearity between two x's in a model, the standard VIFs should
tell you that. Condition indexes and variance decompositions for collinearity:
These are probably pretty close to right from the standard WLS output. 
These allow you to diagnose which x's"
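
A minimal sketch of the WLS diagnostics described in the quote, offered as an illustration rather than a recommended procedure. It uses the auto dataset shipped with Stata, and the weight variable wt is a made-up stand-in for a real sampling weight (an assumption, not survey data). The cutoffs 4/n and 2k/n are common rules of thumb, not design-based thresholds.

```stata
* Illustration only: auto.dta is not survey data, and wt is a hypothetical
* stand-in for a sampling weight.
sysuse auto, clear
generate double wt = 1 + mod(_n, 5)

* Weighted least squares with aweights (not svy:), so that the usual
* postestimation diagnostics are available.
regress price mpg weight length [aweight = wt]

* Leverage, Cook's D, and DFFITS from the WLS fit
predict double lev,  leverage
predict double cook, cooksd
predict double dfit, dfits

* DFBETAs for each regressor (creates one new variable per regressor)
dfbeta

* Variance inflation factors from the WLS fit
estat vif

* List observations exceeding conventional rule-of-thumb cutoffs
* (e(df_m) + 1 counts the regressors plus the constant)
list make lev cook if cook > 4/e(N) | lev > 2*(e(df_m) + 1)/e(N)
```

As the quote notes, the influence measures rest on the wrong variance estimates for a complex design, so treat the flagged observations as candidates for closer inspection rather than as formal test results.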

I would supplement these suggestions: 

1. Run -mmregress- (SSC), which does not take weights or clusters, but is otherwise excellent at identifying outliers _and_ high leverage observations that would otherwise mask one another.


2. Run -qreg- with aweights, which identifies outliers better than the standard leave-one-out algorithms.
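
The two supplementary suggestions can be sketched as follows, again on the auto dataset with a hypothetical weight variable wt. Two assumptions to check locally: -mmregress- must be installed from SSC and is called here with no options (see its help file for outlier-related options), and the weight types that -qreg- accepts have differed across Stata releases, so [aweight = wt] may need to be replaced by another weight type in your version.

```stata
* Illustration only: wt is a hypothetical stand-in for a sampling weight.
sysuse auto, clear
generate double wt = 1 + mod(_n, 5)

* 1. Robust MM-regression (unweighted, no clusters).
*    One-time install from SSC, if not already present:
* ssc install mmregress
mmregress price mpg weight length

* 2. Weighted median regression; large absolute residuals point to
*    possible outliers. If your Stata version rejects aweights for qreg,
*    see -help qreg- for the weight types it allows.
qreg price mpg weight length [aweight = wt]
predict double qres, residuals

* Inspect the ten largest absolute residuals
generate double absres = abs(qres)
gsort -absres
list make price qres in 1/10
```

Because the median fit is less sensitive to extreme points than least squares, the residuals from -qreg- are less prone to the masking that affects leave-one-out measures.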


On Apr 24, 2012, at 6:28 PM, Lee Grenon wrote:

Hello, in the Stata documentation on regression postestimation, it states that diagnostic plots and measures such as rvpplot, rvfplot, dfbeta, cprplot, etc. are not appropriate with the svy: prefix. I am interested in understanding why these diagnostics are not appropriate when using the design-based regression procedure. Can anyone explain this to me?

Does anyone have a suggestion for producing appropriate diagnostic plots for design-based regression? I am working with population survey data for which bootstrap replicate weights are provided.
Have a good day