Statalist The Stata Listserver



Re: st: Re: survey data - different results in version 8 and version 9?


From   jpitblado@stata.com (Jeff Pitblado, StataCorp LP)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Re: survey data - different results in version 8 and version 9?
Date   Thu, 10 May 2007 11:07:03 -0500

Elke LUEDEMANN <LUEDEMANN@ifo.de> wrote that, for a given dataset,
-svy: regress- is reporting missing standard errors (SEs) with a warning
message about the variance matrix while -svyreg- is reporting non-missing SEs:

> I am using Stata version 9.2 (just updated it today!) and I am
> re-running some analysis that I previously used in Stata 8.
> 
> The following command using version 8 code works, i.e. produces output
> containing both regression coefficient estimates and standard errors:
> . version 8
> . svyset [pweight=newwgt], psu(psuid) strata(stratum)
> . svyreg depvar $controls $dd $ddd
> 
> 
> However, using the same data set and the following version 9 code
> 
> . svyset psuid [pweight=newwgt], strata(stratum)
> . svy linearized: reg depvar $controls $dd $ddd
> 
> I get the following warning:
> 
> . Warning: variance matrix is nonsymmetric or highly singular
> 
> and output contains only regression coefficients (which exactly
> correspond to those obtained from the version 8 command), but no
> standard errors.
> 
> Does anyone know what changes have been implemented in Stata 9 as
> opposed to Stata 8 concerning the Taylor linearized variance estimation?
> 
> Any help will be greatly appreciated. 

The message

	Warning: variance matrix is nonsymmetric or highly singular

is most likely due to one or more sparse indicator variables.

By sparse indicator, I mean a variable that takes on the values 0 or 1 (or
missing) and is 1 for a very small proportion of the observations.  The
clearest example is an indicator variable that identifies a single
observation; this invalidates the large-sample theory that the robust
variance estimator depends upon for the coefficient on this variable.
However, you can simply remove this variable from the model to solve the
problem.

-svyreg- doesn't detect this problem because it doesn't check e(V) as
thoroughly as the -svy- prefix when the variance matrix is posted to e().

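Elke can use the -total- command to find the sparse indicator variables; for
a 0/1 variable, the estimated total is the number of observations equal to 1,
so a sparse indicator shows up with a very small total: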

	. total $controls $dd $ddd
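
If the model has many regressors, the -total- output can be long to scan by
eye.  A minimal do-file sketch of the same check follows.  It assumes the
globals $controls, $dd and $ddd expand to variable names as in the commands
above; the cutoff of 5 is an arbitrary illustrative threshold, not a Stata
default, and continuous controls that happen to equal 1 in a few
observations will also be listed and can be ignored.

	* list variables that equal 1 in fewer than 5 observations
	* (the cutoff 5 is an assumption chosen for illustration)
	foreach v of varlist $controls $dd $ddd {
		quietly count if `v' == 1
		if r(N) < 5 {
			display "`v': " r(N) " observation(s) equal to 1"
		}
	}

Observations with missing values in any model variable are dropped from the
estimation sample, so an indicator can be sparser in the fitted model than
these raw counts suggest.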

--Jeff
jpitblado@stata.com


