Statalist



st: ?Misreported n of observations when using pweights in Stata 10


From   Iain Lang <iain.lang@pms.ac.uk>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   st: ?Misreported n of observations when using pweights in Stata 10
Date   Tue, 11 Aug 2009 16:52:35 +0100

Dear List Members

I am encountering an odd problem when using pweights in regressions in Stata 10. The problem, which I've now come across with data from two different epidemiological studies (NHANES and the Health and Retirement Study), is that applying pweights seems to alter the number of observations reported as included in the analysis. In the example below, the unweighted regression reports 10478 observations, while the same model run with svy: reports 27821.

An example is pasted below. When I saved this dataset in Stata 9 format and opened it in Stata 9, the problem did not replicate: all the output was the same as in Stata 10 *except* for the number of observations. Given that this has happened to me with two different datasets, and in Stata 10 but not in Stata 9, I'm guessing that it's something specific to Stata 10. A colleague has encountered something similar. Has anyone else come across this problem? I'm not too worried about it, since the estimates, etc., are the same as in Stata 9, but if anyone else has encountered this I'd be interested to hear about it.

With regards
Iain


Example output:

. reg hrs06cog35 age  gender

      Source |       SS       df       MS              Number of obs =   10478
-------------+------------------------------           F(  2, 10475) =  206.84
       Model |  10680.3419     2  5340.17097           Prob > F      =  0.0000
    Residual |  270441.234 10475  25.8177789           R-squared     =  0.0380
-------------+------------------------------           Adj R-squared =  0.0378
       Total |  281121.576 10477  26.8322588           Root MSE      =  5.0811

------------------------------------------------------------------------------
  hrs06cog35 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   -.121624   .0061161   -19.89   0.000    -.1336128   -.1096353
      gender |   .5013875   .1005095     4.99   0.000     .3043697    .6984053
       _cons |   29.77283   .4344999    68.52   0.000     28.92113    30.62454
------------------------------------------------------------------------------

. svyset [pweight=EWGTR]

      pweight: EWGTR
          VCE: linearized
  Single unit: missing
     Strata 1: <one>
         SU 1: <observations>
        FPC 1: <zero>

. svy: reg hrs06cog35 age  gender
(running regress on estimation sample)

Survey: Linear regression

Number of strata   =         1                  Number of obs      =     27821
Number of PSUs     =     27821                  Population size    =  15217686
                                                Design df          =     27820
                                                F(   2,  27819)    =     51.63
                                                Prob > F           =    0.0000
                                                R-squared          =    0.0203

------------------------------------------------------------------------------
             |             Linearized
  hrs06cog35 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.1533668   .0207634    -7.39   0.000     -.194064   -.1126695
      gender |   .8502326   .1253616     6.78   0.000     .6045176    1.095948
       _cons |   32.94872   1.376334    23.94   0.000     30.25104    35.64641
------------------------------------------------------------------------------
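
One possible way to see which of the two counts matches the data is to tally complete cases directly. This is only a minimal sketch, assuming the variable names from the example above and that the svyset command shown there has already been run. It does not explain the discrepancy; it only shows how many observations have nonmissing values on the model variables and on the weight.

```stata
* Complete cases on the model variables only
count if !missing(hrs06cog35, age, gender)

* Complete cases that also have a nonmissing weight
count if !missing(hrs06cog35, age, gender, EWGTR)

* Observations with a nonmissing weight, regardless of the model variables
count if !missing(EWGTR)

* Observations flagged as the estimation sample after the survey fit
svy: regress hrs06cog35 age gender
count if e(sample)
```

Comparing these counts with the 10478 and 27821 reported above should indicate whether the larger figure corresponds to all observations carrying a weight rather than to the observations actually used in the fit.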





