# st: ?Misreported n of observations when using pweights in Stata 10

 From Iain Lang To "statalist@hsphsun2.harvard.edu" Subject st: ?Misreported n of observations when using pweights in Stata 10 Date Tue, 11 Aug 2009 16:52:35 +0100

```
Dear List Members,

I am encountering an odd problem when using pweights in regressions in Stata 10. The problem, which I've now come across when using data from two different epidemiological studies (NHANES and the Health and Retirement Study), is that applying pweights seems to alter the number of observations reported as included in the analysis.

An example is pasted below. When I saved this dataset in Stata 9 format and opened it in Stata 9, the problem did not replicate - in fact, all the output matched that from Stata 10 *except* for the number of observations. Given that this has happened to me in two different datasets, and in Stata 10 but not in Stata 9, I'm guessing that it's something specific to Stata 10. A colleague has encountered something similar. Has anyone else come across this problem? I'm not too worried about it, since the estimates, etc., are the same as in Stata 9, but if anyone else has encountered this I'd be interested to hear about it.

With regards
Iain

Example output:

. reg hrs06cog35 age  gender

      Source |       SS       df       MS              Number of obs =   10478
-------------+------------------------------           F(  2, 10475) =  206.84
       Model |  10680.3419     2  5340.17097           Prob > F      =  0.0000
    Residual |  270441.234 10475  25.8177789           R-squared     =  0.0380
-------------+------------------------------           Adj R-squared =  0.0378
       Total |  281121.576 10477  26.8322588           Root MSE      =  5.0811

------------------------------------------------------------------------------
  hrs06cog35 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   -.121624   .0061161   -19.89   0.000    -.1336128   -.1096353
      gender |   .5013875   .1005095     4.99   0.000     .3043697    .6984053
       _cons |   29.77283   .4344999    68.52   0.000     28.92113    30.62454
------------------------------------------------------------------------------

. svyset [pweight=EWGTR]

      pweight: EWGTR
          VCE: linearized
  Single unit: missing
     Strata 1: <one>
         SU 1: <observations>
        FPC 1: <zero>

. svy: reg hrs06cog35 age  gender
(running regress on estimation sample)

Survey: Linear regression

Number of strata   =         1                  Number of obs      =     27821
Number of PSUs     =     27821                  Population size    =  15217686
                                                Design df          =     27820
                                                F(   2,  27819)    =     51.63
                                                Prob > F           =    0.0000
                                                R-squared          =    0.0203

------------------------------------------------------------------------------
             |             Linearized
  hrs06cog35 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.1533668   .0207634    -7.39   0.000     -.194064   -.1126695
      gender |   .8502326   .1253616     6.78   0.000     .6045176    1.095948
       _cons |   32.94872   1.376334    23.94   0.000     30.25104    35.64641
------------------------------------------------------------------------------

```
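
One way to see which observations each count refers to is to compare the reported N with direct counts of the estimation sample and of the complete cases. The following is only a minimal sketch, assuming the variable names from the output above (`hrs06cog35`, `age`, `gender`, `EWGTR`) and that the same dataset is in memory. It does not diagnose the cause; it only shows which of the counts the reported 27821 and 10478 correspond to.

```stata
* Rerun the survey regression so that the e() results are available
svyset [pweight=EWGTR]
svy: regress hrs06cog35 age gender

* Number of observations as stored by the estimation command
display e(N)

* Observations flagged as belonging to the estimation sample
count if e(sample)

* Complete cases on the model variables, without and with the weight
count if !missing(hrs06cog35, age, gender)
count if !missing(hrs06cog35, age, gender, EWGTR)

* Observations with a nonmissing weight, regardless of the model variables
count if !missing(EWGTR)
```

If `count if e(sample)` matches the complete-case count while `e(N)` matches the count of nonmissing weights, the discrepancy lies in what is being reported as N and not in which observations enter the estimates. That would be consistent with the coefficients agreeing between Stata 9 and Stata 10.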