# [no subject]

```
I'm using Stata 9.2.

Many thanks for your time and interest.

Angel Rodriguez-Laso

> Angel Rodriguez-Laso <angelrlaso@gmail.com>
>
>> I'm confused with the following results:
>>
>>
>>
>> . svyset psu [pweight=weight2007], strata(healtharea)fpc(psusperhealtharea)
>>
>>       pweight: weight2007
>>           VCE: linearized
>>      Strata 1: healtharea
>>          SU 1: psu
>>         FPC 1: psusperhealtharea
>>
>> .
>> end of do-file
>>
>> . svy: tab p29, deff deft
>> (running tabulate on estimation sample)
>>
>> Number of strata   =        11                  Number of obs      =     12140
>> Number of PSUs     =      1266                  Population size    = 12134,139
>>                                                 Design df          =      1255
>>
>> -------------------------------------------------
>> Any permanent
>> disability | proportions         deff         deft
>> ----------+--------------------------------------
>>     0, no |       ,8887        -1981        ,9783
>>     1, yes |       ,1113        -1981        ,9783
>>           |
>>     Total |           1
>> -------------------------------------------------
>>   Key:  proportions  =  cell proportions
>>         deff         =  deff for variances of cell proportions
>>         deft         =  deft for variances of cell proportions
>>
>>
>>
>>
>> Why do I get large negative deff values? Deft resembles more what I
>> was expecting, but it should be the square root of deff and obviously
>> this is not the case. Do you have any explanation for these results?
>
> Stas Kolenikov <skolenik@gmail.com> already pointed out that the sampling
> weights appear to be normalized by the sample size.  In fact, the sum of the
> weights is less than the sample size.  When the first stage is sampled without
> replacement (i.e. the 'fpc()' in the above -svyset-), the 'deff' calculation
> is
>
>        deff = V_db / ( (1 - n/W) V_srswr )
>
> where 'V_db' is the design-based variance estimate, 'V_srswr' is the simple
> random sample with replacement variance estimate, 'n' is the sample size, and
> 'W' is an estimate of the population size.  Here 'W' is the sum of the
> sampling weights.  Since Angel's sampling weights are normalized, they cannot
> be used to estimate the population size, thus the above 'deff' calculation is
> not valid.  Without knowing the population size, we can't compute a valid
> 'deff' statistic.
>
> On the other hand, the 'deft' calculation is
>
>        deft = sqrt( V_db / V_srswr )
>
> which does not need an estimate of the population size, and thus will always
> produce a valid value.
>
> We will look into changing -svy: tabulate- and -estat effects- to report
> missing values for 'deff' in the case where the 'W' calculation is less than
> or equal to 'n'.
>
> --Jeff
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

```
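The two formulas in Jeff's reply can be checked against the output in the thread with a little arithmetic. The sketch below is only an illustration, not Stata's implementation: it assumes the reported population size `12134,139` uses a decimal comma (that is, W = 12134.139, consistent with the `,8887` proportions), and the function names `deff_with_fpc` and `deff_or_missing` are hypothetical. Because `deft = sqrt(V_db / V_srswr)`, the ratio `V_db / V_srswr` is `deft**2`, so `deff` can be recovered from `deft`, `n`, and `W`.

```python
def deff_with_fpc(deft, n, W):
    """deff = V_db / ((1 - n/W) * V_srswr), using V_db / V_srswr = deft**2."""
    return deft ** 2 / (1.0 - n / W)


def deff_or_missing(deft, n, W):
    """Hypothetical guard: report a missing value (None) when W <= n,
    mirroring the change proposed in the reply."""
    if W <= n:
        return None
    return deff_with_fpc(deft, n, W)


# Values read from the -svy: tab- output in the thread.
n = 12140          # Number of obs
W = 12134.139      # Population size (sum of the normalized weights; decimal comma assumed)
deft = 0.9783      # reported deft

fpc_factor = 1.0 - n / W                 # negative because W < n
deff = deff_with_fpc(deft, n, W)

print(f"1 - n/W = {fpc_factor:.6f}")
print(f"deff    = {deff:.1f}")           # close to the reported -1981
print(f"guarded = {deff_or_missing(deft, n, W)}")
```

Since the sum of the weights is slightly below the sample size, the factor `1 - n/W` is a small negative number, and dividing by it produces the large negative `deff` while `deft` stays near 1.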