
# Re: st: Sparse Data Problem

From: john metcalfe
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Sparse Data Problem
Date: Fri, 6 Mar 2009 20:48:52 -0800

```
I was referring to Greenland, Amer J Epi 2000.
Thanks for the tip.
John

On Fri, Mar 6, 2009 at 8:07 PM, David Airey <david.airey@vanderbilt.edu> wrote:
>
> What did you mean when you said "not fully accounting for the small cell
> bias"? I don't understand. I thought exact logistic models were for
> situations with small cells. -nestreg- does nested estimations for logit
> models, though not exact logit models. It was added to Stata in June of
> 2008.
>
> -Dave
>
> On Mar 6, 2009, at 9:12 PM, john metcalfe wrote:
>
>> Dear Statalist,
>> I am analyzing a small data set with outcome of interest 'clstr'. The
>> primary goal of the analysis is to determine whether the variables 's315t'
>> and 'east' have independent associations with the outcome. However,
>> s315t is highly deterministic for the outcome clstr, as below. I am
>> concerned that exact logistic regression is not fully accounting for
>> the small cell bias. I would like to employ a hierarchical logistic
>> regression, but it seems that the Stata command 'hireg' is only for
>> linear regressions?
>> It may be that I simply am unable to make any valid inferences with
>> this dataset, but I just want to make sure I have explored the
>> appropriate possible remedies.
>> Thanks,
>> John
>>
>> John Metcalfe, M.D., M.P.H.
>> University of California, San Francisco
>>
>>
>> . tab s315 clstr,e
>>
>>            |         clstr
>>      s315t |         0          1 |     Total
>> -----------+----------------------+----------
>>          0 |        22          1 |        23
>>          1 |        58         32 |        90
>> -----------+----------------------+----------
>>      Total |        80         33 |       113
>>
>>          Fisher's exact =                 0.002
>>  1-sided Fisher's exact =                 0.002
>>
>>
>>
>>
>> . logit clstr ageat s315t east emb sm num,or
>>
>> Iteration 0:   log likelihood = -62.686946
>> Iteration 1:   log likelihood = -51.860098
>> Iteration 2:   log likelihood = -50.754342
>> Iteration 3:   log likelihood = -50.661741
>> Iteration 4:   log likelihood = -50.660257
>> Iteration 5:   log likelihood = -50.660256
>>
>> Logistic regression                               Number of obs   =        100
>>                                                   LR chi2(6)      =      24.05
>>                                                   Prob > chi2     =     0.0005
>> Log likelihood = -50.660256                       Pseudo R2       =     0.1919
>>
>> ------------------------------------------------------------------------------
>>        clstr | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>    ageatrept |   .9908837   .0139884    -0.65   0.517     .9638428    1.018683
>>        s315t |   9.238959   10.28939     2.00   0.046     1.041462    81.96011
>>   east_asian |   4.219755   2.215279     2.74   0.006     1.508083    11.80727
>>          emb |   .9964845   .6599534    -0.01   0.996     .2721043    3.649268
>>           sm |   2.138175   1.696319     0.96   0.338      .451589    10.12379
>>   num_resist |   1.064089   .2385192     0.28   0.782     .6857694    1.651116
>> ------------------------------------------------------------------------------
>>
>>
>>
>> Strategy 1: Two-way contingency tables
>>
>> . tab clstr s315t if east==1,e
>>
>>            |         s315t
>>      clstr |         0          1 |     Total
>> -----------+----------------------+----------
>>          0 |         6         19 |        25
>>          1 |         1         24 |        25
>> -----------+----------------------+----------
>>      Total |         7         43 |        50
>>
>>          Fisher's exact =                 0.098
>>  1-sided Fisher's exact =                 0.049
>>
>> . tab clstr s315t if east==0,e
>>
>>            |         s315t
>>      clstr |         0          1 |     Total
>> -----------+----------------------+----------
>>          0 |        12         33 |        45
>>          1 |         0          8 |         8
>> -----------+----------------------+----------
>>      Total |        12         41 |        53
>>
>>          Fisher's exact =                 0.175
>>  1-sided Fisher's exact =                 0.108
>>
>>
>>
>> Strategy 2: Exact Logistic Regression
>>
>> observation 102: enumerations =       1128
>> observation 103: enumerations =        574
>>
>> Exact logistic regression                        Number of obs =       103
>>                                                Model score   =  19.78112
>>                                                Pr >= score   =    0.0000
>>
>> ---------------------------------------------------------------------------
>>        clstr | Odds Ratio       Suff.  2*Pr(Suff.)     [95% Conf. Interval]
>> -------------+-------------------------------------------------------------
>>        s315t |   10.44218          32      0.0135      1.391627    474.4786
>>   east_asian |   5.414021          25      0.0006      1.933718    16.65417
>>
>>
>>
>>
>> (output omitted)
>> observation 103: enumerations =        574
>>
>> Exact logistic regression                        Number of obs =       103
>>                                                Model score   =  19.78112
>>                                                Pr >= score   =    0.0000
>>
>> ---------------------------------------------------------------------------
>>        clstr |      Coef.       Score    Pr>=Score     [95% Conf. Interval]
>> -------------+-------------------------------------------------------------
>>        s315t |   2.345854    6.763266      0.0129      .3304732    6.162216
>>   east_asian |   1.688992    12.98631      0.0004      .6594448    2.812661
>>
>> ---------------------------------------------------------------------------
>>
>>
>> Strategy 3: Hierarchical Regression
>>
>> . hireg clstr (s315t) (east)(ageat emb sm)
>>
>> Model 1:
>>  Variables in Model:
>>  Adding            : s315t
>>
>>       Source |       SS       df       MS              Number of obs =     113
>> -------------+------------------------------           F(  1,   111) =    9.18
>>        Model |   1.7840879     1   1.7840879           Prob > F      =  0.0030
>>     Residual |   21.578744   111  .194403099           R-squared     =  0.0764
>> -------------+------------------------------           Adj R-squared =  0.0680
>>        Total |  23.3628319   112  .208596713           Root MSE      =  .44091
>>
>> ------------------------------------------------------------------------------
>>        clstr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>        s315t |   .3120773   .1030162     3.03   0.003     .1079438    .5162108
>>        _cons |   .0434783   .0919364     0.47   0.637    -.1386999    .2256565
>> ------------------------------------------------------------------------------
>>
>> Model 2:
>>  Variables in Model: s315t
>>  Adding            : east
>>
>>       Source |       SS       df       MS              Number of obs =     103
>> -------------+------------------------------           F(  2,   100) =   12.03
>>        Model |  4.34936038     2  2.17468019           Prob > F      =  0.0000
>>     Residual |  18.0778241   100  .180778241           R-squared     =  0.1939
>> -------------+------------------------------           Adj R-squared =  0.1778
>>        Total |  22.4271845   102  .219874358           Root MSE      =  .42518
>>
>> ------------------------------------------------------------------------------
>>        clstr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>        s315t |   .2817301   .1086887     2.59   0.011     .0660947    .4973654
>>   east_asian |   .3247109   .0843486     3.85   0.000     .1573656    .4920561
>>        _cons |  -.0669987   .1023736    -0.65   0.514     -.270105    .1361075
>> ------------------------------------------------------------------------------
>> R-Square Diff. Model 2 - Model 1 = 0.118   F(1,100) = 14.190  p = 0.000
>>
>> Model 3:
>>  Variables in Model: s315t  east
>>  Adding            : ageat emb sm
>>
>>       Source |       SS       df       MS              Number of obs =     100
>> -------------+------------------------------           F(  5,    94) =    4.72
>>        Model |  4.36538233     5  .873076466           Prob > F      =  0.0007
>>     Residual |  17.3946177    94  .185049124           R-squared     =  0.2006
>> -------------+------------------------------           Adj R-squared =  0.1581
>>        Total |       21.76    99   .21979798           Root MSE      =  .43017
>>
>> ------------------------------------------------------------------------------
>>        clstr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>        s315t |   .2335983   .1163422     2.01   0.048     .0025981    .4645984
>>   east_asian |   .2694912   .0945411     2.85   0.005     .0817777    .4572048
>>    ageatrept |  -.0012444   .0024199    -0.51   0.608    -.0060491    .0035603
>>          emb |   .0396897   .0989203     0.40   0.689    -.1567189    .2360984
>>           sm |   .1063985   .1087626     0.98   0.330    -.1095522    .3223492
>>        _cons |  -.0454117   .1512602    -0.30   0.765    -.3457423     .254919
>> ------------------------------------------------------------------------------
>> R-Square Diff. Model 3 - Model 2 = 0.007   F(3,94) =  0.029  p = 0.993
>>
>>
>> Model  R2      F(df)              p         R2 change  F(df) change       p
>>  1:  0.076   9.177(1,111)       0.003
>>  2:  0.194  12.030(2,100)       0.000     0.118     14.190(1,100)      0.000
>>  3:  0.201   4.718(5,94)        0.001     0.007      0.029(3,94)       0.993
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>
```
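
As a cross-check on the first table in the thread (s315t by clstr, cells 22, 1, 58, 32), here is a minimal sketch of Fisher's exact test computed directly from the hypergeometric distribution. It is not part of the original posts and is written in Python rather than Stata so it can be run without extra packages. The helper name `fisher_exact_2x2` is hypothetical, and the two-sided rule used (sum of all table probabilities no larger than the observed one) is the usual convention, assumed here to match what `tabulate, exact` reports.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper for illustration. Conditions on both margins and
    returns (lower_tail, upper_tail, two_sided) p-values for cell `a`.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    # Range of values cell `a` can take with the margins held fixed.
    lo = max(0, col1 - row2)
    hi = min(row1, col1)

    # Hypergeometric probability of each admissible table.
    probs = {
        x: comb(row1, x) * comb(row2, col1 - x) / denom
        for x in range(lo, hi + 1)
    }
    p_obs = probs[a]

    lower_tail = sum(p for x, p in probs.items() if x <= a)
    upper_tail = sum(p for x, p in probs.items() if x >= a)
    # Two-sided: every table at most as probable as the observed one
    # (small relative tolerance guards against floating-point ties).
    two_sided = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-7))
    return lower_tail, upper_tail, two_sided


# Cells from the posted table: rows are s315t = 0, 1; columns are clstr = 0, 1.
lower, upper, two_sided = fisher_exact_2x2(22, 1, 58, 32)
one_sided = min(lower, upper)
sample_odds_ratio = (22 * 32) / (1 * 58)

print(f"1-sided Fisher's exact = {one_sided:.3f}")
print(f"2-sided Fisher's exact = {two_sided:.3f}")
print(f"sample odds ratio      = {sample_odds_ratio:.2f}")
```

The sample odds ratio printed here is the unadjusted cross-product ratio from the 2x2 table, so it is not expected to equal the adjusted odds ratios in the logit and exact logistic output above; the single clstr = 1 observation in the s315t = 0 row is what makes any of these estimates unstable.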
