
# Re: st: Negative binomial regression with exposure and predictors correlated

From: Austin Nichols
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Negative binomial regression with exposure and predictors correlated
Date: Tue, 31 Aug 2010 10:31:45 -0400

```
Paul <paul.seed@kcl.ac.uk>:
Why not just include all RHS variables in logs rather than forming a
ratio on the RHS?  If the "exposure" is supposed to have a coef of
one, that will be estimated; if the numerator and denominator of your
ratio are supposed to have coefs of equal size and opposite sign, they
will be estimated so. I would lean toward -xtpoisson, fe- myself. You
might also look for "Granger causality" by hand, i.e. see if a higher
outcome value predicts later higher treatment.  I would guess that
endogeneity is a pervasive problem here--what is the setting in which
these practices are getting patients and making decisions about
prescriptions?

On Tue, Aug 31, 2010 at 7:41 AM, Seed, Paul <paul.seed@kcl.ac.uk> wrote:
> Dear Statalist,
>
> I am struggling with a rather tricky modelling problem & would greatly appreciate any thoughts.
>
> I wish to know whether the introduction of a new treatment in Primary Care can be linked to a fall in Hospital admissions.  However, all my data is at the level of the practice, not the patient or patient group.  I therefore use negative binomial regression, with the number of patients as the exposure.
>
> The main predictor is the rate of prescribing, estimated as the total cost of prescriptions for the drug of interest, divided by number of patients in each practice.  After correcting for age, gender & other drugs used, I find a strong paradoxical effect of more prescribing associated with more hospital admissions.  If correct, it would appear the drug is doing harm!
>
> However, the number of patients with the condition per practice appears twice in the model (as divisor and as exposure), so the effect may be an artefact.
>
> As a sensitivity analysis, I can use a variety of different exposures and divisors:
> E_diag - the number of patients with the diagnosis recorded
> E_50y - The total number of patients over 50 (the condition is rarely seen below this age)
> E_Pred - The predicted number of patients affected, based on the age & gender profile of the practice (typically 10 patients affected for 1 diagnosed).
>
> This gives 9 possible combinations of exposure and divisor.  When the exposure and the divisor are the same, I get the significant result. But I also get a significant result when using E_50y and E_Pred together (a total of 5 significant results out of 9).
>
> One further complication:  I actually have data repeated for 3 years. The results above generally hold when looking at one year at a time.  When I combine the data & use -xtpoisson, fe-, instead of -nbreg-, only one comparison (matching E_diag with E_diag) remains significant.
>
> 2 questions (for those of you who have read this far)
> *       Is there a better model to use than -nbreg- or -xtpoisson, fe-? (NOTE: xtnbreg does not generally converge, but when it does the answers are similar)
> *       Is it safe to ignore the anomalous result and conclude that there is no evidence of an effect?
>
> I can of course supply code and output if required, but I think I have taken up enough bandwidth.
>
> Best Wishes,
>
>
> Paul Seed
>

```
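
The suggestion in the reply above (enter every right-hand-side quantity in logs, prefer -xtpoisson, fe-, and check "Granger causality" by hand) might be sketched roughly as follows. This is only an illustration, not code from the thread: the variable names (admissions, cost, e_diag, e_50y, practice, year) are hypothetical, covariates such as age and gender are omitted, and it assumes one row per practice-year with strictly positive cost and patient counts so the logs are defined. It uses one of the mixed combinations (E_50y as exposure, E_diag as divisor) so that both restrictions can be tested separately.

```stata
* Hypothetical variable names throughout; adapt to the actual data.
* Assumes cost, e_diag and e_50y are all > 0 so that ln() is defined.
xtset practice year

generate ln_cost = ln(cost)
generate ln_diag = ln(e_diag)
generate ln_50y  = ln(e_50y)

* All terms enter in logs; no exposure() option and no ratio on the RHS.
xtpoisson admissions ln_cost ln_diag ln_50y i.year, fe vce(robust)

* If e_50y really acts as an exposure, its log should have a coefficient of one.
test ln_50y = 1

* If only the ratio cost/e_diag matters, the two coefficients should be
* equal in size and opposite in sign.
test ln_cost + ln_diag = 0

* "Granger causality" by hand: do higher admissions last year predict
* more prescribing this year, given last year's prescribing?
xtreg ln_cost L.admissions L.ln_cost i.year, fe vce(cluster practice)
```

With only three years per practice, the lagged regression uses at most two observations per practice and is biased by the lagged dependent variable in a short fixed-effects panel, so it is at best a rough check for reverse causation rather than a formal test.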