# st: predicted proportions greater than 1 using -adjust- after GLM family(binomial) link (logit)

From: Gina Bilenkij
To: "statalist@hsphsun2.harvard.edu"
Subject: st: predicted proportions greater than 1 using -adjust- after GLM family(binomial) link (logit)
Date: Wed, 25 Feb 2009 11:34:00 +1100

```This is my first posting to Statalist, so I will do my best to be clear. I am a public health PhD student and am still learning the basics of statistics and Stata.

I am analysing some expenditure data with 14 dependent variables, each of which is the expenditure on a different item as a proportion of total expenditure (range 0-1). I am interested in the association between income and the patterns of expenditure for these items.

I am using -glm- with family(binomial) link(logit), as suggested by the FAQ "How do you fit a model when the dependent variable is a proportion?" and by Stata tip 63 (Baum 2008).

The dependent variables are of two types (individual items and aggregates):
A1, A2, A3, A4, A5, A_total
B1, B2, B3, B4, B5, B6, B7, B_total

The independent variable of interest is income (in quintiles), and I am adjusting for four other categorical covariates.

I am running a glm to look at between group differences, then hoping to convert the coefficients back to adjusted proportions to aid interpretation.

The code I am using is:

xi: glm A1 i.income i.x2 i.x3 i.x4 i.x5, family(binomial) link (logit) robust

adjust if x2==ref & x3==ref & x4==ref & x5==ref, by (income) exp ci

Everything is working well, except that the adjusted proportion for the aggregate A_total is larger than 1, ranging between 1.09 and 1.24 across the quintiles. This variable has by far the largest mean, around 0.58 before running the GLM. The other aggregate (B_total) is about 0.43 when adjusted (mean before running the GLM about 0.28). The adjusted proportions for all of the individual items (A1, A2, etc.) seem to be fairly close to their (quite low) pre-GLM means.

I have plotted the deviance residuals and they appear to be close to normally distributed around 0, so from what I have read, my model fit should be OK.

Is there something strange going on that I am missing, or is it reasonable for adjusted proportions to go above 1 as the data is extrapolated from the GLM model?

I have searched Statalist and found the thread "st:Binomial regression" from Aug 2007, which seems to be on a similar topic, but the glm link functions discussed there are different, and with my elementary knowledge it is all a little over my head.

Any help would be appreciated.

Thanks,
Gina Bilenkij

```
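
One possible reading of the numbers in the post, offered as a sketch rather than a confirmed diagnosis: with a logit link the linear prediction is a log odds, so exponentiating it returns odds, and odds are not bounded by 1. As far as I recall from the -adjust- help, the exp option reports the exponentiated linear prediction, but that should be checked against the documentation for the Stata version in use. The lines below treat the adjusted values quoted in the post (1.09, 1.24 and 0.43) as odds, which is an assumption on my part, and convert them to proportions with Stata's invlogit() function.

```stata
* Assumption: the adjusted values quoted in the post are odds, exp(xb),
* not proportions. invlogit(ln(odds)) is the same as odds/(1 + odds).

* A_total, lowest and highest adjusted values across income quintiles
display invlogit(ln(1.09))
display invlogit(ln(1.24))

* B_total, adjusted value
display invlogit(ln(0.43))

* The same conversion written out by hand for the first value
display 1.09/(1 + 1.09)
```

Under that assumption the converted values come out at roughly 0.52 and 0.55 for A_total and 0.30 for B_total, which is in the neighbourhood of the raw means of about 0.58 and 0.28 quoted in the post. They need not match the raw means exactly, because the adjusted values are evaluated at the reference categories of the other covariates.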