# Re: st: glogit versus OLS on logistic dv

From: SamL
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: glogit versus OLS on logistic dv
Date: Thu, 12 Sep 2002 07:00:30 -0700 (PDT)

```
I am replying to your note because no one else seems to have done so yet.
However, I don't know whether glogit or OLS can address your substantive
problem.  I only offer the following observation.  If you know the counts
from which the proportions are calculated, it is usually regarded as
better to model the counts directly.  So, in your case, you would use the
count of homes receiving services as the dependent variable, and you would
use the total number of homes per unit of analysis as the exposure (that
is, its log enters the model as an offset).  In the negative binomial
regression model (nbreg) or the Poisson model (poisson), this transforms
the model of counts into a model of rates.  It is my understanding that
these are more appropriate strategies for your problem than OLS on a
transformed variable.  I don't know what gprobit or glogit do, but that
approach seems similar to the one outlined above, at least with respect
to the use of an offset and actual counts.  Of
course, I do not know anything about the substance of your study or the

Take care.
Sam

On Wed, 11 Sep 2002, Matthew R. Cleary wrote:

> All,
>
> I'm hoping someone might know why I'm getting vastly different results from
> two approaches to the same estimation problem.
>
> My cases are cities, and my d.v. is the proportion of homes within each
> city that has access to certain public services.  Straight OLS is not
> appropriate for a proportional d.v., but I can transform the d.v. to the
> logistic, ln(p/(1-p)) and use OLS with weights (Greene, 6.3).
>
> Since I know the counts from which the proportions are calculated, I should
> also be able to use glogit (or gprobit), where the d.v. is the # of homes
> with the service and the weight variable (called the popvar in the manual)
> is total homes per city.  Greene (19.4.3) seems to suggest that this is a
> comparable method to the logistic-OLS approach.
>
> However, the two models return different results, not just in terms of
> standard errors, which I might expect, but in terms of coefficient
> estimates.  The coeffs for several of the main i.v.s of interest change
> size, sign, and significance in the two models.  Does anyone know why this
> would be?  Is one or the other model inappropriate for the estimation
> problem I've described?
>
> Thanks,
> Matt Cleary
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

```
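The suggestion in Sam's reply (model the counts directly, with total homes entering through an offset) can be sketched as follows. This is only a minimal illustration in Python rather than Stata, with a single covariate and invented numbers. The function name `fit_poisson_rate`, the variable names, and the data are all assumptions made for the example and do not come from the thread. The point it shows is that with log(total homes) as the offset, log E[count] = log(total) + b0 + b1*x, so exp(b0 + b1*x) is the modeled rate count/total.

```python
import math


def fit_poisson_rate(counts, totals, x, max_iter=50, tol=1e-10):
    """Poisson regression of counts on one covariate, with log(totals) as offset.

    Model: log E[count_i] = log(total_i) + b0 + b1 * x_i,
    so exp(b0 + b1 * x_i) is the modeled rate count_i / total_i.
    Fitted by Newton-Raphson on the Poisson log-likelihood.
    """
    offsets = [math.log(t) for t in totals]
    # Start at the pooled rate with no covariate effect.
    b0 = math.log(sum(counts) / sum(totals))
    b1 = 0.0
    for _ in range(max_iter):
        mu = [math.exp(o + b0 + b1 * xi) for o, xi in zip(offsets, x)]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(y - m for y, m in zip(counts, mu))
        g1 = sum((y - m) * xi for y, m, xi in zip(counts, mu, x))
        # Information matrix (negative Hessian), 2x2 and symmetric.
        h00 = sum(mu)
        h01 = sum(m * xi for m, xi in zip(mu, x))
        h11 = sum(m * xi * xi for m, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 linear system by Cramer's rule.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Hypothetical cities: homes with the service, total homes, one covariate.
homes_served = [120, 310, 95, 640]
total_homes = [400, 900, 500, 1600]
covariate = [0.2, 0.5, 0.1, 0.9]

b0, b1 = fit_poisson_rate(homes_served, total_homes, covariate)
print("intercept (log baseline rate):", b0)
print("slope (change in log rate per unit of covariate):", b1)
```

Because the offset has a fixed coefficient of 1, the fitted coefficients describe the rate of homes served per home rather than the raw count, which is the sense in which the count model becomes a model of rates.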