Re: st: model for fractional data with panel data

From   "Austin Nichols" <>
Subject   Re: st: model for fractional data with panel data
Date   Wed, 7 Nov 2007 11:56:42 -0500

I'm not sure the concern about incidental parameters applies here.  To
my mind, the question is whether anything is gained by estimating with
-glm- plus indicator variables for the fixed effects, rather than
transforming y (generating a new variable lny=ln(y) or
logity=logit(y) or invlogity=invlogit(y)) and running a linear
fixed-effects regression.  I'm not sure there is, in this case.  The
poster specified that y measures proportions strictly between 0 and 1,
i.e. on the open interval.  That is the crucial point: there are no
obs with y=0 or y=1.  In this case, you may be better off with -xtreg-
(or -xtivreg2- with more SE adjustments) than -glm-, if only because
estimation is so much faster!  But you will get numerically different
answers, of course...
since y=f(Xb+e) is not the same as y=f(Xb)+e
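
To spell out that last line a little, here is a rough sketch in LaTeX notation.  The symbol g (the transformation applied to y, with f its inverse) and the zero-conditional-mean assumption on e are my own shorthand for illustration, not something stated in the original question.

```latex
% Transform-then-regress (e.g. -reg- or -xtreg- on g(y)), writing f = g^{-1}:
g(y) = Xb + e, \quad \mathrm{E}[e \mid X] = 0
\quad\Longleftrightarrow\quad
y = f(Xb + e)

% -glm- with inverse link f (f = invlogit when link(logit) is used):
\mathrm{E}[y \mid X] = f(Xb)
\quad\Longleftrightarrow\quad
y = f(Xb) + e, \quad \mathrm{E}[e \mid X] = 0

% Because f is nonlinear, in general
\mathrm{E}[\, f(Xb + e) \mid X \,] \neq f(Xb)
```

So the two approaches put the error in different places, and the two sets of b need not agree even in large samples unless f is linear.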

webuse psidextract, clear
tsset id t
gen w=wks/53
g ilw=invlogit(w)
qui su ilw
replace ilw=ilw/r(sd)
qui reg ilw lw uni south smsa, cluster(id)
est sto reg
qui glm w lw uni south smsa, link(logit) fam(gauss) cl(id)
est sto glm
qui xtreg ilw lw uni south smsa, cluster(id) fe
est sto xtreg
qui xi: glm w lw uni sou sms i.id, link(logit) fam(gauss) cl(id)
est sto xtglm
esttab *, keep(lwage union south smsa) mti

                      (1)             (2)             (3)             (4)
                      reg             glm           xtreg           xtglm
lwage               0.139*          0.127*         0.0598           0.162
                   (2.41)          (2.03)          (0.83)          (1.55)

union              -0.309***       -0.286***        0.158           0.171
                  (-6.09)         (-6.22)          (1.33)          (1.38)

south              0.0361          0.0404          -0.122          -0.275
                   (0.67)          (0.76)         (-0.66)         (-1.16)

smsa               0.0242          0.0176          0.0304          0.0468
                   (0.45)          (0.35)          (0.35)          (0.38)
N                    4165            4165            4165            4165
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

Having accepted you might transform y, the question then is which
transformation is appropriate, and for that you need some theory.
Neglecting theory, you might explore whether regressions using
lny=ln(y) or logity=logit(y) or invlogity=invlogit(y) as the depvar
produce predictions that make more sense and residuals that look less
correlated with your transformed depvar.

tw function y=50*invlogit(x)-31||function y=logit(x)||function y=ln(x)
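
As a rough sketch of the exploration described above, one could generate the three candidate depvars, fit the same fixed-effects regression to each, and look at the residuals and the back-transformed predictions.  The variable names lnw, logitw, invlogw, xb_*, e_*, what_lnw and what_logitw are made up for this illustration, and the naive back-transformation ignores retransformation bias, so treat it as a quick plausibility check only.

```stata
* Sketch only: all new variable names below are hypothetical.
webuse psidextract, clear
xtset id t
gen double w = wks/53                 // as above: keeps w strictly inside (0,1)

gen double lnw     = ln(w)
gen double logitw  = logit(w)
gen double invlogw = invlogit(w)

foreach v in lnw logitw invlogw {
    quietly xtreg `v' lwage union south smsa, fe vce(cluster id)
    predict double xb_`v', xbu        // fitted values including the fixed effect
    predict double e_`v', e           // idiosyncratic residual
    display as text "`v': correlation of residual with depvar"
    correlate e_`v' `v'
}

* Do the predictions make sense on the original (0,1) scale?
gen double what_lnw    = exp(xb_lnw)          // can exceed 1
gen double what_logitw = invlogit(xb_logitw)  // always inside (0,1)
count if what_lnw >= 1
summarize w what_lnw what_logitw
```

A scatter of e_* against xb_* for each transformation would be a natural next step.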

On 11/7/07, Arne Risa Hole <> wrote:
> There was an extremely useful discussion on the list recently about
> this issue in the context of fixed effects binary logit models. In
> short, adding the fixed effects 'by hand' results in biased estimates
> unless the number of time periods is large. See the thread starting
> with:
> Arne