Re: st: model for fractional data with panel data

From: Daniel Simon
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: model for fractional data with panel data
Date: Wed, 07 Nov 2007 13:37:47 -0500

Austin - just a quick follow-up question. In my case (I was not the original poster on this thread), I do have a non-trivial number of cases where y=0. So the -glm- approach with fixed effects seems appropriate, yes? Thanks.

Daniel
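One mechanical point behind the question: in Stata, logit(0) and ln(0) evaluate to missing, so a regression on a transformed depvar silently drops the y=0 cases, while -glm- on the untransformed y keeps them. The lines below are only a sketch of that difference, not a recommendation. They reuse the psidextract example and the -glm- call from the quoted message, and the zeros are an assumption: they are forced into w artificially because the example data contain none.

```stata
* Sketch only. The zeros below are HYPOTHETICAL, created to show the mechanics.
webuse psidextract, clear
tsset id t
gen w = wks/53
* force exact zeros in the first wave only, so no id is zero in every period
replace w = 0 if wks < 30 & t == 1

* logit(0) is missing, so these observations leave the estimation sample
gen logitw = logit(w)
count if missing(logitw)
qui xtreg logitw lwage union south smsa, fe cluster(id)
di "xtreg on logit(w): N = " e(N)

* -glm- models w directly, so the zeros stay in
* (older Stata versions may need a larger matsize for the id dummies)
qui xi: glm w lwage union south smsa i.id, link(logit) fam(gauss) cl(id)
di "glm on w:          N = " e(N)
```

Comparing the two reported N values shows how many observations the transformation costs. It says nothing about the incidental-parameters concern raised earlier in the thread.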

At 11:56 AM 11/7/2007 -0500, Austin Nichols wrote:

```
Arne--
I'm not sure the concern about incidental parameters applies here.  To
my mind, the question is whether there is anything to be gained by
using -glm- with indicator variables to capture the fixed effects,
instead of transforming y by generating a new variable lny=ln(y) or
logity=logit(y) or invlogity=invlogit(y), and I'm not sure there is in
this case.  The poster specified that y measured proportions strictly
between 0 and 1, i.e. on the open interval.  That is the crucial
point--there are no obs with y=0 or y=1.  In this case, you may be
better off with -xtreg- (or -xtivreg2- with more SE adjustments) than
-glm-, if only because estimation is so much faster!  But you will get
different answers, since y=f(Xb+e) is not the same as y=f(Xb)+e:

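* load the psidextract example data and declare the panel structure
webuse psidextract, clear
tsset id t
* weeks worked as a share of 53
gen w=wks/53
* transformed depvar, rescaled to unit standard deviation
g ilw=invlogit(w)
qui su ilw
replace ilw=ilw/r(sd)
* (1) pooled OLS on the transformed depvar
qui reg ilw lw uni south smsa, cluster(id)
est sto reg
* (2) pooled glm on the untransformed proportion
qui glm w lw uni south smsa, link(logit) fam(gauss) cl(id)
est sto glm
* (3) fixed effects on the transformed depvar
qui xtreg ilw lw uni south smsa, cluster(id) fe
est sto xtreg
* (4) glm with individual dummies as fixed effects
qui xi: glm w lw uni sou sms i.id, link(logit) fam(gauss) cl(id)
est sto xtglm
* side-by-side table of the four sets of estimates
esttab *, keep(lwage union south smsa) mti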

----------------------------------------------------------------------------
                      (1)             (2)             (3)             (4)
                      reg             glm           xtreg           xtglm
----------------------------------------------------------------------------
main
lwage               0.139*          0.127*         0.0598           0.162
                   (2.41)          (2.03)          (0.83)          (1.55)

union              -0.309***       -0.286***        0.158           0.171
                  (-6.09)         (-6.22)          (1.33)          (1.38)

south              0.0361          0.0404          -0.122          -0.275
                   (0.67)          (0.76)         (-0.66)         (-1.16)

smsa               0.0242          0.0176          0.0304          0.0468
                   (0.45)          (0.35)          (0.35)          (0.38)
----------------------------------------------------------------------------
N                    4165            4165            4165            4165
----------------------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

Having accepted you might transform y, the question then is which
transformation is appropriate, and for that you need some theory.
Neglecting theory, you might explore whether regressions using
lny=ln(y) or logity=logit(y) or invlogity=invlogit(y) as the depvar
produce predictions that make more sense and residuals that look less

tw function y=50*invlogit(x)-31||function y=logit(x)||function y=ln(x)

On 11/7/07, Arne Risa Hole <arnehole@gmail.com> wrote:
> There was an extremely useful discussion on the list recently about
> this issue in the context of fixed effects binary logit models. In
> short, adding the fixed effects 'by hand' results in biased estimates
> unless the number of time periods is large. See the thread starting
> with:
>
> http://www.stata.com/statalist/archive/2007-10/msg00935.html
>
> Arne
>
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```
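The exploration suggested in the quoted message, comparing regressions that use ln(y) or logit(y) as the depvar, might be sketched as follows. This is an illustration under assumptions rather than anything posted in the thread: the variable names lnw, logitw, xb_* and e_* and the graph names are made up here, and a residual-versus-fitted scatter is just one possible way to eyeball the residuals.

```stata
* Sketch only: compare residuals across two candidate transformations.
webuse psidextract, clear
gen w = wks/53
gen lnw = ln(w)
gen logitw = logit(w)

foreach v of varlist lnw logitw {
    * same regressors as in the quoted example
    qui reg `v' lwage union south smsa, cluster(id)
    predict double xb_`v', xb
    predict double e_`v', resid
    * one residual-versus-fitted plot per transformation
    scatter e_`v' xb_`v', name(g_`v', replace) title("`v'")
}
```

The fitted values are on the transformed scale, so they would need to be mapped back with exp() or invlogit() before judging whether the implied proportions make sense.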