# Re: st: (Feasible) generalized least squares

| From | To | Subject | Date |
| --- | --- | --- | --- |
| Herb Smith <[email protected]> | [email protected] | Re: st: (Feasible) generalized least squares | Tue, 16 Jan 2007 17:01:38 -0500 (EST) |

```
Well, yes and no...

Yes, in the sense that this is an FGLS estimator.

No, in the sense that one has to tsset the data and what have you, and I
was interested in a problem with grouped data (but not panel data).

To be concrete:  In a basic text, Powers and Xie, *Statistical Methods for
Categorical Data Analysis*, there is a simple table of six rates, for
three ages and two time periods

     +----------------------------+
     |   y       n   A2   A3   P2 |
     |----------------------------|
  1. |  19    1073    0    0    0 |
  2. |  70    3084    1    0    0 |
  3. | 134   18520    0    1    0 |
  4. |  10     339    0    0    1 |
  5. |  23     967    1    0    1 |
  6. |  69    4611    0    1    1 |
     +----------------------------+

Make the response variable ln_p = ln(y/n) (called lograte in the command
below).

Make a weight w = n*p / (1 - p), where p = y/n.

The matrix rendering of the FGLS estimator and its estimated standard
errors (see below) is quite straightforward and yields the results shown
in Table 2.3 in their text. You can also get the coefficients and the
correct standard errors "the old-fashioned way," which is to say by
multiplying all variables (including the constant) by sqrt(w), running
OLS, and then dividing the reported standard errors by the RMSE. But the
only way that I have found to have Stata do this in one "canned" swoop is:

. glm  lograte  A2 A3 P2 [fweight=y], scale(1)

Iteration 0:   log likelihood = -301.55676

Generalized linear models                          No. of obs      =       325
Optimization     : ML                              Residual df     =       321
                                                   Scale parameter =         1
Deviance         =  5.803478257                    (1/df) Deviance =  .0180794
Pearson          =  5.803478257                    (1/df) Pearson  =  .0180794

Variance function: V(u) = 1                        [Gaussian]
Link function    : g(u) = u                        [Identity]

                                                   AIC             =  1.880349
Log likelihood   = -301.5567624                    BIC             = -1850.804

------------------------------------------------------------------------------
             |                 OIM
     lograte |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          A2 |   .1362063   .2130081     0.64   0.523    -.2812819    .5536945
          A3 |   -.821337   .1985175    -4.14   0.000    -1.210424   -.4322497
          P2 |   .5366818   .1200295     4.47   0.000     .3014284    .7719353
       _cons |  -4.042851   .1902521   -21.25   0.000    -4.415739   -3.669964
------------------------------------------------------------------------------
(Standard errors scaled using dispersion equal to square root of 1)

which strikes me as kind of ugly since it is iterative and involves a
weight (y) that only happens to resemble w because p is close to zero!
(Since n*p = y, w = y/(1 - p), which is approximately y when p is small.)
(This is the Stata analogue of the GENMOD commands that Powers has on his
web site for this example....)

As I say below, what I am looking for is a routine that does

var(b-gls) = invsym(X'*W*X)

without doing ML, or having to fake panel data, etc.  If it doesn't exist,
it doesn't exist, and it is simple enough to write one....

--Herb

Herbert L. Smith
Professor of Sociology and
Director, Population Studies Center
230 McNeil Building
3718 Locust Walk CR
University of Pennsylvania

[email protected]

215.898.7768 (office)
215.898.2124 (fax)

On Tue, 16 Jan 2007, Clive Nicholas wrote:

> Herbert Smith wrote:
>
> > For a garden-variety, cross-sectional regression, an estimator of
> >
> > var(b)
> >
> > is
> >
> > var(b)=invsym(X'*W*X)
> >
> > where X is the design matrix and W is a diagonalized weight matrix.
> >
> > Is there a way in Stata to get the FGLS estimated var-cov in a single
> > command?  By which I mean:
> >
> > -regress depvar indvars [pweight=w]-
> >
> > gives the GLS estimates for b
> >
> > b=invsym(X'*W*X)*(X'*W*y)
> >
> > but the standard errors are computed as though one had typed
> >
> > -regress depvar indvars [pweight=w], vce(robust)-
> >
> > and are close to the FGLS estimates, but are not the same....
>
> Isn't this satisfactory?
>
> . webuse grunfeld, clear
>
> . tsset company year
>        panel variable:  company (strongly balanced)
>         time variable:  year, 1935 to 1954
>
> . xtgls invest mvalue kstock time
>
> Cross-sectional time-series FGLS regression
>
> Coefficients:  generalized least squares
> Panels:        homoskedastic
> Correlation:   no autocorrelation
>
> Estimated covariances      =         1        Number of obs      =       200
> Estimated autocorrelations =         0        Number of groups   =        10
> Estimated coefficients     =         4        Time periods       =        20
>                                               Wald chi2(3)       =    867.82
> Log likelihood             = -1191.645        Prob > chi2        =    0.0000
>
> ----------------------------------------------------------------------------
>     invest |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -----------+----------------------------------------------------------------
>     mvalue |   .1163783   .0059669    19.50   0.000     .1046834    .1280732
>     kstock |   .2213351   .0302499     7.32   0.000     .1620463    .2806239
>       time |   .7737904   1.377808     0.56   0.574    -1.926665    3.474245
>      _cons |  -49.14306   14.83261    -3.31   0.001    -78.21443   -20.07169
> ----------------------------------------------------------------------------
>
> . matrix list e(V)
>
> symmetric e(V)[4,4]
>             mvalue      kstock        time       _cons
> mvalue    .0000356
> kstock  -.00009563   .00091506
>   time   .00200231  -.02292234   1.8983561
>  _cons  -.03314052   .09155466  -15.771641   220.00619
>
> Or am I missing something? :)
>
> CLIVE NICHOLAS        |t: 0(044)7903 397793
> Politics              |e: [email protected]
> Newcastle University  |http://www.ncl.ac.uk/geps
>
> Wherever you go and whatever you do, just remember this. No matter how
> many like you, admire you, love you or adore you, the number of people
> turning up to your funeral will be largely determined by local weather
> conditions.
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
```
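
For readers who want to try the two non-ML routes described in the post, here is a minimal sketch, not the poster's own code. It keys in the six rows of the table, builds lograte and w as defined above, computes b = invsym(X'*W*X)*(X'*W*y) and var(b) = invsym(X'*W*X) directly in Mata, and then repeats the "old-fashioned" rescaling with -regress-. The helper names (sw, ys, A2s, A3s, P2s, s2, Vgls, and the Mata objects X, yv, wv, XWX, XWy, V, b, se) are assumptions made up for illustration.

```stata
* Sketch only: data keyed in from the table in the post
clear
input y n A2 A3 P2
 19  1073 0 0 0
 70  3084 1 0 0
134 18520 0 1 0
 10   339 0 0 1
 23   967 1 0 1
 69  4611 0 1 1
end

generate double p       = y/n
generate double lograte = ln(p)
generate double w       = n*p/(1 - p)

* (1) Direct matrix computation in Mata
mata:
    // design matrix, constant in the last column
    X   = st_data(., ("A2", "A3", "P2")), J(st_nobs(), 1, 1)
    yv  = st_data(., "lograte")
    wv  = st_data(., "w")
    XWX = cross(X, wv, X)        // X'WX with W = diag(w)
    XWy = cross(X, wv, yv)       // X'Wy
    V   = invsym(XWX)            // var(b) = invsym(X'WX)
    b   = V * XWy                // b = invsym(X'WX) * X'Wy
    se  = sqrt(diagonal(V))
    (b, se)                      // coefficients and standard errors
end

* (2) The "old-fashioned way": rescale everything by sqrt(w), run OLS
generate double sw = sqrt(w)
generate double ys = lograte*sw
foreach v of varlist A2 A3 P2 {
    generate double `v's = `v'*sw
}
regress ys A2s A3s P2s sw, noconstant

* regress reports RMSE^2 * invsym(X'WX); divide the RMSE^2 back out
scalar s2 = e(rmse)^2
matrix Vgls = e(V)/s2
matrix list Vgls
```

The two variance matrices should agree up to rounding, since -regress- on the rescaled variables reports RMSE^2 times invsym(X'*W*X). Because this uses w rather than the frequency weight y, the numbers need not match the -glm- output above digit for digit.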