Statalist The Stata Listserver


Re: st: test for significant change of a serier


From   n j cox <n.j.cox@durham.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: test for significant change of a serier
Date   Thu, 11 Jan 2007 15:16:51 +0000

The argument for logit is correct in principle, but over
the range from 0.14 to 0.15 logit of a proportion is as near
linear as is needed for almost all practical purposes.
In fact, forget the "almost". This really is a detail
compared with others.
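
The near-linearity claim can be checked directly. What follows is a minimal
sketch, not part of the original exchange: it re-enters the ten values posted
in the thread, applies Stata's -logit()- function, and looks at how closely
the transformed values track the raw ones. The variable name logindex is just
an illustrative choice.

```stata
* Sketch only: how close to linear is logit() over the observed range?
clear
input year index
1993 0.149552855
1994 0.146646187
1995 0.143958559
1996 0.145009261
1997 0.147389484
1998 0.145309026
1999 0.144218297
2000 0.142834716
2001 0.140957544
2002 0.140444707
end
* logit(x) is ln(x/(1-x))
generate logindex = logit(index)
* A correlation very close to 1 means the transform is effectively linear here
correlate index logindex
* Visual check: the points should lie almost exactly on a straight line
twoway scatter logindex index
```

If the two variables are essentially collinear, a linear trend in one is a
linear trend in the other, which is the point being made above.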

If you are going to transform, note that Stata has a -logit()-
function. I prefer to do it by -glm-:

. glm index year

Iteration 0:   log likelihood =  51.765713

Generalized linear models                          No. of obs      =         10
Optimization     : ML: Newton-Raphson              Residual df     =          8
                                                   Scale parameter =   2.33e-06
Deviance         =   .000018673                    (1/df) Deviance =   2.33e-06
Pearson          =   .000018673                    (1/df) Pearson  =   2.33e-06

Variance function: V(u) = 1                        [Gaussian]
Link function    : g(u) = u                        [Identity]
Standard errors  : OIM

Log likelihood   =  51.76571288                    AIC             =  -9.953143
BIC              = -18.42066207

------------------------------------------------------------------------------
       index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        year |  -.0007992   .0001682    -4.75   0.000    -.0011289   -.0004695
       _cons |   1.741015   .3359864     5.18   0.000     1.082493    2.399536
------------------------------------------------------------------------------

. glm index year , link(logit)

Iteration 0:   log likelihood =  51.745225
Iteration 1:   log likelihood =  51.745602
Iteration 2:   log likelihood =  51.745602

Generalized linear models                          No. of obs      =         10
Optimization     : ML: Newton-Raphson              Residual df     =          8
                                                   Scale parameter =   2.34e-06
Deviance         =  .0000187482                    (1/df) Deviance =   2.34e-06
Pearson          =  .0000187482                    (1/df) Pearson  =   2.34e-06

Variance function: V(u) = 1                        [Gaussian]
Link function    : g(u) = ln(u/(1-u))              [Logit]
Standard errors  : OIM

Log likelihood   =  51.74560229                    AIC             =   -9.94912
BIC              =   -18.420662

------------------------------------------------------------------------------
       index |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        year |   -.006451   .0013616    -4.74   0.000    -.0091197   -.0037822
       _cons |   11.10835   2.719779     4.08   0.000     5.777684    16.43902
------------------------------------------------------------------------------
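
A rough cross-check on the two fits, offered as a sketch under a first-order
approximation rather than anything printed by -glm-: since the derivative of
logit(p) with respect to p is 1/(p(1-p)), the logit-link slope should be
roughly the identity-link slope divided by p(1-p). The value p = .145 is an
assumed mid-range figure for the index, not an estimate.

```stata
* Sketch only: first-order link between the identity and logit slopes.
* d logit(p)/dp = 1/(p*(1-p)); p = .145 is an assumed mid-range value.
local p = .145
display "approximate logit-scale slope: " -.0007992 / (`p' * (1 - `p'))
```

The result should land close to the year coefficient reported under
link(logit), which is another way of seeing that the choice of link barely
matters over so narrow a range.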

Yoking Herfindahl and Hirschman is not appropriate here
as their measures differ. Herfindahl's measure, or its
complement, is also known as the Gini index (one of several),
heterozygosity, Simpson's index, etc.
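
For readers who have not met the measure: Herfindahl's index is usually
defined as the sum of squared shares, and its complement is one minus that
sum. A minimal sketch follows; the three shares are hypothetical, made up
purely for illustration, and the variable names are arbitrary.

```stata
* Sketch only: Herfindahl index as the sum of squared shares.
* The shares below are hypothetical and sum to 1.
clear
input share
0.5
0.3
0.2
end
generate sharesq = share^2
* -summarize- leaves the sum of the variable in r(sum)
quietly summarize sharesq
display "H     = " r(sum)
display "1 - H = " 1 - r(sum)
```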

Nick
n.j.cox@durham.ac.uk

Clive Nicholas replied to Xiaoheng 'Kevin' Zhang

I have a series of index values for 10 years. It is a Herfindahl index of
concentration, and I would like to test whether the change in this index
over time is significant.
I am not sure how to translate this real problem into a statistical
problem. Since it looks like a decreasing trend, I used linear regression
of index on year and found that the slope is statistically different from 0.
But I am worried about the sample size.

The indices are
year      index
1993      0.149552855
1994      0.146646187
1995      0.143958559
1996      0.145009261
1997      0.147389484
1998      0.145309026
1999      0.144218297
2000      0.142834716
2001      0.140957544
2002      0.140444707

There are two things about the Herfindahl-Hirschman index of market
concentration (to give it its full title), and its use as a response
variable in OLS that you need to be aware of:

(1) Since the index (H) is on a fixed 0-1 scale, where 0 = perfect
    competition and 1 = a monopoly, the use of -reg- is invalid under the
    Gauss-Markov assumptions underpinning OLS;

and

(2) calculating the logit transformation of H gives you a new index (H*)
    whose scale stretches from -infinity to +infinity. This makes it a much
    more useful - and valid - index for OLS model fitting. Unlike H's scale,
    H*'s scale is also _linear_.

Inputting your data and generating H*

. clear

. input year index

< snip >

. g logindex=ln(index/(1-index))

and then looking at the relationship graphically via

. twoway line logindex year

shows that H* decreased by nearly 0.08 over 10 years, indicating that
competition within whatever market you're measuring _increased_. But was
that decrease in H* statistically significant over this period?

. reg logindex year, eform(OR)

      Source |       SS       df       MS            Number of obs =      10
-------------+------------------------------         F(  1,     8) =   22.75
       Model |  .003446335     1  .003446335         Prob > F      =  0.0014
    Residual |  .001211688     8  .000151461         R-squared     =  0.7399
-------------+------------------------------         Adj R-squared =  0.7074
       Total |  .004658023     9  .000517558         Root MSE      =  .01231

----------------------------------------------------------------------------
  logindex |         OR   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------+----------------------------------------------------------------
      year |   .9935576   .0013462    -4.77   0.001      .990458    .9966668
----------------------------------------------------------------------------

Yes: H* decreased significantly. The odds H/(1-H) fell by about 0.64
percent every year in the period (notice the use of the -eform()- option
to obtain this).
Whether this is important enough to care about is, of course, your call.
Although there doesn't appear to be any real improvement in model fit over
the standard OLS model I suspect you fitted (R^2 for I = .7056), you are
at least fitting a much more valid model. The model fit itself is pretty
impressive.
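
The exponentiated coefficient can be read as a multiplicative change per year
in the odds H/(1-H). A minimal sketch of that conversion, using only figures
already reported in this thread (nothing is re-estimated here):

```stata
* Sketch only: turn the reported ratio into a percent change per year
* in the odds H/(1-H). .9935576 is the OR value reported by -reg- above.
display "percent change in odds per year: " 100 * (.9935576 - 1)
* For comparison: exponentiated year slope from the -glm, link(logit)- fit.
display "exp(glm slope) = " exp(-.006451)
```

The two models are not identical (one transforms the response, the other
uses a link function), so the two ratios need only agree approximately.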

But then there's the pesky problem of your small N. The only way to
improve this is by having more data (you don't say where this data comes
from). Do you have it? Also, other variables need to be used if they're
available: e.g., if this is market data, then information on, say, any new
laws tightening or relaxing market competition would be very useful to
have.

