Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: st: Interpretation of Box-Cox Results

From	Charalambos Karagiannakis <[email protected]>
To	[email protected]
Subject	RE: st: Interpretation of Box-Cox Results
Date	Mon, 10 Dec 2012 09:42:59 +0200

Dear Mr. Nick and Mr. Yuval,

Thank you very much for your responses. 

Harris Karagiannakis



-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Yuval Arbel
Sent: Friday, December 07, 2012 10:43 PM
To: [email protected]
Subject: Re: st: Interpretation of Box-Cox Results

Dear Harris,

Box-Cox is a problematic specification test. Note that the estimate of theta is
highly sensitive to heteroskedasticity and other econometric problems in
your data, particularly when the transformation is applied to the dependent
variable. In addition, the problem you describe is precisely the classic one
that arises: in many cases, and as you can also see in Nick's example,
all 3 specifications (logarithmic, linear and reciprocal) are rejected.
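
To make the sample-size point concrete, here is a minimal sketch that redoes
one of the LR tests from Nick's quoted output by hand. The log likelihoods are
copied from that output; the scalar names (ll_full, ll_log, lr) are only
illustrative, and the lines are meant to be run from a do-file.

```stata
* Log likelihoods copied from the quoted boxcox output (N = 74)
scalar ll_full = -228.26777    // unrestricted fit (theta estimated)
scalar ll_log  = -228.67835    // restricted fit under H0: theta = 0

* LR statistic: twice the gap between unrestricted and restricted fits
scalar lr = 2 * (ll_full - ll_log)
display "N = 74:     LR chi2 = " %9.2f lr "   p = " %5.3f chi2tail(1, lr)

* expand 1000 multiplies every log likelihood by 1000, so the LR
* statistic grows by the same factor while theta itself is unchanged
display "N = 74000:  LR chi2 = " %9.2f 1000*lr "   p = " %5.3f chi2tail(1, 1000*lr)
```

The results should agree with the theta = 0 rows of the two test tables up to
rounding of the displayed log likelihoods. The estimate of theta is identical
in both runs; only the sample size, and with it the LR statistic, changes.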

Kmenta (1997), for example, suggests an alternative test, which is based on
only two competing specifications (e.g., linear and logarithmic). I
suggest you take a look at:

Jan Kmenta, Elements of Econometrics, Second Edition (1997), pp. 518-521

On Fri, Dec 7, 2012 at 8:00 PM, Nick Cox <[email protected]> wrote:
> Please send plain text only to Statalist. See
>
> <http://hsphsun3.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1212/date/article-258.html>
>
> for how your posting will appear to many list members. The importance 
> of sending plain text is explained in the FAQ.
>
> My guess is that you have a large sample size and that the best 
> transform is unclear. This is common enough. Consider the example 
> below my signature. P-values necessarily depend on sample size. You 
> are still at liberty to choose a transform indicated by low or even 
> the lowest chi-square.
>
> However, note that P-values depend on other assumptions too (notably
> independence) and that, for modelling, the marginal distribution of the
> response is less important than is widely believed.
>
> Nick
>
> . sysuse auto, clear
> (1978 Automobile Data)
>
> . boxcox mpg
> Fitting comparison model
>
> Iteration 0:   log likelihood = -234.39434
> Iteration 1:   log likelihood = -228.26891
> Iteration 2:   log likelihood = -228.26777
> Iteration 3:   log likelihood = -228.26777
>
> Fitting full model
>
> Iteration 0:   log likelihood = -234.39434
> Iteration 1:   log likelihood = -228.26891
> Iteration 2:   log likelihood = -228.26777
> Iteration 3:   log likelihood = -228.26777
>
>                                                   Number of obs   =         74
>                                                   LR chi2(0)      =       0.00
> Log likelihood = -228.26777                       Prob > chi2     =          .
>
> ------------------------------------------------------------------------------
>          mpg |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>       /theta |  -.3533898    .391631    -0.90   0.367    -1.120972    .4141927
> ------------------------------------------------------------------------------
>
> Estimates of scale-variant parameters
> ----------------------------
>              |      Coef.
> -------------+--------------
> Notrans      |
>        _cons |   1.853957
> -------------+--------------
>       /sigma |   .0882471
> ----------------------------
>
> ---------------------------------------------------------
>    Test         Restricted     LR statistic      P-value
>     H0:       log likelihood       chi2       Prob > chi2
> ---------------------------------------------------------
> theta = -1      -229.60603         2.68           0.102
> theta =  0      -228.67835         0.82           0.365
> theta =  1      -234.39434        12.25           0.000
> ---------------------------------------------------------
>
>
> . expand 1000
> (73926 observations created)
>
> . boxcox mpg
> Fitting comparison model
>
> Iteration 0:   log likelihood = -234394.34
> Iteration 1:   log likelihood = -228268.91
> Iteration 2:   log likelihood = -228267.77
> Iteration 3:   log likelihood = -228267.77
>
> Fitting full model
>
> Iteration 0:   log likelihood = -234394.34
> Iteration 1:   log likelihood = -228268.91
> Iteration 2:   log likelihood = -228267.77
> Iteration 3:   log likelihood = -228267.77
>
>                                                   Number of obs   =      74000
>                                                   LR chi2(0)      =       0.00
> Log likelihood = -228267.77                       Prob > chi2     =          .
>
> ------------------------------------------------------------------------------
>          mpg |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>       /theta |  -.3533898   .0123845   -28.53   0.000    -.3776629   -.3291167
> ------------------------------------------------------------------------------
>
> Estimates of scale-variant parameters
> ----------------------------
>              |      Coef.
> -------------+--------------
> Notrans      |
>        _cons |   1.853957
> -------------+--------------
>       /sigma |   .0882471
> ----------------------------
>
> ---------------------------------------------------------
>    Test         Restricted     LR statistic      P-value
>     H0:       log likelihood       chi2       Prob > chi2
> ---------------------------------------------------------
> theta = -1      -229606.03      2676.51           0.000
> theta =  0      -228678.35       821.17           0.000
> theta =  1      -234394.34     12253.13           0.000
> ---------------------------------------------------------
>
>
> On Fri, Dec 7, 2012 at 2:40 PM, Charalambos Karagiannakis 
> <[email protected]> wrote:
>> Dear Statalist users,
>>
>>
>>
>> Hello. I ran a Box-Cox transformation on the dependent variable only,
>> using the command boxcox, and I would appreciate some help with the
>> interpretation of the results.
>>
>> The Box-Cox transform parameter 'theta' turns out to be very close to
>> zero and statistically significant (namely, -0.0730 with a s.e. of 0.0091).
>> However, in the bottom table, where different null hypotheses for
>> theta are tested, all three cases (H0:theta=-1, H0:theta=0,
>> H0:theta=1) return a p-value of 0.000, rejecting all the possible
>> specifications (reciprocal, log and linear specification respectively).
>> How could one interpret this result?
>>
>>
>>
>> Thank you in advance.
>>
>> Harris Karagiannakis
>>
>>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/



--
Dr. Yuval Arbel
School of Business
Carmel Academic Center
4 Shaar Palmer Street,
Haifa 33031, Israel
e-mail1: [email protected]
e-mail2: [email protected]

