Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: st: Multicollinearity Problem in Stata


From   FU Youyan <s1150901@sms.ed.ac.uk>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Multicollinearity Problem in Stata
Date   Mon, 29 Jul 2013 21:14:52 +0100

Dear Yuval,

Thank you very much for this answer; it is quite helpful. I have a follow-up question:
r_ew and r_ow are two types of investment return in my research (they are continuous variables rather than dummies). What I want to test is the impact of these two returns on investors' future behavior. In other words, I want to know how investors weight these two types of return, so I have to include both returns in my regression. In the regression with a constant but omitting r_ew, the coefficient of r_ow is significantly negative (t-value = -3.30). However, in the regression without a constant but including r_ew, the coefficient of r_ow is significantly positive (t-value = 2.20). Which result is more reliable?

Best wishes,
Youyan
________________________________________
From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] On Behalf Of Yuval Arbel [yuval.arbel@gmail.com]
Sent: 29 July 2013 17:58
To: statalist
Subject: Re: st: Multicollinearity Problem in Stata

Dear FU,

This outcome is not strange at all. I believe what you encountered is
known in econometrics as "the dummy variable trap".

I believe that r_ew+r_ow=constant. Consequently, when you run the
model with a constant, you get perfect collinearity with the
constant term. But when you omit the constant, the perfect collinearity
disappears and both coefficients can be estimated.
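
As a hedged illustration (not part of the original reply): if r_ow + r_ew equals the constant c = 0.2407656 reported in the quoted message below, then substituting r_ew = c - r_ow into the no-constant model should reproduce the with-constant model, with _cons = b_ew*c and an r_ow coefficient of b_ow - b_ew. The short Python check below uses only the coefficients printed in the two quoted tables; the variable names are made up for this sketch.

```python
# Coefficients copied from the two regression tables quoted later in this thread.
c = 0.2407656              # r_ow + r_ew, as reported by the original poster

# Regression WITHOUT a constant (both returns included)
b_ow_nocons = 1.929168
b_ew_nocons = 8.080053

# Regression WITH a constant (r_ew omitted by Stata)
b_ow_cons = -6.150886
cons = 1.945399

# Substitute r_ew = c - r_ow into the no-constant model:
#   b_ow*r_ow + b_ew*(c - r_ow) = b_ew*c + (b_ow - b_ew)*r_ow
implied_cons = b_ew_nocons * c
implied_b_ow = b_ow_nocons - b_ew_nocons

print(f"implied _cons     = {implied_cons:.6f}  (reported {cons})")
print(f"implied r_ow coef = {implied_b_ow:.6f}  (reported {b_ow_cons})")
```

If the implied values agree with the reported ones up to rounding of the displayed digits, the two tables describe the same fitted model in two parameterizations. That would also be consistent with the identical Root MSE and the identical lnnc and n1_ln coefficients in both tables.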

In fact you can make use of these two specifications. Consider the
following exercise. Let's say that w is the wage, male=1 for men and
0 for women, and female=1 for women and 0 for men. If the average
wage is 1200 for men and 1000 for women, and you run the model
without the constant, you will get:

w(hat)=1200*male+1000*female

But if you omit male and use a constant (in order to avoid the dummy
variable trap), you get

w(hat)=1200-200*female

The second specification is more common because it permits you to test
directly whether the wage difference across genders is significant.
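
A minimal sketch of the wage exercise above, using only the Python standard library. The six wages are hypothetical numbers chosen so that the group means are 1200 and 1000; everything else follows the two specifications described in the text.

```python
from statistics import mean

# Hypothetical data: three men (mean wage 1200) and three women (mean wage 1000).
wages = [1100.0, 1300.0, 1200.0, 900.0, 1100.0, 1000.0]
male = [1, 1, 1, 0, 0, 0]
female = [1 - m for m in male]

# Specification 1: no constant, both dummies.
# The dummies are orthogonal (male*female == 0 for every row), so each OLS
# coefficient reduces to sum(d*w) / sum(d*d), i.e. the group mean.
b_male = sum(m * w for m, w in zip(male, wages)) / sum(m * m for m in male)
b_female = sum(f * w for f, w in zip(female, wages)) / sum(f * f for f in female)

# Specification 2: constant plus the female dummy (simple-regression formulas).
f_bar, w_bar = mean(female), mean(wages)
slope = (sum((f - f_bar) * (w - w_bar) for f, w in zip(female, wages))
         / sum((f - f_bar) ** 2 for f in female))
intercept = w_bar - slope * f_bar

print("no constant  :", b_male, b_female)
print("with constant:", intercept, slope)
```

In this sketch the intercept of the second specification equals the male coefficient of the first, and the slope equals the female coefficient minus the male coefficient, so the two fits give the same predicted wages.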

On Mon, Jul 29, 2013 at 9:10 AM, FU Youyan <s1150901@sms.ed.ac.uk> wrote:
> Dear Statalist users,
>
> I am encountering a strange multicollinearity problem when I run a regression in Stata. The problem is illustrated below. I would very much appreciate it if any of you could answer my question.
>
>
> *****************************************************************************************************
> note: r_ew omitted because of collinearity
>
> Linear regression                                      Number of obs =     159
>                                                        F(  3,   155) =   73.74
>                                                        Prob > F      =  0.0000
>                                                        R-squared     =  0.4900
>                                                        Root MSE      =  .88944
>
> ------------------------------------------------------------------------------
>              |               Robust
>        n2_ln |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>         r_ow |  -6.150886   1.861984    -3.30   0.001    -9.829026   -2.472746
>         r_ew |          0       (omitted)
>         lnnc |   .1853104   .0502188     3.69   0.000     .0861089    .2845119
>        n1_ln |   .2328174   .0912362     2.55   0.012     .0525905    .4130443
>        _cons |   1.945399   .5489629     3.54   0.001     .8609843    3.029813
> ------------------------------------------------------------------------------
>
> In the above regression table, r_ew is omitted because of the perfect negative collinearity between r_ow and r_ew (the correlation table is shown below). The relationship between these two variables is r_ow + r_ew = 0.2407656, so each is an exact linear function of the other.
>
>
>              |    n2_ln     r_ow     r_ew     lnnc    n1_ln
> -------------+---------------------------------------------
>        n2_ln |   1.0000
>         r_ow |  -0.6565   1.0000
>         r_ew |   0.6565  -1.0000   1.0000
>         lnnc |   0.4587  -0.4285   0.4285   1.0000
>        n1_ln |   0.6419  -0.8468   0.8468   0.4103   1.0000
>
> However, r_ew is not omitted when I run exactly the same regression without an intercept.
>
>
> Linear regression                                      Number of obs =     159
>                                                        F(  4,   155) =  441.13
>                                                        Prob > F      =  0.0000
>                                                        R-squared     =  0.8909
>                                                        Root MSE      =  .88944
>
> ------------------------------------------------------------------------------
>              |               Robust
>        n2_ln |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>         r_ow |   1.929168   .8763971     2.20   0.029     .1979442    3.660391
>         r_ew |   8.080053   2.280073     3.54   0.001     3.576027    12.58408
>         lnnc |   .1853104   .0502188     3.69   0.000     .0861089    .2845119
>        n1_ln |   .2328174   .0912363     2.55   0.012     .0525905    .4130443
> ------------------------------------------------------------------------------
>
> My question is: why does Stata not omit r_ew when the intercept term is excluded? And is the regression result without an intercept valid?
>
>
> Thanks for your help.
> Youyan
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/



--
Dr. Yuval Arbel
School of Business
Carmel Academic Center
4 Shaar Palmer Street,
Haifa 33031, Israel
e-mail1: yuval.arbel@carmel.ac.il
e-mail2: yuval.arbel@gmail.com
You can access my latest paper on SSRN at:  http://ssrn.com/abstract=2263398
You can access previous papers on SSRN at: http://ssrn.com/author=1313670