Statalist



Re: st: Dummy variable p value question


From   "Joseph Coveney" <[email protected]>
To   "Statalist" <[email protected]>
Subject   Re: st: Dummy variable p value question
Date   Mon, 1 Dec 2008 10:39:35 +0900

Jimmy Verner wrote:

Suppose you have an interval dependent variable Y, an interval
independent variable B and a nominal variable C.  C has four
categories, C1, C2, C3 and C4.  C is coded by four dummy variables, C1
through C4, with the value 1 when "in play" and the value 0 otherwise.

One may regress Y on B and C1 through C4 by dropping the constant:

Model A:  reg Y B C1 C2 C3 C4, nocon

Alternatively, one may keep the constant but drop a category to avoid
falling into the dummy variable trap.  The constant replaces the
dropped category:

Model B:  reg Y B C1 C2 C3

If what I have said is correct, why are the p values different for C1
through C3 between the two models?  And should not the p value for C4
in Model A be the same as for the constant in Model B?

--------------------------------------------------------------------------------

First question:

The coefficients on C1 through C4 in the first parameterization (Model A)
are the means of Y for each category adjusted for B, that is, each
category's predicted Y at B = 0.  The coefficients on C1 through C3 in the
second parameterization (Model B) are the differences between the adjusted
means for categories 1 through 3 and the adjusted mean for category 4.
(See the coefficients below.)  The null hypotheses tested by the first
parameterization are that the adjusted means are equal to zero.  Those
tested by the second are that the adjusted means for categories 1 through 3
are equal to that for category 4, i.e., that the differences are zero.
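
The relation between the two parameterizations can be written out
explicitly.  This is only a sketch; the symbols \alpha_k, \gamma_k and
\beta are notation introduced here for illustration and do not appear in
the original question.

```latex
% Model A (no constant): one intercept per category
Y_i = \beta B_i + \sum_{k=1}^{4} \alpha_k C_{ki} + \varepsilon_i

% Model B (constant, category 4 omitted)
Y_i = \gamma_0 + \beta B_i + \sum_{k=1}^{3} \gamma_k C_{ki} + \varepsilon_i

% Because C_{1i} + C_{2i} + C_{3i} + C_{4i} = 1 for every observation,
% the two models have the same fit, with
\gamma_0 = \alpha_4, \qquad \gamma_k = \alpha_k - \alpha_4 \quad (k = 1, 2, 3)

% Null hypotheses behind the reported t tests
\text{Model A: } H_0\colon \alpha_k = 0
\qquad
\text{Model B: } H_0\colon \gamma_k = \alpha_k - \alpha_4 = 0
```

Since \alpha_k = 0 and \alpha_k - \alpha_4 = 0 are different hypotheses,
the p values for C1 through C3 need not agree between the two models.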

Second question:

Yes.  The C4 row in Model A and the _cons row in Model B are identical
(see the results below).

Joseph Coveney

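* Use the auto dataset shipped with Stata as a stand-in for Y, B and C
sysuse auto, clear
rename mpg Y
rename weight B
* Merge category 5 into category 4 so that rep78 has four categories
recode rep78 (5=4)
* Create the indicator variables C1 through C4
tabulate rep78, generate(C)
* Model A, then Model B
regress Y B C1 C2 C3 C4, noconstant
regress Y B C1 C2 C3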

Results:

. regress Y B C1 C2 C3 C4, noconstant

      Source |       SS       df       MS              Number of obs =      69
-------------+------------------------------           F(  5,    64) =  515.31
       Model |   32800.265     5  6560.05299           Prob > F      =  0.0000
    Residual |  814.735035    64  12.7302349           R-squared     =  0.9758
-------------+------------------------------           Adj R-squared =  0.9739
       Total |       33615    69  487.173913           Root MSE      =  3.5679

------------------------------------------------------------------------------
           Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           B |  -.0057832   .0005962    -9.70   0.000    -.0069744   -.0045921
          C1 |   38.92806    3.12755    12.45   0.000     32.68006    45.17606
          C2 |   38.52056   2.364302    16.29   0.000     33.79732    43.24379
          C3 |   38.51226   2.072076    18.59   0.000     34.37281    42.65171
          C4 |   39.22498    1.72017    22.80   0.000     35.78854    42.66141
------------------------------------------------------------------------------

. regress Y B C1 C2 C3

      Source |       SS       df       MS              Number of obs =      69
-------------+------------------------------           F(  4,    64) =   29.96
       Model |  1525.46786     4  381.366966           Prob > F      =  0.0000
    Residual |  814.735035    64  12.7302349           R-squared     =  0.6519
-------------+------------------------------           Adj R-squared =  0.6301
       Total |   2340.2029    68  34.4147485           Root MSE      =  3.5679

------------------------------------------------------------------------------
           Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           B |  -.0057832   .0005962    -9.70   0.000    -.0069744   -.0045921
          C1 |   -.296918   2.621481    -0.11   0.910    -5.533929    4.940093
          C2 |  -.7044196   1.483296    -0.47   0.636    -3.667644    2.258805
          C3 |  -.7127189   1.003684    -0.71   0.480    -2.717809    1.292371
       _cons |   39.22498    1.72017    22.80   0.000     35.78854    42.66141
------------------------------------------------------------------------------
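
The contrasts reported by the second fit can also be obtained from the
first fit.  The following is a minimal sketch, assuming the data-preparation
commands above (sysuse through tabulate) have already been run; the choice
of lincom and test is a suggestion added here, not part of the original
reply.

```stata
* Refit Model A without displaying the table again
quietly regress Y B C1 C2 C3 C4, noconstant

* Differences between each adjusted mean and that of category 4;
* these correspond to the C1-C3 rows of Model B
lincom C1 - C4
lincom C2 - C4
lincom C3 - C4

* Wald test of the same hypothesis as the C1 row of Model B
test C1 = C4
```

Because the two models are reparameterizations of one another, the
estimates, standard errors and p values from lincom should match the C1
through C3 rows of the second regression.  For example, 38.92806 - 39.22498
is approximately -.29692, which is the C1 coefficient in Model B.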

