
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: Strange Behaviour When Selecting Levels For Factor Variables In Regression With i#


From   "Sarah Edgington" <[email protected]>
To   <[email protected]>
Subject   RE: st: Strange Behaviour When Selecting Levels For Factor Variables In Regression With i#
Date   Tue, 22 Jan 2013 09:17:51 -0800

Just wanted to quickly revisit this.
Now that I'm back at a computer with Stata 12 after being away for the
weekend, I can address it a bit more concretely.
Richard's output is exactly what I would expect to see.  My output looks
different from Richard's, and I suspect Daniel's did too.  (This highlights
the importance of including output, not just the syntax entered.)

I'm going to guess this is an issue with Stata not being fully up to date
rather than a corrupt installation.  I get the same results on two
computers (one of which is not attached to any network, and Stata was
installed on the two machines at different times).  One has a non-updated
version of Stata 12 and one has the Aug 8, 2012 update.  (Yes, in an ideal
world both computers would be completely updated, but since we don't have
admin rights on our own machines I'm sort of at the mercy of other people
for whom Stata updates are not the highest priority.)
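
A minimal sketch for checking how current a Stata installation is, in case
it helps anyone comparing machines.  It uses only built-in commands; the
last one needs internet access, so it would not work on a machine that is
off the network.

```stata
* Report the Stata version, flavor, and build ("born") date of the executable.
about

* The same information is available programmatically as c-class values.
display "Stata version: " c(stata_version)
display "Born date:     " c(born_date)

* Compare the installed executable and ado-files with the latest
* available updates (requires internet access).
update query
```

Including the born date alongside any posted output makes it easier to tell
whether two people are running the same build.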

As an aside, if I use the ib1.sex and ib0.sex syntax, I get the results I'd
expect (that is, the results Richard posted).
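
For reference, a short sketch of the ib#. base-level syntax mentioned
above, assuming the bplong data from the original question.  The comments
describe what the operators request; they are not a claim about how any
particular build behaves.

```stata
sysuse bplong, clear

* ib0.sex: level 0 is the base, so the indicator for level 1 (1.sex)
* enters the model.
regress bp ib0.sex i.when c.patient i.when#c.patient

* ib1.sex: level 1 is the base, so the indicator for level 0 (0.sex)
* enters the model.
regress bp ib1.sex i.when c.patient i.when#c.patient
```

Both commands fit the same model.  Only the sign of the sex coefficient and
the value of the constant should differ between them.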

Here's what the regressions look like for me:

. regress bp i.sex   i.when  c.patient   i.when#c.patient

      Source |       SS       df       MS              Number of obs =     240
-------------+------------------------------           F(  4,   235) =   21.29
       Model |  10881.7115     4  2720.42787           Prob > F      =  0.0000
    Residual |  30031.0843   235  127.791848           R-squared     =  0.2660
-------------+------------------------------           Adj R-squared =  0.2535
       Total |  40912.7958   239  171.183246           Root MSE      =  11.305

--------------------------------------------------------------------------------
            bp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
         1.sex |  -24.86705   2.919115    -8.52   0.000    -30.61803   -19.11608
        2.when |  -4.519608   2.937149    -1.54   0.125    -10.30611    1.266899
       patient |   .3029286   .0471077     6.43   0.000     .2101214    .3957359
               |
when#c.patient |
            2  |  -.0094555   .0421309    -0.22   0.823     -.092458    .0735469
               |
         _cons |   150.5563    2.20753    68.20   0.000     146.2073    154.9054
--------------------------------------------------------------------------------

. regress bp i1.sex    i.when  c.patient   i.when#c.patient

      Source |       SS       df       MS              Number of obs =     240
-------------+------------------------------           F(  4,   235) =    0.00
       Model |           0     4           0           Prob > F      =  1.0000
    Residual |  40912.7958   235  174.097004           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0170
       Total |  40912.7958   239  171.183246           Root MSE      =  13.195

--------------------------------------------------------------------------------
            bp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
         1.sex |  -108.1599   7.189742   -15.04   0.000    -122.3245   -93.99531
        2.when |  -87.68457   6.504943   -13.48   0.000       -100.5   -74.86912
       patient |   .6568478   .0562303    11.68   0.000     .5460678    .7676277
               |
when#c.patient |
            2  |   1.365172   .1037678    13.16   0.000     1.160738    1.569606
               |
         _cons |   170.7907   2.593321    65.86   0.000     165.6815    175.8998
--------------------------------------------------------------------------------

. regress bp i0.sex    i.when  c.patient   i.when#c.patient
note: 2.when#c.patient omitted because of collinearity

      Source |       SS       df       MS              Number of obs =     240
-------------+------------------------------           F(  3,   236) =   28.48
       Model |  10875.2747     3  3625.09155           Prob > F      =  0.0000
    Residual |  30037.5212   236  127.277632           R-squared     =  0.2658
-------------+------------------------------           Adj R-squared =  0.2565
       Total |  40912.7958   239  171.183246           Root MSE      =  11.282

--------------------------------------------------------------------------------
            bp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
         0.sex |   24.86705   2.913236     8.54   0.000     19.12778    30.60632
        2.when |  -5.091667   1.456466    -3.50   0.001    -7.961003   -2.222331
       patient |   .2982009   .0420504     7.09   0.000     .2153588     .381043
               |
when#c.patient |
            2  |          0  (omitted)
               |
         _cons |   125.9753   4.009148    31.42   0.000      118.077    133.8736
--------------------------------------------------------------------------------
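
A compact way to compare the three specifications side by side, sketched
under the assumption that the bplong data are used as in the original
question.  The stored-estimate names (m_default, m_i1, m_i0) are
hypothetical labels chosen for this illustration.

```stata
sysuse bplong, clear

* Fit the three specifications quietly and store each set of estimates.
quietly regress bp i.sex i.when c.patient i.when#c.patient
estimates store m_default

quietly regress bp i1.sex i.when c.patient i.when#c.patient
estimates store m_i1

quietly regress bp i0.sex i.when c.patient i.when#c.patient
estimates store m_i0

* One column per model: coefficients, standard errors, N, and R-squared.
estimates table m_default m_i1 m_i0, b(%9.4f) se(%9.4f) stats(N r2)
```

If the three commands really fit the same model, the rows for 2.when,
patient, and the interaction should agree across columns, and only the sex
row and _cons should differ.  In the output Richard posted, for example,
the constant for the i0.sex model equals the default constant plus the
1.sex coefficient: 150.5563 - 24.86705 = 125.68925, reported as 125.6893.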




-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Richard Williams
Sent: Friday, January 18, 2013 10:48 PM
To: [email protected]; [email protected]
Subject: Re: st: Strange Behaviour When Selecting Levels For Factor
Variables In Regression With i#

At 06:45 PM 1/18/2013, [email protected] wrote:
>Hello,
>
>When I use the i. operator to select a level of a factor variable, as in 
>i1.varname, in a regression, I get strange results.
>
>For example:
>
>sysuse blong,clear
>regress bp i.sex    i.when  c.patient   i.when#c.patient
>regress bp i1.sex  i.when  c.patient   i.when#c.patient
>regress bp i0.sex  i.when  c.patient   i.when#c.patient
>
>This regression makes no substantive sense, but in theory all three 
>commands should estimate the same model and give the same results except 
>for the variable sex, because all I do is request an indicator for a 
>different level of the 2-level variable sex.
>But if I run these lines I get three regressions with different 
>coefficients for the variables "when" and "patient", even though I didn't 
>change anything that should be related to these variables.
>What's wrong here?
>
>regards
>Daniel

First off, I think you mean bplong.

Second, it seems to work fine for me. Are you leaving something out? 
Could your version of Stata be corrupted or out of date? I'm sure the
problem is at your end because everything seems ok on mine. I'll just go
ahead and give all the output below.

. sysuse bplong.dta, clear
(fictional blood-pressure data)

. regress bp i.sex i.when c.patient i.when#c.patient

      Source |       SS       df       MS              Number of obs =     240
-------------+------------------------------           F(  4,   235) =   21.29
       Model |  10881.7115     4  2720.42787           Prob > F      =  0.0000
    Residual |  30031.0843   235  127.791848           R-squared     =  0.2660
-------------+------------------------------           Adj R-squared =  0.2535
       Total |  40912.7958   239  171.183246           Root MSE      =  11.305

--------------------------------------------------------------------------------
            bp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
         1.sex |  -24.86705   2.919115    -8.52   0.000    -30.61803   -19.11608
        2.when |  -4.519608   2.937149    -1.54   0.125    -10.30611    1.266899
       patient |   .3029286   .0471077     6.43   0.000     .2101214    .3957359
               |
when#c.patient |
            2  |  -.0094555   .0421309    -0.22   0.823     -.092458    .0735469
               |
         _cons |   150.5563    2.20753    68.20   0.000     146.2073    154.9054
--------------------------------------------------------------------------------

. regress bp i1.sex i.when c.patient i.when#c.patient

      Source |       SS       df       MS              Number of obs =     240
-------------+------------------------------           F(  4,   235) =   21.29
       Model |  10881.7115     4  2720.42787           Prob > F      =  0.0000
    Residual |  30031.0843   235  127.791848           R-squared     =  0.2660
-------------+------------------------------           Adj R-squared =  0.2535
       Total |  40912.7958   239  171.183246           Root MSE      =  11.305

--------------------------------------------------------------------------------
            bp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
         1.sex |  -24.86705   2.919115    -8.52   0.000    -30.61803   -19.11608
        2.when |  -4.519608   2.937149    -1.54   0.125    -10.30611    1.266899
       patient |   .3029286   .0471077     6.43   0.000     .2101214    .3957359
               |
when#c.patient |
            2  |  -.0094555   .0421309    -0.22   0.823     -.092458    .0735469
               |
         _cons |   150.5563    2.20753    68.20   0.000     146.2073    154.9054
--------------------------------------------------------------------------------

. regress bp i0.sex i.when c.patient i.when#c.patient

      Source |       SS       df       MS              Number of obs =     240
-------------+------------------------------           F(  4,   235) =   21.29
       Model |  10881.7115     4  2720.42787           Prob > F      =  0.0000
    Residual |  30031.0843   235  127.791848           R-squared     =  0.2660
-------------+------------------------------           Adj R-squared =  0.2535
       Total |  40912.7958   239  171.183246           Root MSE      =  11.305

--------------------------------------------------------------------------------
            bp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
         0.sex |   24.86705   2.919115     8.52   0.000     19.11608    30.61803
        2.when |  -4.519608   2.937149    -1.54   0.125    -10.30611    1.266899
       patient |   .3029286   .0471077     6.43   0.000     .2101214    .3957359
               |
when#c.patient |
            2  |  -.0094555   .0421309    -0.22   0.823     -.092458    .0735469
               |
         _cons |   125.6893   4.214552    29.82   0.000     117.3862    133.9924
--------------------------------------------------------------------------------

.



-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  [email protected]
WWW:    http://www.nd.edu/~rwilliam

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


