# st: AW: Adjust after regression involving categorical variables

From: "Martin Weiss"; Subject: st: AW: Adjust after regression involving categorical variables; Date: Wed, 28 Oct 2009 13:18:58 +0100

```

Maybe this FAQ will assist you:

I am not sure what the question really is. You are conducting two different
prediction exercises, and they lead to different outcomes, just as you would
expect. In the first one, -adjust- leaves the -rep78- dummies at their observed
values in the dataset, so the domestic and foreign predictions also reflect
each group's own mix of repair records. In the second one, it sets each dummy
to its mean. Note that the mean of a 0/1 dummy is the proportion of cars in
that category, i.e. the proportions that emerge from -proportion rep78-.

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of
Gillian.Frost@hsl.gov.uk
Sent: Wednesday, 28 October 2009 12:44
To: statalist@hsphsun2.harvard.edu
Subject: st: Adjust after regression involving categorical variables

Dear all,

I am struggling to understand the -adjust- command after regression
involving categorical variables.  My aim in using -adjust- is to obtain
the predicted values adjusted for the categorical variable, but I am not
explicitly interested in the categorical variable and so do not want it
appearing in the -by()- option of -adjust-.  I have been unable to find
any examples of this kind of use of -adjust-.  I have reproduced my query
using the auto dataset below. I am using Stata 10.1 SE.

sysuse auto, clear
descr

** just for this example, assume that rep78 is categorical
xi: regress price weight turn i.rep i.foreign

** output
       price |      Coef.  Std. Err.       t   P>|t|  [95% Conf.  Interval]
-------------+-------------------------------------------------------------
      weight |   4.243125   .6699849    6.33   0.000    2.903407   5.582842
        turn |  -208.6987   125.9326   -1.66   0.103   -460.5164   43.11914
   _Irep78_2 |   822.0914   1691.818    0.49   0.629   -2560.907    4205.09
   _Irep78_3 |    710.281     1560.7    0.46   0.651   -2410.531   3831.093
   _Irep78_4 |   341.2531   1631.858    0.21   0.835   -2921.848   3604.355
   _Irep78_5 |   876.4049   1740.224    0.50   0.616   -2603.387   4356.197
 _Iforeign_1 |   3239.838   859.1453    3.77   0.000    1521.871   4957.805
       _cons |  -32.54137   4097.528   -0.01   0.994   -8226.054   8160.972

** want the predicted values by foreign - not specifically interested in
** rep78 but want to adjust for it, and I am unsure how to treat rep78
** option 1 - set continuous values to mean but leave rep78 as is
adjust weight turn, by(foreign)

** output
----------------------
 Car type |         xb
----------+-----------
 Domestic |    5164.18
  Foreign |    8390.29
----------------------

** However, you see that 8390.29-5164.18=3226.11, and not the 3239.838
** estimated for _Iforeign_1 in the model above

** option 2 - treat dummies created by -xi- as continuous, and also set
** them to their mean
adjust weight turn  _Irep78_2 _Irep78_3 _Irep78_4 _Irep78_5, by(foreign)

** output
----------------------
 Car type |         xb
----------+-----------
 Domestic |    5160.01
  Foreign |    8399.84
----------------------

You see that the final -adjust- command gives 8399.84-5160.01=3239.83, which
matches the _Iforeign_1 coefficient (3239.838) from the regression model above
up to rounding.  So it appears that the second treatment of the categorical
variable gives the 'correct' predictions.  However, I am struggling to
interpret exactly what this means for rep78: does it make sense to set
variables that are 0/1 to their mean?

I would be extremely grateful for any assistance with this.

Many thanks,

Gillian

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
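The point made in the reply can be checked directly. The following is a minimal sketch, not a definitive recipe. It assumes the auto dataset shipped with Stata and the `_Irep78_*` dummy names that `xi` created in the thread. It also assumes that `adjust` takes its means over the estimation sample. The rep78 counts in the comments are an assumption to confirm against the `tabulate` output before relying on them. Since `weight` and `turn` are set to the same means in both groups, only the rep78 terms and the `_Iforeign_1` coefficient enter the gap.

```stata
* Sketch only: assumes Stata's shipped auto data and the -xi- dummy names
* shown in the regression output in the thread. Run as a do-file, because
* the /// line continuation is not accepted at the interactive prompt.
sysuse auto, clear
xi: regress price weight turn i.rep78 i.foreign

* Means of the 0/1 dummies over the estimation sample. These are the values
* option 2 plugs in; they should match the shares reported by -proportion-.
summarize _Irep78_* if e(sample)
proportion rep78 if e(sample)

* Option 1 leaves rep78 as observed, so each group keeps its own rep78 mix.
tabulate rep78 foreign if e(sample), column

* Rebuild the option 1 gap from the coefficients and each group's shares.
* In each difference the Foreign share comes first, the Domestic share second.
* Assumed counts (verify with the table above), for rep78 = 2, 3, 4, 5:
*   Domestic 8, 27, 9, 2 out of 48;  Foreign 0, 3, 9, 9 out of 21.
display 3239.838 + 822.0914*(0/21 - 8/48) + 710.281*(3/21 - 27/48) ///
    + 341.2531*(9/21 - 9/48) + 876.4049*(9/21 - 2/48)
```

With those counts the last line evaluates to roughly 3226.11, the option 1 gap reported in the thread. Under option 2 both groups are assigned the same rep78 shares, so the share differences are all zero and the gap reduces to the `_Iforeign_1` coefficient alone.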