# st: AW: Adjust after regression involving categorical variables

Maybe this FAQ will assist you:

I am not sure what the question really is. You are running two different
prediction exercises, and they lead to different results, just as you would
expect. In the first one, Stata holds -rep78- at its values in the dataset;
in the second one, it sets the -rep78- dummies to their means. Note that
these means are the proportions reported by -proportion rep78-...

Dear all,

I am struggling to understand the -adjust- command after regression
involving categorical variables.  My aim in using -adjust- is to obtain
the predicted values adjusted for the categorical variable, but I am not
explicitly interested in the categorical variable and so do not want it
appearing in the -by()- option of -adjust-.  I have been unable to find
any examples of this kind of use of -adjust-.  I have reproduced my query
using the auto dataset below. I am using Stata 10.1 SE.

sysuse auto, clear
descr

** just for this example, assume that rep78 is categorical
xi: regress price weight turn i.rep i.foreign

** output
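------------------------------------------------------------------------------
       price |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   4.243125   .6699849     6.33   0.000     2.903407    5.582842
        turn |  -208.6987   125.9326    -1.66   0.103    -460.5164    43.11914
   _Irep78_2 |   822.0914   1691.818     0.49   0.629    -2560.907     4205.09
   _Irep78_3 |    710.281     1560.7     0.46   0.651    -2410.531    3831.093
   _Irep78_4 |   341.2531   1631.858     0.21   0.835    -2921.848    3604.355
   _Irep78_5 |   876.4049   1740.224     0.50   0.616    -2603.387    4356.197
 _Iforeign_1 |   3239.838   859.1453     3.77   0.000     1521.871    4957.805
       _cons |  -32.54137   4097.528    -0.01   0.994    -8226.054    8160.972
------------------------------------------------------------------------------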

** want the predicted values by foreign - not specifically interested in
** rep78 but wanted to adjust for it, but I am unsure as to how to treat
** rep78
** option 1 - set continuous values to mean but leave rep78 as is
adjust weight turn, by(foreign)

** output
----------------------
Car type |         xb
---------+------------
Domestic |    5164.18
 Foreign |    8390.29
----------------------

** However, you see that 8390.29-5164.18=3226.11, and not 3239.838 as
** predicted by the model above

** option 2 - treat dummies created by -xi- as continuous, and also set
** them to their mean
adjust weight turn  _Irep78_2 _Irep78_3 _Irep78_4 _Irep78_5, by(foreign)

** output
----------------------
Car type |         xb
---------+------------
Domestic |    5160.01
 Foreign |    8399.84
----------------------

You see that the final -adjust- command gives 8399.84-5160.01=3239.83, as
given by the regression model above.  So it appears that the second
treatment of the categorical variable gives the 'correct' predictions.
However, I am struggling to interpret exactly what this means for rep78.
Does it make sense to set variables that are 0/1 to their mean?
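
One way to see where the two gaps come from is to split the option 1 gap into the foreign coefficient plus a term for the different rep78 mix of domestic and foreign cars. This is a sketch only, again assuming that -adjust- averages the linear prediction within each -by()- group over the estimation sample; the scalar names gap, pf, and pd are made up for illustration.

```stata
sysuse auto, clear
xi: regress price weight turn i.rep78 i.foreign

* start from the foreign coefficient, which is the whole gap under option 2
scalar gap = _b[_Iforeign_1]

* add each rep78 coefficient times the difference in that category's
* share between foreign and domestic cars
foreach v of varlist _Irep78_* {
    quietly summarize `v' if e(sample) & foreign == 1
    scalar pf = r(mean)
    quietly summarize `v' if e(sample) & foreign == 0
    scalar pd = r(mean)
    scalar gap = gap + _b[`v'] * (pf - pd)
}

display "option 1 gap (coefficient + rep78 mix): " gap
display "option 2 gap (coefficient only):        " _b[_Iforeign_1]
```

Under option 2 both groups are given the same rep78 proportions, so the mix term cancels and only the coefficient remains. Read this way, setting a 0/1 dummy to its mean amounts to evaluating the prediction at the sample share of that category.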

I would be extremely grateful for any assistance with this.

Many thanks,

Gillian
