## Factor variables and value labels

### Highlights

• Value labels on factor variables now appear in estimation output
• Value labels appear on contrasts, margins, and pairwise comparisons

### Show me

A factor variable might be

• attitude measured on a scale of 1 to 5,
• agegrp recorded 1 to 4, 1 being 20-30, 2 being 31-40, ...
• region being 1 (North East), 2 (North Central), ...

When you fit a model, Stata allows factor-variable notation. You can type

i.attitude

to obtain the levels of factor variable attitude.

i.attitude#c.age

to obtain the levels of attitude interacted with continuous variable age

i.attitude##c.age

to mean i.attitude age i.attitude#c.age

i.attitude#i.agegrp

to obtain the levels of attitude interacted with the levels of agegrp

i.attitude##i.agegrp

to mean i.attitude i.agegrp i.attitude#i.agegrp

i.attitude#i.agegrp#i.region

to obtain the levels of attitude interacted with the levels of agegrp interacted with the levels of region

i.attitude##i.agegrp##i.region

to mean i.attitude i.agegrp i.region i.attitude#i.agegrp i.attitude#i.region i.agegrp#i.region i.attitude#i.agegrp#i.region

i.(attitude agegrp)

to mean i.attitude i.agegrp

i.(attitude agegrp)##i.region

to mean i.attitude##i.region i.agegrp##i.region

and so on.
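
One way to check an expansion like those above is to fit the model both ways, or to ask Stata to list the expanded terms. The sketch below is only an illustration. It assumes the nlsw88 example dataset that ships with Stata, and its variables race, tenure, and wage stand in for attitude, age, and y, which are not available here.

```stata
* Illustration only: nlsw88 ships with Stata; race, tenure, and wage
* stand in for the attitude, age, and y variables used in the text.
sysuse nlsw88, clear

* Shorthand: main effects plus the interaction
regress wage i.race##c.tenure

* Spelled out: this should fit the same model as the command above
regress wage i.race tenure i.race#c.tenure

* fvexpand lists the individual terms that a factor-variable
* expression stands for and stores them in r(varlist)
fvexpand i.race##c.tenure
display "`r(varlist)'"
```

Comparing the two regression tables, or reading the list left in r(varlist), is a quick way to confirm what a ## expression expands to before using it in a larger model.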

Stata also has value labels. You might type

. label define regions 1 "North East" 2 "North Central" 3 "South" 4 "West"

. label values region regions
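
To see what these two commands do, it can help to run them on a small dataset and inspect the result. The following is a minimal sketch under stated assumptions: the eight-observation toy dataset is made up for illustration, and only the label name regions and the variable name region come from the text.

```stata
* Toy data for illustration only: 8 observations cycling through regions 1-4
clear
set obs 8
generate region = mod(_n - 1, 4) + 1

* Define the mapping once, then attach it to the variable
label define regions 1 "North East" 2 "North Central" 3 "South" 4 "West"
label values region regions

* Show the value-to-label mapping
label list regions

* Tabulate the variable; categories are shown by label
tabulate region

* Show the same table with the underlying numeric codes
tabulate region, nolabel
```

The stored data remain the numbers 1 through 4. The labels only change how those numbers are displayed.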


In Stata 13, when you fit a model using factor-variable notation, the labels appear in the output:

. regress  y  i.attitude i.agegrp i.region

      Source |       SS       df       MS              Number of obs =     400
-------------+------------------------------           F( 10,   389) =   22.60
       Model |  2668.04079    10  266.804079           Prob > F      =  0.0000
    Residual |  4592.44366   389  11.8057678           R-squared     =  0.3675
-------------+------------------------------           Adj R-squared =  0.3512
       Total |  7260.48445   399  18.1967029           Root MSE      =   3.436

---------------------------------------------------------------------------------
              y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
       attitude |
      disagree  |    1.27901   .5617435     2.28   0.023     .1745764    2.383443
       neutral  |   1.466543   .5304032     2.76   0.006     .4237268    2.509358
         agree  |   2.063136   .5326997     3.87   0.000     1.015805    3.110467
strongly agree  |   3.550927   .5801312     6.12   0.000     2.410343    4.691512
                |
         agegrp |
         31-40  |   2.114168   .4868806     4.34   0.000     1.156921    3.071414
         41-50  |   3.970627   .4866537     8.16   0.000     3.013826    4.927428
           50+  |   5.990408   .4869362    12.30   0.000     5.033052    6.947764
                |
         region |
 North Central  |    .673176   .4913976     1.37   0.172    -.2929515    1.639304
         South  |  -1.366099    .491862    -2.78   0.006     -2.33314   -.3990588
          West  |  -1.477714   .4890703    -3.02   0.003    -2.439266   -.5161623
                |
          _cons |   8.411983   .5760115    14.60   0.000     7.279498    9.544468
---------------------------------------------------------------------------------
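
If the numeric level values are preferred in a table like this one, the display can be switched on replay. The sketch below is an illustration under stated assumptions: it uses the nlsw88 example dataset shipped with Stata, with wage, race, and collgrad standing in for the variables above, and it relies on the nofvlabel and baselevels display options as documented for estimation commands.

```stata
* Illustration only: nlsw88 ships with Stata; race and collgrad
* carry value labels, as attitude, agegrp, and region do in the text.
sysuse nlsw88, clear

* Fit the model; value labels appear in the coefficient table
regress wage i.race i.collgrad

* Replay the results showing numeric level values instead of labels
regress, nofvlabel

* Replay the results with the base (omitted) level of each factor shown
regress, baselevels
```

Replaying does not refit the model. It only redisplays the stored results with different display options.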



Value labels are also used by Stata's postestimation commands. Below we use pwcompare to compare y values for each pairing of the age groups:

. pwcompare agegrp

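Pairwise comparisons of marginal linear predictions

Margins      : asbalanced

-----------------------------------------------------------------
                |                            Unadjusted
                |   Contrast   Std. Err.     [95% Conf. Interval]
----------------+------------------------------------------------
         agegrp |
31-40 vs 20-30  |   2.114168   .4868806      1.156921    3.071414
41-50 vs 20-30  |   3.970627   .4866537      3.013826    4.927428
  50+ vs 20-30  |   5.990408   .4869362      5.033052    6.947764
41-50 vs 31-40  |   1.856459   .4869484      .8990793    2.813839
  50+ vs 31-40  |    3.87624   .4870898      2.918582    4.833898
  50+ vs 41-50  |   2.019781   .4878207      1.060686    2.978876
-----------------------------------------------------------------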



For instance, 31–40 year olds have an average value of y that is 2.11 higher than that of 20–30 year olds, controlling for the other covariates in the model.
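
The same labeling carries over to the other postestimation commands named in the highlights. The following is a hedged sketch rather than a reproduction of the analysis above: it assumes the nlsw88 example dataset shipped with Stata, with wage and race standing in for y and agegrp.

```stata
* Illustration only: nlsw88 ships with Stata; race is a labeled
* factor variable standing in for agegrp from the text.
sysuse nlsw88, clear
regress wage i.race i.collgrad

* Pairwise differences with test statistics and p-values,
* adjusted for multiple comparisons by the Bonferroni method
pwcompare race, effects mcompare(bonferroni)

* Predictive margins for each level of race
margins race

* Each level of race compared with the base (reference) level
contrast r.race
```

Each table identifies the levels by their value labels rather than by their numeric codes.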

### Show me more

To learn more about factor variables, see the manual entry.

To learn more about pwcompare, see its manual entry.

See New in Stata 14 for more about what was added in Stata 14.