Stata: Data Analysis and Statistical Software
   >> Home >> Products >> Capabilities >> Factor variables
order stataorder stata

Factor variables

Stata handles factor (categorical) variables elegantly. You can prefix a variable with i. to specify indicators for each level (category) of the variable. You can put a # between two variables to create an interaction–indicators for each combination of the categories of the variables. You can put ## instead to specify a full factorial of the variables—main effects for each variable and an interaction. If you want to interact a continuous variable with a factor variable, just prefix the continuous variable with c.. You can specify up to eight-way interactions.

We run a linear regression of cholesterol level on a full factorial of age group and whether the person smokes along with a continuous body mass index (bmi) and its interaction with whether the person smokes.

output

We could have used parenthesis binding, to type the same model more briefly:

. regress cholesterol smoker##(agegrp c.bmi)

Base levels can be changed on the fly: i.agegrp uses the default base level of 1, whereas b3.agegrp makes 3 the base level.

The level indicator variables are not created in your dataset, saving lots of space.

Factor variables are integrated deeply into Stata’s processing of variable lists, providing a consistent way of interacting with both estimation and postestimation commands.

Bookmark and Share 
Stata 12
Overview: Why use Stata?
Stata/MP
Capabilities
Overview
Sample session
User-written commands
New in Stata 12
Supported platforms
Which Stata?
Technical support
User comments
Like us on Facebook Follow us on Twitter Follow us on LinkedIn Google+ Watch us on YouTube
Follow us
© Copyright 1996–2013 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index   |   View mobile site