** We'll use data from the National Health and Nutrition Examination Survey (NHANES) for our examples
webuse nhanes2

** Before we fit the model, let's investigate the variables using -codebook-
codebook bmi age female

** Now we can fit the model
regress bmi age female

** We would now like to include -region- in the model; let's take a look at this variable
codebook region

** For example, to add -region- to our model we use
regress bmi age i.female i.region

** We can tell Stata to show the base categories for our factor variables
set showbaselevels on

** The i. operator can be applied to many variables at once
regress bmi age i.(female region)

** We can use region=3 as the base category on the fly
regress bmi age i.female b3.region

** We can use the most prevalent category as the base
regress bmi age i.female b(freq).region

** Base-level operators can also be distributed across many variables
regress bmi age b(freq).(female region)

** The base category can be omitted with -bn.- (with some care; here we also drop the constant)
regress bmi age i.female bn.region, noconstant

** We can also include a term for region=4 only
regress bmi age i.female 4.region

** For example, to fit a model that includes main effects for -age-, -female-, and -region-, as well as the interaction of -female- and -region-
regress bmi age female##region

** To see all the omitted terms we can add the -allbaselevels- option
regress bmi age female##region, allbaselevels

** Here is our model with an interaction between -age- and -region-
regress bmi c.age##region i.female

** For example, we can interact -age- with serum vitamin C levels (-vitaminc-)
regress bmi c.age##c.vitaminc i.female i.region

** For example, a model that includes both -age- and -age- squared
regress bmi c.age##c.age i.female i.region

** Here we fit a model for -bmi- that includes the three-way interaction of the continuous variables -age- and -vitaminc- and the categorical variable -female-
regress bmi c.age##c.vitaminc##female

** Let's begin by running a
** model with main effects for -age-, -female-, and -region-, and the interaction of -female- and -region-
regress bmi age female##region

** To see the coefficient names we can replay the model and show the coefficient legend
regress, coeflegend

** To perform a joint test of the null hypothesis that the coefficients for the levels of -region- are all equal to 0
test 2.region 3.region 4.region

** To test that all of the coefficients associated with the interaction of -female- and -region- are equal to 0, we would need to give the full name of every coefficient
test 1.female#2.region 1.female#3.region 1.female#4.region

** -testparm- lets us perform joint tests with less typing, for example
testparm i.region#i.female

** To test the interaction of -female- and -region- with a likelihood-ratio test, we first need to store our model results. The name is arbitrary; we'll call them -m1-
estimates store m1

** Now we can rerun our model without the interaction of -female- and -region-
regress bmi age i.female i.region

** Now we store the second set of estimates
estimates store m2

** And use the -lrtest- command to perform the likelihood-ratio test
lrtest m1 m2

** We'll restore the results from -m1-
estimates restore m1

** -test- can also be used to test the equality of coefficients
test 3.region#1.female = 4.region#1.female

** For example, to obtain the difference in coefficients
lincom 3.region#1.female - 4.region#1.female

** For example, comparing regions separately for men and women
contrast region@female, effects

** To apply Bonferroni's adjustment to our previous -contrast-
contrast region@female, effects mcompare(bonferroni)

** To obtain the average predicted value of -bmi-
margins

** To obtain the average predicted value of -bmi- at different values of -region-
margins region

** We can obtain -margins- for multiple variables
margins region female

** Or we can obtain predicted values of -bmi- at each combination of -region- and -female-
margins region#female

** And we can plot these predictions
marginsplot

** The -over()- option allows us to obtain predictions separately
** for each group, for example
margins, over(female)

** For this set of examples, we'll fit a model that includes an interaction between the continuous variable -age- and the categorical variable -region-
regress bmi c.age##region i.female

** Let's take a look at how the coefficients are stored
regress, coeflegend

** As before, we can test the null hypothesis that all of the coefficients associated with the interaction of -age- and -region- are equal to 0 using -testparm-
testparm c.age#i.region

** For example, we might want to test whether the slope of -age- is significantly different in the south (region=3) versus the west (region=4)
test 3.region#c.age = 4.region#c.age

** We can use -lincom- to estimate the slope of -age- for the south (region=3)
lincom c.age + 3.region#c.age

** We can also use -margins- with the -dydx()- option to calculate the slope of -age- for each -region-
margins region, dydx(age)

** For example, the predicted value of -bmi- at each level of -region-, setting age=20
margins region, at(age=20) vsquish

** The -at()- option accepts -numlists-, so we aren't restricted to a single value of -age-
margins region, at(age=(20(25)70)) vsquish

** And we can plot the results
marginsplot

** The confidence intervals can make the graph appear messy; we can suppress them
marginsplot, noci

** To obtain tests of differences between levels of -region- at each level of -age-
margins region, at(age=(20(10)70)) vsquish contrast

** For example, to obtain predicted values for each -region- using the observed values of -female- and -age- in that -region-
margins, over(region)

** Before we fit the model, let's take a closer look at -vitaminc-
summ vitaminc, detail

** Now let's fit the model
regress bmi c.age##c.vitaminc i.female i.region

** We can replay the model using -coeflegend-
regress, coeflegend

** We can use -lincom- to calculate the slope for -vitaminc- when age=49 (its median)
lincom vitaminc + c.vitaminc#c.age*49

** We could also calculate the slope of -age-
** when vitaminc=1 (its median)
lincom age + c.vitaminc#c.age*1

** -margins- can produce estimates of the slopes for a range of values
margins, dydx(vitaminc) at(age=(20(10)70)) vsquish

** We can graph the slopes of -vitaminc- across -age-
marginsplot, yline(0)

** Specifying multiple variables in the -at()- option results in predictions at each combination of values
margins, at(age=(20(25)70) vitaminc=(.2(.6)2)) vsquish

** And we can plot these predictions
marginsplot

** We can select which variable appears on the x-axis using the -xdimension()- option
marginsplot, xdimension(vitaminc)

** We'll start by fitting a model that includes -age- and -age- squared
regress bmi c.age##c.age i.female i.region

** Here we predict values of -bmi- at different values of -age-
margins, at(age=(20(10)70)) vsquish

** And graph the predictions
marginsplot

** To obtain the slope of -age- at different values of -age-, we include -age- in both the -dydx()- and -at()- options
margins, dydx(age) at(age=(20(10)70)) vsquish

** The same process can be used with higher-order polynomials; here we add a cubic term for -age-
regress bmi c.age##c.age##c.age i.female i.region

** As before, we can predict slopes at specified values of -age-
margins, dydx(age) at(age=(20(10)70)) vsquish

** And we can easily graph this as well
marginsplot

** We can overlay the observed values using the -addplot()- option
marginsplot, addplot(scatter bmi age, below ///
    legend(order(3 "Observed Values" 2 "Predictions")) ///
    xlabel(20(9)74))

** Let's run a simple model to demonstrate
regress bmi i.region
margins region

** We can draw the margins as points only
marginsplot, recast(scatter)

** Or as bars
marginsplot, recast(bar) plotopts(barwidth(.9))
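
** A possible cross-check, offered as a sketch rather than as part of the walkthrough above.
** It assumes the nhanes2 data are still in memory. For a linear model, -margins, dydx()-
** evaluated at age=49 should reproduce the slope of -vitaminc- that -lincom- computes
** from the main effect plus 49 times the interaction coefficient.
regress bmi c.age##c.vitaminc i.female i.region
lincom vitaminc + c.vitaminc#c.age*49
margins, dydx(vitaminc) at(age=49)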