RE: st: computing odds ratios for models with interraction terms

From   "Fitzmaurice, Ann E." <>
To   "" <>
Subject   RE: st: computing odds ratios for models with interraction terms
Date   Fri, 6 May 2011 10:44:40 +0100

Hi Maarten

Thanks for this. I normally use SPSS but occasionally use Stata because of its greater depth; I should really transfer over to Stata completely.

Will give the below a go

Thanks again


-----Original Message-----
From: [] On Behalf Of maarten buis
Sent: 06 May 2011 08:40
Subject: Re: st: computing odds ratios for models with interraction terms

On Thu, May 5, 2011 at 11:38 PM, Fitzmaurice, Ann E. wrote:
> I use the following, for example:
> logistic outcome i.varA i.varB i.varA#i.varB other variables
> and I obtain the odds ratios for varA, varB and the interaction terms.
> varA has four levels and varB has three levels.
> What I would like to do is generate the odds ratios for the 12 cells in the table of varA by varB.
> Is there a way of doing this in Stata? I would also like the confidence intervals.

You can quite easily do so with the new factor variable notation, the
trick is to use # instead of ##. However you won't be able to fill the
entire 4 by 3 table with odds ratios as you need to define one of
these cells as your reference category. An odds ratio is a comparison
of groups, so it needs to have a group to compare it with. This means
that one of your cells will be fixed at 1. This is what I have done in
the first -logit- model. The reference category is divorced women with
less than high school. The odds ratio for that category is reported as
(base). Alternatively you can replace it with 1 and leave the standard
error empty, as the odds ratio for that group is the odds of attaining
a high occupation within that group divided by the odds of attaining a
high occupation within that group, which is trivially equal to 1.

The coefficient of -baseline- gives you the baseline odds, that is the
odds of attaining a high job for divorced women with less than high
school education. Within that group we expect to find .12 women with a
high job for every woman who does not have a high job. All odds ratios
tell you by what factor the odds in the group that belongs to that odds
ratio differ from the odds in this reference category. For example the
odds increase by a factor 1.78 [i.e. (1.78 - 1)*100% = 78%] when a
divorced woman gets a high school diploma.

You can get coefficients for all your cells, but then you will get
odds, not odds ratios. The trick is to precede the categorical
variables with ibn. instead of i., meaning that you don't want Stata
to leave out the reference category, and you must make sure there is
no constant estimated, i.e. specify the -nocons- option and leave out
the variable baseline. This is what I did in the second -logit- model.
So for divorced women without high school we find that there are .12
women with a high occupation per woman without a high occupation,
which is exactly the same as we found above. For divorced women with
high school there are .22 women with a high occupation per woman
without a high occupation. Notice that if we divide those two odds
[-di .2193878/.1235955-] we get exactly the odds ratio from our first
-logit- model. This should be true, as the two models are just two
different ways of displaying the same model.

*------------------------- begin example -----------------------
// data preparation
sysuse nlsw88, clear

// code marst as 1/2/3 so the values match the labels defined below
gen byte marst = 1 + never_married + 2*married
label define marst 1 "divorced/widowed" ///
                   2 "never married"    ///
                   3 "married"
label value marst marst
label variable marst "marital status"

gen byte edcat = cond(grade <  12, 1,     ///
                 cond(grade == 12, 2, 3)) ///
                 if grade < .
label define edcat 1 "less than high school" ///
                   2 "high school"           ///
                   3 "more than high school"
label value edcat edcat
label variable edcat "education in categories"

gen byte high_occ = occupation < 3 ///
                    if occupation < .
label define high_occ 0 "other" ///
                      1 "professionals and managers"
label value high_occ high_occ

gen byte baseline = 1

// get all odds ratios,
// reference = divorced and less than high school
logit high_occ i.marst#i.edcat baseline, ///
      or nocons baselevels

// get all odds
logit high_occ ibn.marst#ibn.edcat, nocons or
*------------------------ end example ----------------------------
(For more on examples I sent to the Statalist see: )
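
Since the original question also asked for confidence intervals, one
possible follow-up, offered as a sketch that goes beyond the example
above, is to let -lincom- turn the difference between two log odds from
the second model into an odds ratio with its confidence interval. The
coefficient names used below (1.marst#2.edcat and so on) are an
assumption about how Stata stores the cells; if they are not
recognized, use the names displayed by -coeflegend-. The data
preparation is repeated in condensed form, with marst assumed to be
coded 1 to 3 as in the value labels, so that the block runs on its own.

*------------------------- begin example -----------------------
// condensed data preparation (assumption: marst coded 1/2/3,
// 1 = divorced/widowed, 2 = never married, 3 = married)
sysuse nlsw88, clear
gen byte marst = 1 + never_married + 2*married
gen byte edcat = cond(grade <  12, 1,     ///
                 cond(grade == 12, 2, 3)) ///
                 if grade < .
gen byte high_occ = occupation < 3 if occupation < .

// refit the "all odds" model quietly, then replay it with
// -coeflegend- to see under which name each cell is stored
quietly logit high_occ ibn.marst#ibn.edcat, nocons
logit, coeflegend

// ratio of two odds by hand: high school versus less than
// high school among the divorced/widowed
di exp(_b[1.marst#2.edcat]) / exp(_b[1.marst#1.edcat])

// the same odds ratio, now with a confidence interval
lincom 1.marst#2.edcat - 1.marst#1.edcat, or

// any other pair of cells works the same way, e.g. married with
// more than high school versus never married with less than
// high school
lincom 3.marst#3.edcat - 2.marst#1.edcat, or
*------------------------ end example ----------------------------

The first -lincom- should reproduce the odds ratio of about 1.78 from
the first -logit- model, because the difference between two log odds
is the log of their odds ratio.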

Hope this helps,

Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
