|Title||Logistic regression with grouped data|
|Author||William Sribney, StataCorp|
|Date||June 1998; updated January 2000; minor revisions May 2005|
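This is easiest to explain with an example. First, consider the following grouped binary-outcome data, where cases is the number of positive outcomes among the total observations sharing each covariate pattern: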
| |cases|total|x1|x2|
|1.|23|123|0|0|
|2.|12|234|0|1|
|3.|56|248|1|0|
|4.|81|390|1|1|
To use logistic or logit with frequency weights (fweights), the data need to be in long form, with one row for each combination of covariate pattern and outcome value:
| |w|y|x1|x2|
|1.|100|0|0|0|
|2.|23|1|0|0|
|3.|222|0|0|1|
|4.|12|1|0|1|
|5.|192|0|1|0|
|6.|56|1|1|0|
|7.|309|0|1|1|
|8.|81|1|1|1|
You can then run commands such as
. logistic y x1 x2 [fw=w]
To use blogit with the original data, you issue the command
. blogit cases total x1 x2
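This command fits the same model as the logistic command run on the rearranged data. Note that blogit reports coefficients whereas logistic reports odds ratios, so compare against logit if you want the output to look identical.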
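The reason the two approaches agree can be sketched from the log likelihoods. The notation here is introduced only for illustration and is not part of the original FAQ: c_i and n_i stand for cases and total in covariate pattern i, and p_i is the fitted probability of a positive outcome for that pattern.

```latex
% Grouped (binomial) log likelihood over the 4 covariate patterns
\ell_{\text{grouped}}(\beta)
  = \sum_{i=1}^{4} \left[ \log\binom{n_i}{c_i}
      + c_i \log p_i + (n_i - c_i)\log(1 - p_i) \right]

% Frequency-weighted (Bernoulli) log likelihood over the 8 long-form rows
\ell_{\text{fw}}(\beta)
  = \sum_{j=1}^{8} w_j \left[ y_j \log p_j + (1 - y_j)\log(1 - p_j) \right]
  = \sum_{i=1}^{4} \left[ c_i \log p_i + (n_i - c_i)\log(1 - p_i) \right]

% with the logistic link
p_i = \frac{1}{1 + \exp\{-(\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i})\}}
```

The two expressions differ at most by the binomial-coefficient term, which does not depend on the coefficients, so both are maximized at the same estimates and have the same curvature there.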
As a general rule, Stata works most naturally with data in long form, so it is best to convert grouped data to long form right away and work from there.
To do the transformation to long form, use the reshape command.
Here is how you do it for this example:
. gen w0 = total - cases /* w0 = counts of controls */
. rename cases w1 /* w1 = counts of cases */
. gen id = _n /* reshape needs a group id variable */
. reshape long w, i(id) j(y)
The categories (i.e., the suffixes of w) will appear in the variable y. The frequency weights will be given in the new variable w.
Then one can do
. logistic y x1 x2 [fw=w]
. mlogit y <covariates> [fw=w]
etc.
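For readers who want to try the whole sequence, the steps above can be collected into a short do-file. This is only a sketch: it types in the four example rows with input (an assumption about how the data are entered; your data will normally already be in memory), and it uses logit rather than logistic so that the coefficients can be compared directly with the blogit output.

```stata
* Enter the grouped example data (4 covariate patterns)
clear
input cases total x1 x2
23 123 0 0
12 234 0 1
56 248 1 0
81 390 1 1
end

* Fit on the original grouped layout
blogit cases total x1 x2

* Convert to long form: one row per covariate pattern and outcome
generate w0 = total - cases    // w0 = counts of controls
rename cases w1                // w1 = counts of cases
generate id = _n               // reshape needs a group id variable
reshape long w, i(id) j(y)

* Sanity checks: 8 rows, and the weights add up to the grouped totals
assert _N == 8
quietly summarize w
assert r(sum) == 123 + 234 + 248 + 390

* Fit on the long layout with frequency weights
logit y x1 x2 [fweight=w]
```

The two assert lines are optional; they simply stop the do-file if the reshape did not produce the expected eight weighted rows.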