# st: Creating dummy variables

From: Michael Betz
To: "statalist@hsphsun2.harvard.edu"
Subject: st: Creating dummy variables
Date: Wed, 16 Nov 2011 02:44:50 +0000

```Hi all,

I have two categorical variables, "fips1" and "fips2", that each record a US county for the observation. Each "fips1" county is paired with many "fips2" counties, as below:

fips1	fips2
1001	1073
1001	1021
1001	1101
1003	12031
1003	1099

I need to create dummy variables for each county in "fips1" and "fips2" and then create variables representing the difference between the two dummy variables as below:

fips1  fips2  dum1_1  dum1_2  dum2_1  dum2_2  dum2_3  dum2_4  1_1-2_1  1_1-2_2  1_1-2_3
1001   1003   1       0       1       0       0       0       0        1        1
1001   1021   1       0       0       1       0       0       1        0        1
1001   1101   1       0       0       0       1       0       1        1        0
1003   1021   0       1       0       1       0       0       0        0        0
1003   1001   0       1       0       0       0       1       0        0        0

One added constraint is that each of "fips1" and "fips2" creates 3,000 dummies, so Stata cannot hold variables representing the difference between all pairs of dummy variables. I need to calculate the difference in dummies only for the pairs that appear in the data (i.e., according to the example above, I would not need the difference between the dummies for "fips1"=1001 and "fips2"=1001 because that pair doesn't exist in my data).

I've been thinking all day trying to come up with a solution, but to no avail. I would appreciate any help or suggestions.

Thanks,
Mike

```
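
One possible reading of the request, sketched in Stata: for every ("fips1", "fips2") pair that actually occurs in the data, build a single variable equal to the "fips1" dummy minus the "fips2" dummy, without first creating the full sets of dummies. The `pair` variable and the `d_<fips1>_<fips2>` naming scheme are assumptions made for illustration and do not come from the post. The toy data are the first example listing above.

```stata
* Toy data: the first example listing from the post
clear
input long fips1 long fips2
1001 1073
1001 1021
1001 1101
1003 12031
1003 1099
end

* Number the distinct (fips1, fips2) pairs that occur in the data
* ("pair" is a hypothetical helper variable)
egen long pair = group(fips1 fips2)

* For each observed pair (a, b), compute (fips1 == a) - (fips2 == b) directly;
* the two underlying dummies are never stored as separate variables
levelsof pair, local(pairs)
foreach p of local pairs {
    * fips1 and fips2 are constant within a pair, so the mean recovers the codes
    summarize fips1 if pair == `p', meanonly
    local a = r(mean)
    summarize fips2 if pair == `p', meanonly
    local b = r(mean)
    * byte is enough: the difference can only be -1, 0, or 1
    generate byte d_`a'_`b' = (fips1 == `a') - (fips2 == `b')
}

drop pair
```

This creates one variable per observed pair rather than one per possible pair. Whether that count fits in memory still depends on how many distinct pairs the real data contain and on the variable limit of the Stata flavor in use.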