The Stata listserver

st: RE: Winning and losing

From   "Nick Cox" <>
To   <>
Subject   st: RE: Winning and losing
Date   Fri, 21 Nov 2003 11:56:53 -0000

Clive Nicholas wrote:

> I'm looking to dummy-code (0/1) which party won the ith seat in the
> jth election. Since this is Blighty, there can only be one winner
> per district, but since that n=3452, that's an awful lot of outcomes
> to code manually! There are five outcome categories: conwin; labwin;
> ldmwin; natwin; and othwin.
> Now here's the rub: since it's plurality-rule, I need to tell Stata
> to code, say, conwin=1 and labwin-othwin=0 if, say, for district X:
> conpc=35; labpc=31; ldmpc=16; natpc=17; othpc=1. I've tried several
> generates, such as:
> -g conwin=0 if conpc > labpc & ldmpc & natpc & othpc-
> -replace conwin=1 if conpc < labpc & ldmpc & natpc & othpc-,
> but, although Stata did not return errors at *any* of my 'solutions',
> each kept producing multiple, rather than unique, 1's for each case
> (or n).
> Any ideas as to where I'm going wrong?

I'm going to ignore the possibility of ties for first place. 

Suppose, contrary to fact, that there were just two 
parties. Then -conwin- would be 1 if the 
Conservatives won and 0 if Labour won, i.e. 

gen conwin = conpc > labpc 

or, more long-windedly, 

gen conwin = 1 if conpc > labpc 
replace conwin = 0 if conpc < labpc 

which has 0 and 1 reversed from what you have. 

When you bring in other parties, note that your extra 

& ldmpc & natpc & othpc

are read by Stata as 

& (ldmpc != 0) & (natpc != 0) & (othpc != 0) 

which is not what you want. Perhaps you are 
guessing that Stata will interpret 

& ldmpc & natpc & othpc 

as if it meant 

& (conpc > ldmpc) & (conpc > natpc) & (conpc > othpc)

but that's not the way Stata works. 

So, in short, you went wrong (1) because 0 and 1 are the wrong 
way round and (2) you're misinterpreting how 
compound conditions are handled. 
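
For concreteness, here is a minimal sketch of the conditions spelled out in full. District 1 uses the figures quoted for district X; district 2 is invented purely for illustration, as is the variable name -district-. Ties for first are still ignored: under strict inequalities a tied district would get 0 for every party.

```stata
* Toy data: district 1 is district X from the question,
* district 2 is made up for illustration
clear
input district conpc labpc ldmpc natpc othpc
1 35 31 16 17 1
2 20 45 25  8 2
end

* Every comparison must name both sides explicitly
gen conwin = (conpc > labpc) & (conpc > ldmpc) & (conpc > natpc) & (conpc > othpc)
gen labwin = (labpc > conpc) & (labpc > ldmpc) & (labpc > natpc) & (labpc > othpc)

list district conwin labwin
```

With these numbers -conwin- is 1 in district 1 and 0 in district 2, and -labwin- is the reverse. The same pattern would have to be repeated for the other three parties, which is part of the reason for looking at a more general approach.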

There's some context at

Now Matt Dobra has already given another solution. 
Here's another, which is not better, but nevertheless 
shows a Stataish approach useful in many other problems. 

First, map from your names to others 

foreach p in con lab ldm nat oth { 
	rename `p'pc pc`p' 
}

Second, -reshape- to long 

reshape long pc, i(district) j(party) string 


Third, 

bysort district (pc) : gen win = _n == _N 

generates your -win- variable collectively. 

This works as follows: 

bysort district (pc) : 

sorts the winning party to the end 
of each block of observations, 
and in that context 

	gen win = _n == _N 

puts win = 1 in the last observation and win = 0 in 
the others in each block. 
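
A small sketch may make the mechanics easier to see. It assumes data already in long form, one observation per party per district, with the shares quoted for district X; the variable names follow those used above.

```stata
* Long-form toy data for a single district
clear
input district str3 party pc
1 "con" 35
1 "lab" 31
1 "ldm" 16
1 "nat" 17
1 "oth"  1
end

* Within each district, sort on pc so the largest share comes last,
* then flag the last observation (_n == _N) as the winner
bysort district (pc) : gen win = _n == _N

list, sepby(district)
```

One caveat, assuming your data may contain missing shares: Stata sorts numeric missing values after all nonmissing values, so a party with missing -pc- would end up last in its block and be flagged as the winner. Check for missings before relying on this.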

Fourth, -reshape- back: 

reshape wide pc win, i(district) j(party) string 

Fifth, if you prefer your names, map backwards: 

foreach p in con lab ldm nat oth { 
	rename pc`p' `p'pc
	rename win`p' `p'win  
}

So, given appropriate variable names, the code 
boils down to 

reshape long pc , i(district) j(party) string
bysort district (pc) : gen win = _n == _N 
reshape wide pc win, i(district) j(party) string
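
Put together, the whole round trip might look like the following sketch. The two districts are toy data (the first is district X from the question, the second is invented), and the original variable names are assumed to be as described in the question.

```stata
* Toy wide data using the original names
clear
input district conpc labpc ldmpc natpc othpc
1 35 31 16 17 1
2 20 45 25  8 2
end

* Map names so that the party becomes a suffix
foreach p in con lab ldm nat oth {
	rename `p'pc pc`p'
}

* Long, flag the winner, back to wide
reshape long pc, i(district) j(party) string
bysort district (pc) : gen win = _n == _N
reshape wide pc win, i(district) j(party) string

* Map back to the original naming scheme
foreach p in con lab ldm nat oth {
	rename pc`p' `p'pc
	rename win`p' `p'win
}

list district conwin labwin ldmwin natwin othwin
```

Here -conwin- should be 1 only in district 1 and -labwin- 1 only in district 2, with the other three indicators 0 throughout.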

If you had a copy, [R] reshape would be 
a place to look. As it is, there are still 
examples you can look at in the on-line help 
and at
Some examples are very close to this problem. 

There was a tutorial on -by:- in Stata Journal 2(1) 
86-102 (2002). 


P.S. I've supposed in all this that the data 
concern a single election. If there were 
several, then I think something like this would 
work (assuming an extra variable -year-): 

reshape long pc , i(district year) j(party) string
bysort district year (pc) : gen win = _n == _N 
reshape wide pc win, i(district year) j(party) string
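
Ties for first were set aside at the outset. If you want to check whether any exist before trusting -win-, one possible sketch, done while the data are still long, is below. The data are invented for illustration, and -tied- is a hypothetical variable name.

```stata
* Toy wide data for two elections; district 2 in 1997 has a tie
clear
input district year pccon pclab pcldm pcnat pcoth
1 1997 35 31 16 17 1
1 2001 30 40 20  9 1
2 1997 40 40 15  4 1
end

reshape long pc, i(district year) j(party) string
bysort district year (pc) : gen win = _n == _N

* After sorting on pc, a tie for first means the last two
* shares in the block are equal
by district year : gen tied = pc[_N] == pc[_N - 1]

list if tied, sepby(district year)
```

In a tied contest -win- still marks exactly one observation, but which of the tied parties gets it is arbitrary, so such contests need a decision by hand.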

The -reshape- brings a real bonus whenever the "obvious"
wide data structure turns out to be awkward for some 
manipulation (although it can be avoided, as in Matt 
Dobra's solution, in some cases by using -egen-). 