
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: Check if variables have same value
Date: Fri, 27 Jul 2007 17:59:07 +0100

Tagging in what sense? How do you tell which soldiers are out of step? Majority vote? How do you split a 50:50 agreement? Three variables say "Stata" and three say "SAS"? (No, that's an easy one to identify which are incorrect.)

(You didn't mention strings; I guess you don't care about strings.)

One way forward is to note that the row maximum and the minimum will be different if any values differ. See -egen, rowmin()- or -egen, rowmax()-.

Another is that the row median will identify a majority in all cases where it exists. See -egenmore- for a row median function. See

http://www.stata.com/support/faqs/stat/medians.html

for a fuller story. In the case where a median is the midpoint between two values, then contrariwise, all values disagree with it. You might still have a plurality somewhere.

All these functions ignore missings in the way you want.

Yet another story is to -reshape-. Then see also

http://www.stata.com/support/faqs/data/distinct.html
http://www.stata.com/support/faqs/data/diff.html

Nick
n.j.cox@durham.ac.uk

Friedrich Huebler

> I would like to compare a set of variables and tag those that do not
> contain the same values. Missing values should be ignored. -egen
> newvar = diff(varlist)- is not an option because it does not skip
> missing values. The last command in the example below works but it
> becomes impractical with a longer list of variables.
>
> . sysuse auto
> . gen mpg2 = mpg if foreign==0
> . gen mpg3 = mpg if foreign==1
> . replace mpg3 = mpg+1 if rep78==2
> . egen tag = diff(mpg mpg2 mpg3)
> . gen tag2 = (mpg!=mpg2 & mpg<. & mpg2<. | mpg!=mpg3 & mpg<. & mpg3<.
> | mpg2!=mpg3 & mpg2<. & mpg3<.)
>
> The -egen- command tags all observations, the -gen- command only those
> that I expect to be tagged. Are there better solutions that can also
> be used with ten or more variables?
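The row minimum/maximum suggestion in the reply can be sketched against the example data from the question. This is only an illustration, not code from either post: the variable names rmin, rmax and tag3 are made up here, and tag2 is rebuilt with explicit parentheses purely for comparison.

```stata
sysuse auto, clear
gen mpg2 = mpg if foreign == 0
gen mpg3 = mpg if foreign == 1
replace mpg3 = mpg + 1 if rep78 == 2

* rowmin() and rowmax() skip missing values, so they differ only
* when at least two nonmissing values in the row disagree
egen rmin = rowmin(mpg mpg2 mpg3)
egen rmax = rowmax(mpg mpg2 mpg3)
gen byte tag3 = rmin != rmax

* the pairwise comparison from the question, for checking
gen byte tag2 = (mpg != mpg2 & mpg < . & mpg2 < .) ///
    | (mpg != mpg3 & mpg < . & mpg3 < .)           ///
    | (mpg2 != mpg3 & mpg2 < . & mpg3 < .)

* list any observations where the two approaches disagree
list make mpg mpg2 mpg3 tag2 tag3 if tag2 != tag3
```

Because rowmin() and rowmax() accept a varlist, the same two -egen- calls should carry over unchanged to ten or more variables. A row in which every variable is missing yields missing for both rmin and rmax, which compare as equal, so such a row is not tagged.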

**Follow-Ups**: **Re: st: RE: Check if variables have same value** *From:* "Friedrich Huebler" <fhuebler@gmail.com>

**References**: **st: Check if variables have same value** *From:* "Friedrich Huebler" <fhuebler@gmail.com>

