Statalist



st: RE: Check if variables have same value


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Check if variables have same value
Date   Fri, 27 Jul 2007 17:59:07 +0100

Tagging in what sense? 

How do you tell which soldiers are out of step? 
Majority vote? How do you split a 50:50 
agreement? Three variables say "Stata" and three
say "SAS"? (No, in that case it is easy to identify
which are incorrect.) 

(You didn't mention strings; I guess you don't 
care about strings.) 

One way forward is to note that the row maximum
and the row minimum will differ if and only if any 
nonmissing values differ. See -egen, rowmin()- or 
-egen, rowmax()-. 
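
A minimal sketch of the row minimum/maximum idea, reusing the variables from the quoted example below. The names rmin, rmax and differ are made up for illustration and are not from the original post.

```stata
* set up the example data as in the quoted question
sysuse auto, clear
gen mpg2 = mpg if foreign == 0
gen mpg3 = mpg if foreign == 1
replace mpg3 = mpg + 1 if rep78 == 2

* rowmin() and rowmax() ignore missing values within each observation
egen rmin = rowmin(mpg mpg2 mpg3)
egen rmax = rowmax(mpg mpg2 mpg3)

* differ is 1 where at least two nonmissing values disagree, 0 otherwise
* (if every value were missing, rmin and rmax would both be missing and
* compare as equal, so such an observation would not be tagged)
gen byte differ = rmin != rmax
```

The same two -egen- calls work unchanged with a longer varlist, for example a wildcard such as mpg*, which is the point of the approach when there are ten or more variables.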

Another is that the row median will identify 
a majority value whenever one exists: if more than 
half of the nonmissing values agree, the median 
equals that value. See -egenmore- for a row median 
function. See

http://www.stata.com/support/faqs/stat/medians.html

for a fuller story.

Where the median is the midpoint between two 
different values, then contrariwise, every value 
disagrees with it. You might still have a plurality 
somewhere. 
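
A sketch of the row median idea, under the assumption that your Stata's official -egen- includes rowmedian() (more recent versions do; otherwise substitute the row median function from -egenmore-). The names rmed, ndisagree and the off_ prefix are hypothetical.

```stata
* set up the example data as in the quoted question
sysuse auto, clear
gen mpg2 = mpg if foreign == 0
gen mpg3 = mpg if foreign == 1
replace mpg3 = mpg + 1 if rep78 == 2

* row median of the nonmissing values (assumed available in official -egen-)
egen rmed = rowmedian(mpg mpg2 mpg3)

* flag each variable that is nonmissing and differs from the row median,
* and count how many variables are out of step in each observation
gen byte ndisagree = 0
foreach v of varlist mpg mpg2 mpg3 {
    gen byte off_`v' = `v' != rmed & !missing(`v')
    replace ndisagree = ndisagree + off_`v'
}
```

Where only two nonmissing values exist and they differ, the median is their midpoint, so both variables are flagged; that is the no-majority case described above.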

All these functions ignore missings in the way 
you want. 

Yet another approach is to -reshape- to long form, 
so that the values to be compared sit in a single 
variable. Then see also 

http://www.stata.com/support/faqs/data/distinct.html

http://www.stata.com/support/faqs/data/diff.html
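
A sketch of the -reshape- route, counting distinct nonmissing values per original observation. The identifier id, the index which, and the names ndistinct and differ are assumptions for illustration, and mpg is renamed to mpg1 only so that the three variables share a stub.

```stata
* set up the example data as in the quoted question
sysuse auto, clear
gen mpg2 = mpg if foreign == 0
gen mpg3 = mpg if foreign == 1
replace mpg3 = mpg + 1 if rep78 == 2

* one row per (observation, variable) pair
gen long id = _n
rename mpg mpg1
keep id mpg1 mpg2 mpg3
reshape long mpg, i(id) j(which)

* missing values are to be ignored, so drop them before counting
drop if missing(mpg)

* within id, sorted by value: running count of changes in value,
* then spread the final count to every row of that id
bysort id (mpg): gen ndistinct = sum(mpg != mpg[_n-1])
by id: replace ndistinct = ndistinct[_N]

* differ is 1 where an id has more than one distinct nonmissing value
gen byte differ = ndistinct > 1
```

The result lives in the long data set; one way to carry it back is to keep one row per id and -merge- on id into the original data.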

Nick 
n.j.cox@durham.ac.uk 

Friedrich Huebler wrote:
 
> I would like to compare a set of variables and tag those that do not
> contain the same values. Missing values should be ignored. -egen
> newvar = diff(varlist)- is not an option because it does not skip
> missing values. The last command in the example below works but it
> becomes impractical with a longer list of variables.
> 
> . sysuse auto
> . gen mpg2 = mpg if foreign==0
> . gen mpg3 = mpg if foreign==1
> . replace mpg3 = mpg+1 if rep78==2
> . egen tag = diff(mpg mpg2 mpg3)
> . gen tag2 = (mpg!=mpg2 & mpg<. & mpg2<. | mpg!=mpg3 & mpg<. & mpg3<.
> | mpg2!=mpg3 & mpg2<. & mpg3<.)
> 
> The -egen- command tags all observations, the -gen- command only those
> that I expect to be tagged. Are there better solutions that can also
> be used with ten or more variables?



