
From: "Friedrich Huebler" <fhuebler@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Check if variables have same value
Date: Fri, 27 Jul 2007 13:36:25 -0400

Sorry, I should have been more precise. I would like to tag individual observations if certain variables do not contain the same values for that particular observation. The purpose is error checking in household survey data. Assume every woman is asked about her age and every man is asked about his wife's age. The information is stored in separate files. When the files are merged, every woman has one age (if she is not married) or two ages. I would like to identify cases where the ages are not the same. -egen, rowmin()- and -egen, rowmax()- work for numeric variables like age but I hope there is a solution that also works with strings.

Friedrich

On 7/27/07, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> Tagging in what sense?
>
> How do you tell which soldiers are out of step?
> Majority vote? How do you split a 50:50
> agreement? Three variables say "Stata" and three
> say "SAS"? (No, that's an easy one to identify
> which are incorrect.)
>
> (You didn't mention strings; I guess you don't
> care about strings.)
>
> [...]
>
> Friedrich Huebler
>
> > I would like to compare a set of variables and tag those that do not
> > contain the same values. Missing values should be ignored. -egen
> > newvar = diff(varlist)- is not an option because it does not skip
> > missing values. The last command in the example below works but it
> > becomes impractical with a longer list of variables.
> >
> > . sysuse auto
> > . gen mpg2 = mpg if foreign==0
> > . gen mpg3 = mpg if foreign==1
> > . replace mpg3 = mpg+1 if rep78==2
> > . egen tag = diff(mpg mpg2 mpg3)
> > . gen tag2 = (mpg!=mpg2 & mpg<. & mpg2<. | mpg!=mpg3 & mpg<. & mpg3<.
> > | mpg2!=mpg3 & mpg2<. & mpg3<.)
> >
> > The -egen- command tags all observations, the -gen- command only those
> > that I expect to be tagged. Are there better solutions that can also
> > be used with ten or more variables?
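
For illustration, the row-wise comparison mentioned at the top of this message might be sketched as follows. This is only a sketch under stated assumptions, not a solution given in the thread. The numeric part uses -egen, rowmin()- and -egen, rowmax()-, which ignore missing values. The string part is one possible loop that keeps the first non-empty value as a reference. The variables rmin, rmax, tag_num, make2, make3, ref and tag_str are hypothetical names introduced here, and the string test data are built from the auto dataset purely for demonstration.

```stata
* Numeric case: tag observations whose nonmissing values differ.
* rowmin() and rowmax() skip missing values, so the row minimum and
* maximum differ only if at least two nonmissing values disagree.
sysuse auto, clear
generate mpg2 = mpg if foreign == 0
generate mpg3 = mpg if foreign == 1
replace mpg3 = mpg + 1 if rep78 == 2
egen rmin = rowmin(mpg mpg2 mpg3)
egen rmax = rowmax(mpg mpg2 mpg3)
generate byte tag_num = (rmin != rmax)

* String case (hypothetical test data built from the string variable make).
generate make2 = make if foreign == 0
generate make3 = make if foreign == 1
replace make3 = make + "x" if rep78 == 2

* ref holds the first non-empty value seen in each observation;
* any later non-empty value that differs from ref sets the tag.
generate str1 ref = ""
generate byte tag_str = 0
foreach v of varlist make make2 make3 {
    replace tag_str = 1 if `v' != "" & ref != "" & `v' != ref
    replace ref = `v' if ref == ""
}
drop ref
```

Because the loop makes a single pass over the variable list, it should scale to ten or more variables without the pairwise comparisons in the quoted -gen tag2- command. Empty strings are treated as missing here, which is an assumption about how the merged survey files code missing string values.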
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
