Statalist



Re: st: RE: Check if variables have same value


From   "Friedrich Huebler" <fhuebler@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: Check if variables have same value
Date   Sun, 29 Jul 2007 14:14:48 -0400

Thank you, Nick.

Friedrich

On 7/29/07, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> On strings: this is an FAQ.
>
> FAQ     . . . . . . . . .  Counting distinct strings across a set of variables
>         7/04    How do I count the number of distinct strings
>                 across a set of variables?
>                 http://www.stata.com/support/faqs/data/distinctstrings.html
>
> There is a typo in the very last line of code.
>
> if `v'[`i'] != . & trim(`v'[`i']) != ""
>
> should be
>
> if `v'[`i'] != "." & trim(`v'[`i']) != ""
>
> Nick
> n.j.cox@durham.ac.uk
>
> Friedrich Huebler
>
> > Sorry, I should have been more precise. I would like to tag individual
> > observations if certain variables do not contain the same values for
> > that particular observation.
> >
> > The purpose is error checking in household survey data. Assume every
> > woman is asked about her age and every man is asked about his wife's
> > age. The information is stored in separate files. When the files are
> > merged, every woman has one age (if she is not married) or two ages. I
> > would like to identify cases where the ages are not the same.
> >
> > -egen, rowmin()- and -egen, rowmax()- work for numeric variables like
> > age but I hope there is a solution that also works with strings.
>
> Nick Cox
>
> > > Tagging in what sense?
> > >
> > > How do you tell which soldiers are out of step?
> > > Majority vote? How do you split a 50:50
> > > agreement? Three variables say "Stata" and three
> > > say "SAS"? (No, that's an easy one to identify
> > > which are incorrect.)
> > >
> > > (You didn't mention strings; I guess you don't
> > > care about strings.)
> > >
> > > [...]
> > >
> > > Friedrich Huebler
> > >
> > > > I would like to compare a set of variables and tag those that do not
> > > > contain the same values. Missing values should be ignored. -egen
> > > > newvar = diff(varlist)- is not an option because it does not skip
> > > > missing values. The last command in the example below works but it
> > > > becomes impractical with a longer list of variables.
> > > >
> > > > . sysuse auto
> > > > . gen mpg2 = mpg if foreign==0
> > > > . gen mpg3 = mpg if foreign==1
> > > > . replace mpg3 = mpg+1 if rep78==2
> > > > . egen tag = diff(mpg mpg2 mpg3)
> > > > . gen tag2 = (mpg!=mpg2 & mpg<. & mpg2<. | mpg!=mpg3 & mpg<. & mpg3<. | mpg2!=mpg3 & mpg2<. & mpg3<.)
> > > >
> > > > The -egen- command tags all observations, the -gen- command only those
> > > > that I expect to be tagged. Are there better solutions that can also
> > > > be used with ten or more variables?
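
The two approaches discussed in this thread can be sketched as follows. This is a minimal illustration, not code posted by either author. The numeric part uses -egen, rowmin()- and -egen, rowmax()-, which ignore missing values, so a row is tagged only when its nonmissing values disagree. The string part is an assumed loop in the spirit of the FAQ, not the FAQ's own code: it treats empty or blank strings as missing and compares every other value with the first nonblank one. The variables s1, s2, s3, ref, mpgmin, mpgmax, tag3, and tagstr are hypothetical names introduced for the example.

```stata
* Numeric variables: same setup as in the original question
sysuse auto, clear
gen mpg2 = mpg if foreign==0
gen mpg3 = mpg if foreign==1
replace mpg3 = mpg+1 if rep78==2

* rowmin() and rowmax() skip missing values, so they differ
* only when at least two nonmissing values disagree
egen mpgmin = rowmin(mpg mpg2 mpg3)
egen mpgmax = rowmax(mpg mpg2 mpg3)
gen byte tag3 = mpgmin != mpgmax

* String variables: hypothetical test data built from -make-
gen s1 = make
gen s2 = make if foreign==1
gen s3 = cond(rep78==2, "other", "")

* ref holds the first nonblank value in each observation;
* any later nonblank value that differs from ref sets the tag
gen byte tagstr = 0
gen ref = ""
foreach v of varlist s1 s2 s3 {
    replace ref = trim(`v') if ref == "" & trim(`v') != ""
    replace tagstr = 1 if trim(`v') != "" & trim(`v') != ref
}
drop ref
```

Both parts extend to ten or more variables by lengthening the variable lists. Observations in which every variable is missing are left untagged.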