Statalist The Stata Listserver



RE: st: combining two variables


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: combining two variables
Date   Wed, 7 Mar 2007 21:37:39 -0000

Austin Nichols asked a similar question
privately, and his public answer
(which I did not read until I had posted mine)
also raised warnings about various
aspects of the problem, just as you rightly do.

I wrote that line of code knowing that var1 and var2 might be 
both not missing, but different; or both not missing, 
and the same. I don't think Nora specified what she wanted for those
situations. 

In the absence of rules, missing (don't 
know) is an appropriate answer. This is Stata's 
attitude, and that was mine here. 

A more complete answer might be 

gen newvar = ///
	cond(missing(var1), var2, ///
	cond(missing(var2), var1, ///
	cond(var1 == var2,  var1, ///
	. )))

That way missing is returned if  
both variables are missing or 
if both variables are non-missing but different. 
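
As a rough check, here is a sketch that types in the example data from
the original question plus two extra rows (5, 5 and 4, 6; these are
hypothetical, added only for illustration, and are not in the posted data)
to exercise the cases where both variables are non-missing:

clear
input var1 var2
. 2
. 8
. 0
7 .
3 .
. .
5 5
4 6
end
* same nested cond() as above, written on one line
gen newvar = cond(missing(var1), var2, cond(missing(var2), var1, cond(var1 == var2, var1, .)))
list

newvar should come out as 2, 8, 0, 7, 3, ., 5, . in that order: the
agreeing pair keeps its common value and the disagreeing pair is set to
missing.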

Nick 
n.j.cox@durham.ac.uk 

Sergiy Radyakin
 
> Why would one need "if missing(var1, var2)" ?
> 
> gen newvar=max(var1,var2)
> 
> produces the same result, unless:
> 
> Ambiguity exists, since Paswel Phiri Marenya wanted to "combine the
> variables in such a way that the data from one variable can replace the
> missing values from the other". It is as above if the situation of both
> non-missings is impossible (say, by the way the variables are created). If,
> however, it is possible, then NGC's solution will yield new missings, which
> contradicts "data ... replace the missings". So because of the way the
> question is formulated, and because it's not known if both variables may
> have non-missings, you may wish to consider this code:
> 
> gen newvar=var1
> replace newvar=var2 if var1==.

Nick Cox
 
> > Some very complicated solutions here!
> >
> > Consider:
> >
> > gen newvar = max(var1, var2) if missing(var1, var2)
> >
> > Logic:
> >
> > If just one of var1 or var2 is missing,
> > then -missing()- will return true.
> >
> > In that case, max(var1, var2) will
> > return the non-missing value in question.
> >
> > If both are missing, then you get missing returned,
> > but that is fair enough.
> >
> > If neither is missing, missing is returned.
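> >
> > A quick way to see this, as a sketch using literal values rather
> > than the variables in the question:
> >
> > display max(., 2)
> > display max(., .)
> > display missing(., 2)
> > display missing(3, 2)
> >
> > These should show 2, ., 1 and 0 respectively: -max()- ignores
> > missing arguments unless all are missing, and -missing()- is
> > true if any argument is missing.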

> > Paswel Phiri Marenya
> >
> >> What to me is easier is something like:
> >>
> >> gen var3 = var2
> >> then replace var3=var1 in 4/5 and so on...although with a long data set
> >> it may be tedious perhaps.
> >> regards
> >> PPm
> >>
> >> > Thank you to everyone who answered my last question on
> >> > creating a variable corresponding to the row number.
> >> >
> >> > Now I have a question about combining the data from two
> >> > variables (in the same data set) into one variable. I want to
> >> > combine the variables in such a way that the data from one
> >> > variable can replace the missing values from the other. I have
> >> > created an example of what I am looking for below:
> >> >
> >> >
> >> >            Have:                   Want:
> >> >       var1       var2              newvar
> >> >         .          2                 2
> >> >         .          8                 8
> >> >         .          0                 0
> >> >         7          .                 7
> >> >         3          .                 3
> >> >         .          .                 .
> >> >
> >> > Any thoughts on how to do this?

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


