Statalist The Stata Listserver


st: FW: RE: RE: Merge two long datasets? and re: stopping loops

From   "White, Justin" <>
To   <>
Subject   st: FW: RE: RE: Merge two long datasets? and re: stopping loops
Date   Thu, 24 Aug 2006 11:39:19 -0400

Nick, I apologize if I was harsh in my response.  

Justin White

-----Original Message-----
From: White, Justin 
Sent: Thursday, August 24, 2006 11:15 AM
To: ''
Subject: RE: RE: RE: Merge two long datasets? and re: stopping loops

The capitalization happened in the email (auto correct).  I did not
intentionally capitalize the commands.  Isn't this a bit petty?

The advice I provided is how I would personally do it.  Everyone's style
is different.  There may be hundreds of ways to get the same thing done.


-----Original Message-----
[] On Behalf Of Nick Cox
Sent: Thursday, August 24, 2006 11:08 AM
Subject: st: RE: RE: Merge two long datasets? and re: stopping loops

The main advice here is correct. 

However, the advice on creating 
a new variable suggests an unnecessary change to 
your data. There is absolutely no need to apply 
-tostring- (which normally would need to be reversed
for much subsequent analysis). -tostring- should be applied
if and only if there is some numeric variable that 
really should be string for ever more. 

egen group = group(momid year) 

would be a better method of creating a single identifier 
if you needed it -- but I don't think you do. 
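
As a minimal sketch of merging directly on both keys, something along
these lines should work. Only momid and year come from this thread; the
file name income_long and the variables income and kids are made up for
illustration. The sketch uses the merge syntax of the Stata 9 era, which
needs both datasets sorted by the key variables. From Stata 11 the
preferred form is -merge 1:1 momid year using income_long-, which does
not require sorting beforehand.

```stata
* Hypothetical example data: file and variable names other than
* momid and year are assumptions made for illustration only.
clear
input momid year income
1 2001 10
1 2002 12
2 2001 20
2 2002 21
end
sort momid year
save income_long, replace

clear
input momid year kids
1 2001 1
1 2002 2
2 2001 0
2 2002 0
end
sort momid year

* Old-style merge on two key variables; both datasets are already
* sorted by momid year. No combined string identifier is needed.
merge momid year using income_long

* _merge reports where each observation came from (3 = both files)
tabulate _merge
```

Both files stay in long form throughout, so there is no need to reshape
to wide and back. In this toy example every momid-year pair appears in
both files, so all observations should have _merge equal to 3.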

In addition, I don't think it helps anyone if commands 
with names like 


are capitalised. Finally, for "STATA" read "Stata". 


White, Justin
> Yes.  You can merge using two separate variables.  Look in the STATA
> command help under merge.  There is an example where multiple variables
> are referred to.  You must remember that your data sets must be sorted
> by the variables you use to merge.  For instance, if you want to merge
> using momid and year you must make sure your two data sets are sorted by
> momid and year:
> Sort momid year
> Also, you can consider creating a new variable.  For instance:
> Assuming the momid and year variables are string variables
> Gen str prim_key = momid+year
> If they are not strings, you must convert them to a string:
> Tostring momid year, replace
> Gen str prim_key = momid+year
Claire M. Kamp Dush
> Thanks Scott for the tip.  I did make a mistake in copying my data, so
> thanks for pointing that out.  I have one follow-up and one new
> question:
> First, there is the continue command for breaking out of loops.  I just
> found it in the Stata 9 Programming manual.  So, anyone who is trying to
> figure that out might want to check out that manual [P] under continue.
> I wish I had found it earlier.
> Second, before I found that command, I did as you advised, and managed
> to merge the data together beautifully.  However, this poses another
> question:
> Is it possible to merge on two variables?  That is, can I merge two
> datafiles by momid AND by year at the same time?  Or, is it always
> necessary to convert both datasets back to wide form, then merge, then
> reconvert the new dataset to long?  This is what I did.  I have done
> some digging to try to figure out how to merge long datasets, and I have
> always come up short.
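
On the -continue- point in the quoted message, here is a minimal sketch
of stopping a loop early. The loop bounds and the stopping condition are
invented for illustration and are not from the original question;
-continue, break- is the form documented in [P] continue.

```stata
* Hypothetical loop: leave the loop the first time i squared exceeds 50
forvalues i = 1/20 {
    if `i'^2 > 50 {
        display "stopping at i = `i'"
        * exit the enclosing loop entirely
        continue, break
    }
    display "i = `i', i squared = " `i'^2
}
```

Without the break option, -continue- skips the rest of the current
iteration and moves on to the next one instead of leaving the loop.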

*   For searches and help try:
