Statalist The Stata Listserver



st: RE: RE: RE: Merge two long datasets? and re: stopping loops


From   "White, Justin" <JWhite@yesvirginia.org>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: RE: Merge two long datasets? and re: stopping loops
Date   Thu, 24 Aug 2006 11:14:57 -0400

The capitalization happened in the email (autocorrect).  I did not
intentionally capitalize the commands.  Isn't this a bit petty?

The advice I provided is how I would personally do it.  Everyone's style
is different.  There may be hundreds of ways to get the same thing
accomplished.

Justin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: Thursday, August 24, 2006 11:08 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: RE: Merge two long datasets? and re: stopping loops

The main advice here is correct. 

However, the advice on creating 
a new variable suggests an unnecessary change to 
your data. There is absolutely no need to apply 
-tostring- (which normally would need to be reversed
for much subsequent analysis). -tostring- should be applied
if and only if there is some numeric variable that 
really should be string for ever more. 

egen group = group(momid year) 

would be a better method of creating a single identifier 
if you needed it -- but I don't think you do. 
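
As a minimal sketch of the -egen, group()- approach above, assuming a
small made-up dataset (the values below are hypothetical and only the
variable names momid and year come from this thread), each distinct
combination of momid and year receives one integer identifier and the
original variables stay numeric:

```stata
* Hypothetical example data; only momid and year come from the thread
clear
input momid year
1 2000
1 2001
2 2000
2 2001
end

* One integer identifier per distinct (momid, year) combination
egen group = group(momid year)

* Show the identifier next to the variables it was built from
list momid year group, sepby(momid)
```

Nothing needs to be reversed afterwards, which is the point of
preferring this over -tostring-.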

In addition, I don't think it helps anyone if commands 
with names like 

sort 
tostring 
gen 

are capitalised. Finally, for "STATA" read "Stata". 

Nick 
n.j.cox@durham.ac.uk 

White, Justin
 
> Yes.  You can merge using two separate variables.  Look in the STATA
> command help under merge.  There is an example where multiple variables
> are referred to.  You must remember that the variables you use to merge
> must be how your data set is sorted.  For instance, if you want to merge
> using momid and year you must make sure your two data sets are sorted by
> momid and year:
> 
> Sort momid year
> 
> Also, you can consider creating a new variable.  For instance:
> 
> Assuming the momid and year variables are string variables
> Gen str prim_key = momid+year
> 
> If they are not strings, you must convert them to a string:
> Tostring momid year, replace
> Gen str prim_key = momid+year
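
A minimal sketch of the two-variable merge described above, under some
assumptions: the file name first_long, the variables income and kids,
and all data values are hypothetical. It uses the merge syntax of the
Stata 9 era, in which both datasets must be sorted by the match
variables. In Stata 11 and later the same match is usually written
-merge 1:1 momid year using first_long-.

```stata
* First long dataset (hypothetical values), sorted and saved
clear
input momid year income
1 2000 10
1 2001 12
2 2000 20
2 2001 22
end
sort momid year
save first_long, replace

* Second long dataset (hypothetical values), sorted the same way
clear
input momid year kids
1 2000 1
1 2001 2
2 2000 0
2 2001 1
end
sort momid year

* Match on momid AND year at the same time; no reshape to wide is needed
merge momid year using first_long

* _merge records where each observation came from (3 = found in both)
tabulate _merge
```

Note that the commands are typed in lower case, as Stata is case
sensitive.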
 
Claire M. Kamp Dush
 
> Thanks Scott for the tip.  I did make a mistake in copying my data, so
> thanks for pointing that out.  I have one follow-up and one new
> question:
>
> First, there is the continue command for breaking out of loops.  I just
> found it in the Stata 9 Programming manual.  So, anyone who is trying to
> figure that out might want to check out that manual [P] under continue.
> I wish I had found it earlier.
>
> Second, before I found that command, I did as you advised, and managed
> to merge the data together beautifully.  However, this poses another
> question:
>
> Is it possible to merge on two variables?  That is, can I merge two
> datafiles by momid AND by year at the same time?  Or, is it always
> necessary to convert both datasets back to wide form, then merge, then
> reconvert the new dataset to long?  This is what I did.  I have done
> some digging to try to figure out how to merge long datasets, and I have
> always come up short.
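
On the -continue- command mentioned above for breaking out of loops, a
minimal sketch follows. The loop bounds and the stopping condition are
hypothetical and chosen only for illustration. With the break option,
-continue- leaves the loop entirely; without it, it skips to the next
iteration.

```stata
* Track the last iteration that ran to completion
local done = 0

forvalues i = 1/10 {
    * Hypothetical stopping rule: leave the loop once i exceeds 3
    if `i' > 3 {
        continue, break
    }
    display "iteration `i'"
    local done = `i'
}

display "last completed iteration: `done'"
```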

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




© Copyright 1996–2014 StataCorp LP