Statalist The Stata Listserver



st: RE: RE: RE: RE: Merge two long datasets? and re: stopping loops


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: RE: RE: Merge two long datasets? and re: stopping loops
Date   Thu, 24 Aug 2006 16:43:28 +0100

Sorry, but I can't infer your intentions separately from what you 
write. Evidently one message is that your email set-up 
mangles your presentation of Stata code, and I still 
think it was fair comment that that helps nobody. 

The larger issue here is over your use of -tostring-. 

As the original author of -tostring- I do claim some insight 
into what it was intended for and what it was not intended for. 

Regardless of that, I pointed out that (a) users do not need
to change their data to do what is being discussed here (b) 
if they do that they will probably need to change it back. 
Thus, you are recommending code that will have unnecessary side-effects. 
As new users are always joining the list I think it's fair
comment to point that out, as is utterly standard for any 
technical list worthy of the name, especially as there is 
a simple alternative. 

As other output of mine indicates, I think programming style
is important, but I don't think it's a defence here. 

Nick 
n.j.cox@durham.ac.uk 
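
As a sketch of the simple alternative discussed in this thread: the key variables can be named together in -merge-, so nothing in the data needs converting. The example is illustrative only. The toy values, the variables x and y, and the tempfile name are assumptions, not taken from the thread. It uses the Stata 9-era syntax, which requires both datasets to be sorted on the keys; in Stata 11 and later the equivalent would be -merge 1:1 momid year using ...-. The final line shows -egen, group()- in case a single identifier is wanted afterwards.

```stata
* Hypothetical second dataset, in long form, keyed on momid and year
clear
input momid year y
1 2000 5
1 2001 6
2 2000 7
end
sort momid year
tempfile second
save "`second'"

* Hypothetical first dataset, same keys
clear
input momid year x
1 2000 10
1 2001 11
2 2000 20
end
sort momid year

* Old-style (Stata 9) merge on two key variables at once;
* momid and year stay numeric throughout
merge momid year using "`second'"
tabulate _merge

* Optional: one numeric identifier per momid-year pair
egen group = group(momid year)
list momid year x y group
```

Because the keys are never converted to string, there is nothing to reverse before further analysis.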

White, Justin
> 
> The capitalization happened in the email (auto correct).  I did not
> intentionally capitalize the commands.  Isn't this a bit petty?
> 
> The advice I provided is how I would personally do it.  Everyone's style
> is different.  There may be hundreds of ways to get the same thing
> accomplished.

Nick Cox
 
> The main advice here is correct. 
> 
> However, the advice on creating 
> a new variable suggests an unnecessary change to 
> your data. There is absolutely no need to apply 
> -tostring- (which normally would need to be reversed
> for much subsequent analysis). -tostring- should be applied
> if and only if there is some numeric variable that 
> really should be string for ever more. 
> 
> egen group = group(momid year) 
> 
> would be a better method of creating a single identifier 
> if you needed it -- but I don't think you do. 
> 
> In addition, I don't think it helps anyone if commands 
> with names like 
> 
> sort 
> tostring 
> gen 
> 
> are capitalised. Finally, for "STATA" read "Stata". 
> 
> Nick 
> n.j.cox@durham.ac.uk 
> 
> White, Justin
>  
> > Yes.  You can merge using two separate variables.  Look in the STATA
> > command help under merge.  There is an example where multiple variables
> > are referred to.  You must remember that the variables you use to merge
> > must be how your data set is sorted.  For instance, if you want to merge
> > using momid and year you must make sure your two data sets are sorted by
> > momid and year:
> > 
> > Sort momid year
> > 
> > Also, you can consider creating a new variable.  For instance:
> > 
> > Assuming the momid and year variables are string variables
> > Gen str prim_key = momid+year
> > 
> > If they are not strings, you must convert them to a string:
> > Tostring momid year, replace
> > Gen str prim_key = momid+year
>  
> Claire M. Kamp Dush
>  
> > Thanks Scott for the tip.  I did make a mistake in copying my data, so
> > thanks for pointing that out.  I have one follow-up and one new
> > question:
> > 
> > First, there is the continue command for breaking out of loops.  I just
> > found it in the Stata 9 Programming manual.  So, anyone who is trying to
> > figure that out might want to check out that manual [P] under continue.
> > I wish I had found it earlier.
> > 
> > Second, before I found that command, I did as you advised, and managed
> > to merge the data together beautifully.  However, this poses another
> > question:
> > 
> > Is it possible to merge on two variables?  That is, can I merge two
> > datafiles by momid AND by year at the same time?  Or, is it always
> > necessary to convert both datasets back to wide form, then merge, then
> > reconvert the new dataset to long?  This is what I did.  I have done
> > some digging to try to figure out how to merge long datasets, and I have
> > always come up short.
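
For anyone looking for the loop-stopping command mentioned in the message above, here is a minimal sketch of -continue, break- inside -forvalues-. The loop bounds, the stopping condition, and the local macro name done are made up for illustration.

```stata
* Count how many iterations finish before the loop is stopped
local done = 0
forvalues i = 1/10 {
    if `i' > 3 {
        * leave the loop entirely; without the break option,
        * -continue- would only skip to the next iteration
        continue, break
    }
    display "iteration `i'"
    local done = `done' + 1
}
display "completed `done' iterations"
```

The same command works inside -foreach- and -while- loops.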



