Statalist The Stata Listserver



st: Merge two long datasets? and re: stopping loops


From   "Claire M. Kamp Dush" <[email protected]>
To   [email protected]
Subject   st: Merge two long datasets? and re: stopping loops
Date   Thu, 24 Aug 2006 10:42:36 -0400

Thanks Scott for the tip. I did make a mistake in copying my data, so thanks for pointing that out. I have one follow-up and one new question:

First, there is the continue command for breaking out of loops. I just found it in the Stata 9 Programming manual. So, anyone who is trying to figure that out might want to check out that manual [P] under continue. I wish I had found it earlier.
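
A minimal sketch of how -continue- behaves, using a made-up loop rather than the NLSY code discussed in this thread: -continue, break- exits the loop entirely, while -continue- with no option only skips to the next iteration.

```stata
* Illustrative loop only; the threshold of 50 is arbitrary.
forvalues i = 1/10 {
    if `i'^2 > 50 {
        * Leave the loop as soon as the square of i exceeds 50.
        display "stopping at i = `i'"
        continue, break
    }
    display "i = `i', i squared = " `i'^2
}
```

Here the loop stops at the first i whose square exceeds 50, so the later iterations never run.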

Second, before I found that command, I did as you advised, and managed to merge the data together beautifully. However, this poses another question:

Is it possible to merge on two variables? That is, can I merge two data files by momid AND by year at the same time? Or is it always necessary to convert both datasets back to wide form, then merge, then reconvert the new dataset to long? This is what I did. I have done some digging to try to figure out how to merge long datasets, and I have always come up short.

Claire



At 03:27 PM 8/23/2006, [email protected] wrote:

I didn't read through all your code, but perhaps -merge- can
accomplish your goal. See the example below. Also, it is not clear
how, starting with your initial data set, you get the 1991 Divorced
and 1992 Remarried for momid 2 in your final data set.

Scott


. l , noobs sepby(mom)

  +---------------------------------------------------+
  | momid   year       type1     y1      type2     y2 |
  |---------------------------------------------------|
  |     1   2000     Married   2000                 . |
  |     1   2001                  .                 . |
  |     1   2002   Separated   2001   Divorced   2001 |
  |     1   2003                  .                 . |
  |     1   2004                  .                 . |
  |---------------------------------------------------|
  |     2   1988     Married   1987                 . |
  |     2   1989                  .                 . |
  |     2   1990                  .                 . |
  |     2   1991                  .                 . |
  |     2   1992                  .                 . |
  |     2   1993                  .                 . |
  |     2   1994                  .                 . |
  |     2   1995                  .                 . |
  |     2   1996    Divorced   1993                 . |
  |     2   1997                  .                 . |
  |     2   1998                  .                 . |
  |     2   1999                  .                 . |
  |     2   2000   Remarried   1998                 . |
  +---------------------------------------------------+

. drop year

. rename y1 year

. sort mom year

. merge mom year using "C:\Documents and Settings\scott.merryman\Desktop\foo.dta"
variables momid year do not uniquely identify observations in the master data

. drop if year ==.
(13 observations deleted)

. drop _m

. sort mom year

. order mom year type1 y1 type2 y2

. l, noob sepby(mom)

  +---------------------------------------------------+
  | momid   year       type1     y1      type2     y2 |
  |---------------------------------------------------|
  |     1   2000     Married   2000                 . |
  |     1   2001   Separated      .   Divorced   2001 |
  |     1   2002   Separated   2001   Divorced   2001 |
  |     1   2003                  .                 . |
  |     1   2004                  .                 . |
  |---------------------------------------------------|
  |     2   1987     Married      .                 . |
  |     2   1988     Married   1987                 . |
  |     2   1989                  .                 . |
  |     2   1990                  .                 . |
  |     2   1991                  .                 . |
  |     2   1992                  .                 . |
  |     2   1993    Divorced      .                 . |
  |     2   1994                  .                 . |
  |     2   1995                  .                 . |
  |     2   1996    Divorced   1993                 . |
  |     2   1997                  .                 . |
  |     2   1998   Remarried      .                 . |
  |     2   1999                  .                 . |
  |     2   2000   Remarried   1998                 . |
  +---------------------------------------------------+
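
The log above merges on two key variables at once (momid and year), with no reshaping to wide form. A self-contained sketch of the same idea follows; the income variable, its values, and the tempfile name marital are invented for illustration, and the syntax is the one current at the time of this thread, which expects both files to be sorted on the keys first.

```stata
* Hypothetical long-format file 1: marital changes, one row per momid-year.
clear
input momid year str9 type1
1 2000 "Married"
1 2002 "Separated"
2 1988 "Married"
end
sort momid year
tempfile marital
save `marital'

* Hypothetical long-format file 2: income, also one row per momid-year.
clear
input momid year income
1 2000 21000
1 2002 23500
2 1988 18000
2 1989 18500
end
sort momid year

* List both key variables before -using-; no reshape to wide is needed.
merge momid year using `marital'

* _merge shows which momid-year pairs were found in which file.
tabulate _merge
list, noobs sepby(momid)
```

A value of 3 in _merge marks momid-year pairs found in both files. In Stata 11 and later, the same merge would usually be written as -merge 1:1 momid year using-, followed by the file name.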



----- Original Message -----
From: "Claire M. Kamp Dush" <[email protected]>
Date: Wednesday, August 23, 2006 12:52 pm
Subject: st: programming: stopping loops?
To: [email protected]

> Hello, I feel embarrassed to post this because I am sure the answer is obvious, but I have been puzzling over this issue for a few hours. I am trying to recode the family structure data in the NLSY 79 through 2004. I am trying to go back and recode the data for missing years based on reports of marital changes between interviews at follow-ups. For instance, if an individual was interviewed in 1991 and not in 1992, in 1993 they are asked to report up to 3 marital changes since the last time they were interviewed. My data are stacked, with each individual having 26 lines of data, for years 1979 through 2004. The id variable is momid and the year variable is year. change1type, change2type, and change3type are measured each year where the respondent has data, and each is a categorical variable with categories including married, divorced, separated, widowed, etc. changey1_, changey2_, and changey3_ are the years in which each change is said to occur. Here is an example of what the data look like:
>
> momid   year   change1type   changey1_   change2type   changey2_
> 1       2000   Married       2000
> 1       2001
> 1       2002   Separated     2001        Divorced      2001
> 1       2003
> 1       2004
> 2       1988   Married       1987
> 2       1989
> 2       1990
> 2       1991
> 2       1992
> 2       1993
> 2       1994
> 2       1995
> 2       1996   Divorced      1993
> 2       1997
> 2       1998
> 2       1999
> 2       2000   Remarried     1998
>
> My goal is to have my data look like the following:
>
> momid   year   change1type   changey1_   change2type   changey2_   change1misstype   change2misstype
> 1       2000   Married       2000                                  Married
> 1       2001                                                       Separated         Divorced
> 1       2002   Separated     2001        Divorced      2001
> 1       2003
> 1       2004
> 2       1987                                                       Married
> 2       1988   Married       1987
> 2       1989
> 2       1990
> 2       1991   Divorced      1991                                  Divorced          Remarried
> 2       1992   Remarried     1991
> 2       1993                                                       Divorced
> 2       1994
> 2       1995
> 2       1996   Divorced      1993
> 2       1997
> 2       1998                                                       Remarried
> 2       1999
> 2       2000   Remarried     1998
>


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
Claire M. Kamp Dush, Ph.D.
Postdoctoral Fellow, Evolving Family Theme Project
Cornell University
Bronfenbrenner Life Course Center
Bebee Hall
Ithaca, NY  14853
607-255-9908
http://www.socialsciences.cornell.edu/0407/evolv_fam_desc.html



