
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Re: How do I create a calendar year variable by person id before reshaping to person-year dataset?


From   Steve Samuels <[email protected]>
To   [email protected]
Subject   Re: st: Re: How do I create a calendar year variable by person id before reshaping to person-year dataset?
Date   Thu, 6 Feb 2014 19:28:58 -0500

As you point out, Nick, Holly has not told us what she wants to do. I'm
not sure either, as she's tried reshapes both long and wide. 

I've analyzed many data sets of reproductive histories, and I always use
the long format. With this format, it's easy to create running totals of
different kinds, for example parity, the number of births + stillbirths
at a given time. We also can merge in dates of different risk factor
exposures and so easily assign a prior or recent exposure to each
pregnancy. In fact, we always *collect* the data in long format,
as it shortens the codebook and greatly simplifies the edit process. It
also allows a woman to recall a pregnancy out of order, because one
can re-order by date.

Steve
[email protected]
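
A minimal sketch of the running-total idea Steve describes, assuming a hypothetical long-format file with one row per pregnancy. The variable names (pregyear, outcome, counted, parity) and the values are invented for illustration and do not come from Holly's data.

```stata
* Hypothetical long-format pregnancy file: one row per pregnancy.
* Rows are deliberately entered out of date order.
clear
input id pregyear str12 outcome
1 1964 "birth"
1 1955 "birth"
1 1959 "stillbirth"
2 1998 "birth"
2 1996 "miscarriage"
end

* Re-order by date within woman, so the order of recall does not matter.
sort id pregyear

* Parity: running count of births + stillbirths up to and including each row.
gen byte counted = inlist(outcome, "birth", "stillbirth")
by id: gen parity = sum(counted)

list, sepby(id)
```

Because sum() is a running sum under -by-, each pregnancy carries the parity reached at that point, which is what makes the long format convenient here.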


On Feb 6, 2014, at 6:14 PM, Nick Cox <[email protected]> wrote:

Thanks. That helps. I have already explained why -reshape wide-
doesn't work. -reshape- maps one variable to several, but the "birth"
you feed to it is the stub for several variables.

Otherwise, I don't see why you want or need to -reshape- at all.
Several variables are repeated for each woman and some vary. Depending
on what you want to do, you either reduce the dataset by removing
duplicates or keep the whole dataset.
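
As a sketch of the "removing duplicates" route, assuming the goal is one row per woman per birth and that masterid is the person identifier (both are assumptions; the toy values only mimic the extract quoted below, and the names order and birthyr are invented):

```stata
* Toy data mimicking the extract: constant birth1-birth4 repeated per year.
clear
input masterid birth1 birth2 birth3 birth4 yrdate
43  1998 .    .    .    1997
43  1998 .    .    .    1998
111 1955 1959 1964 1966 1955
111 1955 1959 1964 1966 1956
end

* Reduce to one row per woman, so that i() uniquely identifies observations.
keep masterid birth1-birth4
duplicates drop

* "birth" is the stub; j() names a NEW variable that holds the suffix 1, 2, ...
reshape long birth, i(masterid) j(order)

* Drop birth orders that never occurred.
drop if missing(birth)
rename birth birthyr
list, sepby(masterid)
```

With -reshape long- the j() variable is created, whereas with -reshape wide- it must already exist, which is consistent with the "variable year not found" message reported earlier in the thread.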

Alternatively, if you explain why you (think you) need to -reshape-
that might illuminate what is being misunderstood, or what you need to
do.
Nick
[email protected]
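
For the "keep the whole dataset" route, here is a minimal sketch that needs no -reshape- at all. It assumes the aim is a yearly birth indicator for event history analysis, which Holly has not confirmed; gavebirth and nbirths are invented names, and the toy data use only birth1 and birth2.

```stata
* Toy person-year data: one row per woman per calendar year (yrdate).
clear
input masterid birth1 birth2 yrdate
43  1998 .    1997
43  1998 .    1998
43  1998 .    1999
111 1955 1959 1955
111 1955 1959 1956
end

* Flag the person-years in which any recorded birth occurred.
* Missing birth years never equal yrdate, so they leave the flag at 0.
gen byte gavebirth = 0
forvalues k = 1/2 {
    replace gavebirth = 1 if birth`k' == yrdate
}

* Running number of birth years so far, in calendar order within woman.
bysort masterid (yrdate): gen nbirths = sum(gavebirth)
list, sepby(masterid)
```

With the full data the loop would run over 1/10. Note that two births recorded in the same year count once in gavebirth.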


On 6 February 2014 20:31, Holly E Reed <[email protected]> wrote:
> Hi Nick,
> 
> Here is an extract from the dataset:
> 
> i_weight    masterid  birth1  birth2  birth3  birth4  hhid  id  provage12  urbanage12  evermig  relation              sex  age  cyear1  cyear2  cyear3   yrbirth  yrdate
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1979
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1980
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1981
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1982
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1983
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1984
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1985
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1986
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1987
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1988
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1989
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1990
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1991
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1992
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1993
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1994
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1995
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1996
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1997
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1998
> 37.64478   43           1998  .         .         .          4     17  "E Cape"   "Rural"         1           "Son/Daughter"   "F"  21    1908    1909     1910      1979    1999
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1982
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1983
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1984
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1985
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1986
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1987
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1988
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1989
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1990
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1991
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1992
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1993
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1994
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1995
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1996
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1997
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1998
> 37.64478   54           .        .         .         .          5     23  "W Cape"  "Urban"        0           "Son/Daughter"   "F"  18    1908    1909     1910      1982    1999
> 37.64478   111         1955  1959   1964  1966    11    51   "E Cape"   "Rural"        1            "Head"               "F"  67    1908    1909     1910      1933    1933
> 37.64478   111         1955  1959   1964  1966    11    51   "E Cape"   "Rural"        1            "Head"               "F"  67    1908    1909     1910      1933    1934
> 37.64478   111         1955  1959   1964  1966    11    51   "E Cape"   "Rural"        1            "Head"               "F"  67    1908    1909     1910      1933    1935
> 37.64478   111         1955  1959   1964  1966    11    51   "E Cape"   "Rural"        1            "Head"               "F"  67    1908    1909     1910      1933    1936
> 37.64478   111         1955  1959   1964  1966    11    51   "E Cape"   "Rural"        1            "Head"               "F"  67    1908    1909     1910      1933    1937
> 37.64478   111         1955  1959   1964  1966    11    51   "E Cape"   "Rural"        1            "Head"               "F"  67    1908    1909     1910      1933    1938
> 37.64478   111         1955  1959   1964  1966    11    51   "E Cape"   "Rural"        1            "Head"               "F"  67    1908    1909     1910      1933    1939
> 37.64478   111         1955  1959   1964  1966    11    51   "E Cape"   "Rural"        1            "Head"               "F"  67    1908    1909     1910      1933    1940
> 
> And so on. These are three women: the first is age 21, with one birth in 1998; the second is age 18, with no births; the third is 67 years old, with four births shown, in 1955, 1959, 1964, and 1966 (she actually had seven, but I didn't show all birth* variables here to save space).
> 
> I only listed birth1-birth4, but in fact there are birth1-birth10, which follow the same pattern.  The dataset also contains cyear1-cyear93, which are the same for every person and hold each year 1908-2000.
> 
> In terms of code, I have tried three iterations of coding for the reshape command:
> 
> reshape long birth, i(id) j(year)
> 
> reshape long birth cyear, i(id) j(year)
> 
> AND
> reshape wide birth, i(id) j(year)
> 
> Hoping that this helps to clarify things a bit...any ideas?
> 
> Thanks,
> Holly
> _______________________________________________
> From      Nick Cox <[email protected]>
> To        "[email protected]" <[email protected]>
> Subject   Re: st: Re: How do I create a calendar year variable by person id before reshaping to person-year dataset?
> Date      Thu, 6 Feb 2014 19:30:53 +0000
> Without seeing exactly the kind of data and exactly the kind of code
> that produce problems, it is very hard to comment further. We are not
> asking to see the whole dataset, but enough that is concrete to
> understand your problem.
> 
> If you have variables -birth*- then -reshape wide birth- will
> inevitably fail, but why -reshape long- will fail is unclear.
> 
> Nick
> [email protected]
> ________________________________________
> From: Holly E Reed
> Sent: Thursday, February 06, 2014 1:27 PM
> To: [email protected]
> Subject: Re: How do I create a calendar year variable by person id before reshaping to person-year dataset?
> 
> Hi Ronnie,
> 
> Thanks for your reply.  That is, in fact, exactly what my data look like; of course, some people do not have births, so they have missing values for birth1, birth2, etc. or if they only have one child, they have missing values for all birth variables except birth1.
> 
> The dataset is so large and there are a number of variables in addition to the ones listed, such as weights, region at age 12, urban/rural at age 12, relationship to HH head, ever migrated...that's why I didn't post a sample of the actual dataset.
> 
> Thanks,
> Holly
> _______________________________________________________
> From      Ronnie Babigumira <[email protected]>
> To        [email protected]
> Subject   Re: st: RE: How do I create a calendar year variable by person id before reshaping to person-year dataset?
> Date      Thu, 6 Feb 2014 09:40:12 +0100
> Showing an example of the actual data you are trying to reshape will
> help because, following your previous posting, the solution Maarten
> shared, and this new information about birth1---10, this is what you
> would be trying to reshape.
> 
>    id   age   sex   birthyear   year   birth1   birth2
>     1     5     F        1995   1995     2010     2012
>     1     5     F        1995   1996     2010     2012
>     1     5     F        1995   1997     2010     2012
>     1     5     F        1995   1998     2010     2012
>     1     5     F        1995   1999     2010     2012
>     2     3     M        1997   1997     2012     2013
>     2     3     M        1997   1998     2012     2013
>     2     3     M        1997   1999     2012     2013
>     3    10     F        1998   1998     2009     2011
> 
> I doubt that your data look like this
> 
> Ronnie
> ________________________________________
> From: Holly E Reed
> Sent: Wednesday, February 05, 2014 2:23 PM
> To: [email protected]
> Subject: RE: How do I create a calendar year variable by person id before reshaping to person-year dataset?
> 
> Thank you for your help, Maarten.  It worked great.  But now I am receiving error messages when I try to reshape the data.  No matter how I reshape, it tells me that the data are already in that format: "Data are already wide" or "Data are already long".  I have tried to do this several times, but with no luck yet.
> 
> This is my code:
> 
> reshape wide birth, i(id) j(year)
> 
> birth is a variable with suffix 1-10 (e.g., birth1, birth2, birth3, etc.) which is the year of a woman's first birth, second birth etc.
> 
> Sometimes the error message says "variable year not found"; I thought that year was a new variable that would be created? And once it said "i=id does not uniquely identify the observations; there are multiple observations with the same value of id." But I thought that was the point!?
> 
> If you can shed some light on these issues, I would appreciate it!
> Thanks, Holly
> _____________________________________
> 
> by id : gen year = birthyear + _n -1
> 
> also look at -help stsplit- as that command is there for creating such datasets.
> 
> Hope this helps,
> Maarten
> __________________________________________
> Does the age of the person increase each year?
> 
> If so, you could use:
> gen year = age+birthyear
> 
> If age does not increase each year, how do you know which year an
> observation belongs to?
> For example, how do you know the records aren't sorted like this:
> 
> id     age    sex    birthyear    year
> 1       5        F       1995        1999
> 1       5        F       1995        1998
> 1       5        F       1995        1997
> 1       5        F       1995        1996
> 1       5        F       1995        1995
> 
> 
> Mike
> 
> _______________________________________
> From: Holly E Reed
> Sent: Wednesday, February 05, 2014 12:03 PM
> To: [email protected]
> Subject: How do I create a calendar year variable by person id before reshaping to person-year dataset?
> 
> Hi,
> 
> I am trying to create a person-year dataset for event history analysis. The dataset currently has one observation per person per year of their life, e.g.:
> 
> id     age    sex    birthyear
> 1       5        F       1995
> 1       5        F       1995
> 1       5        F       1995
> 1       5        F       1995
> 1       5        F       1995
> 2       3        M      1997
> 2       3        M      1997
> 2       3        M      1997
> 
> So person with id==1 is a 5-year old female born in 1995 and person with id==2 is a 3-year old male born in 1997. This is a simplified example to illustrate the dataset, as they are all adults and there are far more observations for each individual.
> 
> The problem is that I have age and birthyear variables, but I want to create a calendar year variable before reshaping the data to person-year data.  What is the easiest way to do this?  In other words, I want the dataset to look like this:
> 
> id     age    sex    birthyear    year
> 1       5        F       1995        1995
> 1       5        F       1995        1996
> 1       5        F       1995        1997
> 1       5        F       1995        1998
> 1       5        F       1995        1999
> 2       3        M      1997        1997
> 2       3        M      1997        1998
> 2       3        M      1997        1999
> 
> Thank you very much for any help you can give me!
> Holly
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/


