
From: n j cox <n.j.cox@durham.ac.uk>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: creating differences when time periods are misssing
Date: Thu, 01 Nov 2007 09:54:17 +0000

This problem is a twist away from one discussed in an FAQ:

How can I replace missing values with previous or following nonmissing values or within sequences?

http://www.stata.com/support/faqs/data/missing.html

Scott needs only to fill each gap of missing values with the previous
nonmissing value; after that, all is plain sailing.

gen amo2 = amo

bysort id (year) : replace amo2 = amo2[_n-1] if missing(amo)

Then

by id : gen dt2 = cond(amo == ., ., d.amo2)

I would make no claims about efficiency except that this should beat

1. any loop

2. fixing by hand

This should also fix gaps longer than one year.
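
As a check, here is a minimal sketch that applies the same steps to the
id 27 values from Scott's listing. It assumes the data have been declared
as a panel with -tsset id year- (the D. operator needs that), and it drops
the -by id:- prefix on the last line because D. already works within
panels once the data are -tsset-. The -input- block just retypes the
example values; nothing else is assumed about the real dataset.

```stata
* Retype the example values for id 27 (amo = age in months)
clear
input id year amo
27 1997 189
27 1998 .
27 1999 226
27 2000 237
27 2001 247
27 2002 259
27 2003 273
27 2004 283
end

* Declare the panel structure so that D. is defined
tsset id year

* Carry the last nonmissing age forward into gap years
gen amo2 = amo
bysort id (year) : replace amo2 = amo2[_n-1] if missing(amo)

* Difference the filled series, but leave uninterviewed years missing
gen dt2 = cond(missing(amo), ., D.amo2)

list id year amo amo2 dt2, sepby(id)
```

With these values dt2 is 37 (that is, 226 - 189) in 1999 and 11 in 2000,
and it stays missing in 1998, when there was no interview. It is also
missing in 1997 here, only because this toy example has no earlier year
to difference against.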

Nick

n.j.cox@durham.ac.uk

Scott Cunningham wrote:

--------------------------------------------------------------------------------

My data is a longitudinal dataset of individuals who were interviewed
from 1997 to 2004. I have data on individual ages (measured in months
since birth month). Because interviews did not always take place
exactly 12 months after the previous one, I have been trying to control
for differences in time since the last interview by differencing ages
like so:

. gen dt=d.amo

where "amo" is "age in months." I notice that this works so long as I

have values of amo in both the current and previous year. But there

are some people who disappear from the survey only to return a year

later. They look like this:

     +---------------------------------+
     | id   rp   age   amo   dt   year |
     |---------------------------------|
 56. | 27    0    15   189   12   1997 |
 57. | 27    .     .     .    .   1998 |
 58. | 27    4    18   226    .   1999 |
 59. | 27    3    19   237   11   2000 |
 60. | 27    8    20   247   10   2001 |
 61. | 27    6    21   259   12   2002 |
 62. | 27    4    22   273   14   2003 |
     |---------------------------------|
 63. | 27    1    23   283   10   2004 |
     +---------------------------------+

The relevant variables are: id (indicating this is the same person),
amo (age in months on day of interview), dt (time since last
interview), and year. Ignore the "rp" variable, but note that it
depends on "dt", since it measures something done since the date of
the last interview.

So, the problem is that "dt" is missing twice: once in the year when all
values are missing because the person was not interviewed, and again
in the year when he comes back in. Ideally, I would like to know how to
create a differenced value for dt equal to (226-189) in that second
case, since the respondent is 226 months old on the day of the
interview and was 189 months old the last time he was interviewed.
What's the most efficient code to do this?
