Re: st: Carry forward an observation within a time frame

From   David Kantor
Subject   Re: st: Carry forward an observation within a time frame
Date   Wed, 09 Oct 2013 00:15:44 -0400

At 09:08 PM 10/8/2013, Benigno Rodriguez wrote:
Dear Stata listers:

I have a panel dataset that consists of multiple visits on multiple subjects at some of which a CD4 value is obtained. My objective is to carry forward the CD4 value to visits for which it is missing, but only if a value is available within the 4 months prior.

I recognize the problem as one of spells, and have read Nick Cox's excellent article on spells from 2007, as well as his column on lists from 2002 (which I recognize as even more relevant to my problem), but despite his heroic efforts at a foolproof introduction to for and its variants, programming does not seem to get through my thick skull easily. Could I get a hand with this, ideally not involving code? Below is a relevant excerpt of the dataset, with the desired result in the last column.

Thank you very much in advance.

patid   date            CD4     desired
1007    5-May-55        .       .
1007    1-Jan-00        .       .
1007    3-Apr-02        5       5
1007    8-Apr-02        .       5
1007    11-Apr-02       .       5
1007    13-May-02       .       5
1007    14-May-02       4       4
1007    17-Jun-02       9       9
1007    12-Nov-02       .       .
1007    27-Jan-03       6       6
1007    17-Mar-03       .       6
1007    14-Apr-03       0       0

There are two issues to be dealt with:
1: determining what is "4 months prior";
2: carrying values forward.

For the first matter, you need to decide what is meant by "4 months prior". Is that 120 (or 121) days? Or is it a difference of no more than 4 calendar months (e.g., April to August), regardless of the day of the month? Or is it a span of four months, to the same day of the month (e.g., April 12 to August 12)?

The first option listed is the easiest. You can generate a date-difference variable:
by patid (date): gen int datediff = date-date[_n-1]
--then screen for datediff<=120 (or 121 or whatever).

For the second option,...
gen int m = mofd(date)
by patid (date): gen int mdiff = m - m[_n-1]
--then screen on mdiff <=4

For the third option, use the m and mdiff defined above, and...
gen byte d = day(date)
by patid (date): gen byte ddiff = d- d[_n-1]
-- then screen on the condition mdiff<4 | (mdiff==4 & ddiff<=0)
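
The three definitions can disagree on the same pair of visits. The sketch below builds a small made-up dataset (three hypothetical patients, two visits each) and turns each screening condition into an indicator variable. The names ok_days, ok_months and ok_anniv, the 120-day cutoff, and the toy dates are assumptions for illustration only; they are not part of the original data.

```stata
* Toy data: each patient has a first visit on 12 Apr 2002 and one later visit.
clear
input int patid str9 datestr
1 "12apr2002"
1 "01aug2002"
2 "12apr2002"
2 "12aug2002"
3 "12apr2002"
3 "30aug2002"
end
gen int date = date(datestr, "DMY")
format date %td

* Building blocks, as in the text above
bysort patid (date): gen int datediff = date - date[_n-1]
gen int m = mofd(date)
gen byte d = day(date)
by patid: gen int mdiff = m - m[_n-1]
by patid: gen byte ddiff = d - d[_n-1]

* One indicator per definition. On the first visit of each patient the
* differences are missing, and missing compares as larger than any number,
* so all three indicators are 0 there.
gen byte ok_days   = datediff <= 120
gen byte ok_months = mdiff <= 4
gen byte ok_anniv  = mdiff < 4 | (mdiff == 4 & ddiff <= 0)

list patid date datediff mdiff ddiff ok_*, sepby(patid)
```

In this toy data, the 12 Apr to 12 Aug pair is 122 days apart, so it fails a 120-day rule but passes the other two; the 12 Apr to 30 Aug pair passes only the calendar-month rule.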

When I write "screen on", I mean to use it in filtering the carrying step -- that is, use it as screening_condition in what follows.

To do the carrying, you can do a direct replace operation, or use carryforward (from SSC):
1: direct replace:
by patid (date): replace CD4 = CD4[_n-1] if mi(CD4) & _n>1 & screening_condition

2: carryforward:
by patid (date): carryforward CD4 if screening_condition, replace

Either of the carrying techniques can be modified to generate a separate variable, rather than replacing the original.
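
As a worked sketch, here is the first screening option (a 120-day cutoff) combined with the direct replace, applied to the excerpt from the question and writing to a separate variable. It assumes the dates arrive as strings with two-digit years and that 2020 is a sensible top year for date(), so that "55" is read as 1955; the names datestr and CD4_cf are hypothetical.

```stata
clear
input int patid str9 datestr byte(CD4 desired)
1007 "5-May-55"  . .
1007 "1-Jan-00"  . .
1007 "3-Apr-02"  5 5
1007 "8-Apr-02"  . 5
1007 "11-Apr-02" . 5
1007 "13-May-02" . 5
1007 "14-May-02" 4 4
1007 "17-Jun-02" 9 9
1007 "12-Nov-02" . .
1007 "27-Jan-03" 6 6
1007 "17-Mar-03" . 6
1007 "14-Apr-03" 0 0
end

* Convert to a numeric daily date; the third argument is the top year
* used to resolve two-digit years (an assumption about this data).
gen int date = date(datestr, "DMY", 2020)
format date %td

* Screening variable: days since the previous visit of the same patient
bysort patid (date): gen int datediff = date - date[_n-1]

* Carry into a copy so that the original CD4 is left untouched
gen CD4_cf = CD4
by patid: replace CD4_cf = CD4_cf[_n-1] if mi(CD4_cf) & _n > 1 & datediff <= 120

list patid date CD4 CD4_cf desired, sepby(patid)
```

With this cutoff the 12-Nov-02 visit stays missing, because it falls 148 days after the 17-Jun-02 visit.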

The various operations and expressions that I've outlined to obtain the screening_condition can be folded into a single expression, avoiding the creation of intermediary variables (m, d, mdiff, ddiff, datediff). But it may be easier to manage in the way I've outlined. You may want to generate an indicator variable for that purpose. (If you formulate an expression involving [_n-1], it can go into the direct replace operation; it can't go directly into the carryforward call, though it may be possible to get the same effect with the dynamic_condition option. It is probably easiest to generate an indicator variable.)
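
One point to keep in mind when building the indicator: the screening conditions above compare each visit with the visit immediately before it, so a chain of closely spaced visits can carry a value further than the cutoff from the visit where it was actually measured. If the window should instead be counted from the last visit with a measured CD4, a possible sketch is below. This is one reading of the question, not something stated in it; the toy data, the 120-day cutoff, and the names lastdate, ok_last and CD4_last are assumptions.

```stata
* Toy data: one measurement, then two visits 59 and 134 days after it.
clear
input int patid str9 datestr byte CD4
1 "01jan2002" 5
1 "01mar2002" .
1 "15may2002" .
end
gen int date = date(datestr, "DMY")
format date %td

* lastdate: date of the most recent visit at which CD4 was measured
bysort patid (date): gen int lastdate = date if !mi(CD4)
by patid: replace lastdate = lastdate[_n-1] if mi(lastdate) & _n > 1

* Indicator: within 120 days of that measurement
* (evaluates to 0 when lastdate is missing, since missing exceeds 120)
gen byte ok_last = (date - lastdate) <= 120

* Carry into a copy, screened by the indicator
gen CD4_last = CD4
by patid: replace CD4_last = CD4_last[_n-1] if mi(CD4_last) & _n > 1 & ok_last

list patid date CD4 lastdate ok_last CD4_last
```

Here the 15 May visit is only 75 days after the previous visit, so a visit-to-visit rule would fill it, but it is 134 days after the measurement, so this version leaves it missing.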

See -help dates- for an explanation of date() and mofd().
See -help carryforward- if you download that module.

