Statalist



st: RE: Handling missing values for dates


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Handling missing values for dates
Date   Wed, 1 Apr 2009 18:21:20 +0100

What's best for your project is difficult to say without context. 

I'd just draw your attention to -min()- and -max()-. You might want to
work with -min(date1, date2)- or -max(date1, date2)- bearing in mind
that -min()- and -max()- ignore missings to the extent possible. 
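
As a minimal sketch of that behaviour, the toy dataset below (the variable names date1 and date2 and the values are made up for illustration) shows that -min()- and -max()- return missing only when every argument is missing:

```stata
* Toy data: date1 and date2 are hypothetical numeric date variables
clear
input date1 date2
100 120
100   .
  . 120
  .   .
end

* min() and max() use whichever arguments are nonmissing
generate earliest = min(date1, date2)
generate latest   = max(date1, date2)

* earliest is 100, 100, 120 and missing; latest is 120, 100, 120 and missing
list
```

Only the last observation, where both dates are missing, ends up with missing results.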

Your final line is legal for numeric variables. 
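
A hedged sketch of that line in practice, assuming Date_2 and Date_3 are numeric Stata dates as in the question (the indicator Date_3_filled is a hypothetical name added here). Using -missing()- rather than ==. also catches extended missing values such as .a, and keeping an indicator lets you undo or audit the fill later:

```stata
* Record which observations are about to be filled in (hypothetical flag variable)
generate byte Date_3_filled = missing(Date_3) & !missing(Date_2)

* Same idea as the original line, but also covers .a, .b, ..., .z
replace Date_3 = Date_2 if missing(Date_3)

* Were the dates held as strings, the test would be for the empty string instead:
* replace Date_3 = Date_2 if Date_3 == ""
```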

Nick 
n.j.cox@durham.ac.uk 

Tomas M

Suppose I have several date variables, each constructing a timeline of
events.

Say, Date 1 < Date 2 < Date 3 < Date 4.

They are supposed to follow in chronological order.

For a few cases, suppose Date 3 occurs before Date 2 (i.e. the values in
Date 3 are in error).

My question is, if I wanted to calculate the window from (Date 1 to Date
2), or (Date 1 to Date 3), would it be better to fill in the data with
the most conservative estimate possible?

For example, if the values for Date 3 are in error, then the LEAST
conservative estimate for (Date 1 to Date 3) is equal to (Date 1 to Date
2).  And by saying (Date 1 to Date 2), etc., I mean calculating the
difference between the dates.

Conversely, the MOST conservative estimate for (Date 1 to Date 3) is
equal to (Date 1 to Date 4), since Date 3 is supposed to occur between
Date 2 and 4.
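
One way to sketch these two bounds in Stata, assuming Date_1 to Date_4 are numeric daily dates (the variable names follow the line of code further down; out_of_order, window13_lo and window13_hi are hypothetical names introduced here):

```stata
* Flag observations where Date_3 falls outside [Date_2, Date_4].
* Missing sorts above every number in Stata, so missings are excluded explicitly.
generate byte out_of_order = (Date_3 < Date_2 | Date_3 > Date_4) ///
    & !missing(Date_2, Date_3, Date_4)

* Shortest and longest possible (Date 1 to Date 3) windows, in days
generate window13_lo = Date_2 - Date_1
generate window13_hi = Date_4 - Date_1

* How many observations are affected?
count if out_of_order
```

Either bound is missing whenever one of the dates it uses is missing.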

What should I do with these missing values then?

Should I just exclude from analyses?

Would this coding work?

replace Date_3 = Date_2 if Date_3==.


