Re: st: How to turn my date variable into a variable Stata.10 can recognise?
| From | Michael Hanson <[email protected]> | 
| To | [email protected] | 
| Subject | Re: st: How to turn my date variable into a variable Stata.10 can recognise? | 
| Date | Thu, 19 Mar 2009 17:47:47 -0400 | 
On Mar 19, 2009, at 4:13 PM, Ekaterina Hertog wrote:
> I have got a dataset which contains dates of birth for individuals  
> and these dates of birth look as follows: 19560413 and I am trying  
> to turn them into date variables Stata can recognise.
To explore this issue, let's first create a simple toy dataset:
// Begin part 1 of example
// long rather than the default float, which cannot hold 8-digit integers exactly
input long date_of_birth
19560413
19601223
19550721
19700105
end
list
// End part 1 of example
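A side note on storage, offered as a sketch rather than anything raised in the original question: Stata's default numeric type, float, is accurate to only about seven digits, so an 8-digit date such as 19560413 should be held as a long or a double. The lines below assume the toy variable name date_of_birth used above and only inspect values; they do not change the data.

```stata
// float keeps only about 7 digits, so an odd 8-digit integer gets rounded
display float(19560413) == 19560413     // shows 0 (false)

// the same number held in full (double) precision is intact
display %8.0f 19560413                  // shows 19560413

// check how your own variable is stored (look for long or double)
describe date_of_birth
```

If -describe- reports float for a real dataset, the last digit of some dates may already have been rounded when the data were read in, and it would be safer to re-import the variable as long or double than to recast it afterwards.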
> It is a numeric variable and I have turned it into a string.
OK, but we can roll that step into those listed below, rather than  
create an extra variable that you likely won't need later anyway.
> The problem is that the following approach:
> gen birth_date = date(strbirth_date, "DMY")
> format birth_date %td
> does not work: I just get missing values. Presumably that is because  
> my date variable is not in the order: day - month - year, but  
> rather year - month - day.
So then do not tell Stata to use the wrong order!  Consider:
// Begin part 2 of example
gen birth_date = date(string(date_of_birth,"%8.0f"),"YMD")
format birth_date %td
list
// End part 2 of example
Notice the use of "YMD" -- the order in which the date elements  
appear -- rather than "DMY".  This is alluded to in -help  
dates_and_times- when the "mask" of the -date()- function is  
mentioned;  since only one example ("MDY") is given for -date()-, one  
might be forgiven for thinking that other masks are not possible.   
Yet your attempted mask doesn't match the example in the help file...  
nor is it appropriate for your data.
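For comparison, here is a sketch of an alternative that skips the string conversion and splits the number arithmetically before handing the pieces to -mdy()-. The variable names b_year, b_month, b_day and birth_date2 are made up for this illustration, and the approach assumes date_of_birth is stored exactly (as a long or double).

```stata
// split 19560413 into 1956, 4 and 13 with integer arithmetic
gen b_year  = floor(date_of_birth/10000)
gen b_month = mod(floor(date_of_birth/100), 100)
gen b_day   = mod(date_of_birth, 100)

// mdy() takes its arguments in the order month, day, year
gen birth_date2 = mdy(b_month, b_day, b_year)
format birth_date2 %td
list date_of_birth birth_date2
```

Either route should give the same daily date; the -date()- version with the "YMD" mask is shorter, while the arithmetic version leaves the year, month and day available as numeric variables.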
> I then thought I would redo the variable into a correct order and  
> first tried to create 3 separate string variables out of each date:  
> one for year, one for month and one for day.
> I tried to do it as follows:
> generate strbirth_date= string(date_of_birth, "%08.0f")
> gen yob = substr(strbirth_date,1,4)
> gen mob = substr(strbirth_date,5,6)
> gen dob = substr(strbirth_date,7,8)
> As a result 19560413 turned into: yob=1956
> mob=0413
> dob=13
> I do not understand why the month of birth (mob) did not  
> transform correctly and what I can do next.
Perhaps you thought Stata was Excel, or some other program(ming  
language) in which you specify the starting and ending characters for  
your substring extraction?  But in -help string_functions-, it is  
clearly explained that the first number (n1) in -substr(s, n1, n2)-  
is the position from the start of the string, but the second number  
(n2) is the *length* of the substring.  Hence, the correct way to  
extract the date elements you want is:
// Begin part 3 of example
gen yob = substr(string(date_of_birth,"%8.0f"),1,4)
gen mob = substr(string(date_of_birth,"%8.0f"),5,2)
gen dob = substr(string(date_of_birth,"%8.0f"),7,2)
list
// End part 3 of example
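To make the start/length behaviour of -substr()- concrete, the sketch below runs it on a literal string and then reassembles the three pieces from part 3 into a daily date. The variable name birth_date3 is an assumption for this illustration; yob, mob and dob are the string variables created above.

```stata
// substr(s, n1, n2): start at character n1 and take n2 characters
display substr("19560413", 5, 2)    // shows 04
display substr("19560413", 5, 6)    // shows 0413: only four characters remain

// real() turns the string pieces into numbers; mdy() assembles the date
gen birth_date3 = mdy(real(mob), real(dob), real(yob))
format birth_date3 %td
list
```

The second -display- line reproduces the "0413" result from the question: asking for six characters starting at position 5 simply returns everything that is left.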
> I would be very grateful for any advice as to how I can turn my  
> date variable into a variable Stata 10 can recognise.
The date (and string) functions in Stata are powerful, so they are  
worth learning.  However, to use them correctly, there really is no  
substitute for reading the help files (or printed manuals) carefully.
Hope this helps,
Mike
P.S. The specification of the mask for the -date()- function has  
changed from lower case in Stata 9 and earlier to upper case in Stata  
10 (and, I suspect, later).  This can cause older programs that use  
the -date()- function, originally written for earlier versions of  
Stata, to misbehave or outright fail when run with Stata 10.  A  
-version 9- command at the start of the program should remedy that  
situation, although I haven't checked.