Statalist



Re: st: How to turn my date variable into a variable Stata.10 can recognise?


From   Ekaterina Hertog <[email protected]>
To   [email protected]
Subject   Re: st: How to turn my date variable into a variable Stata.10 can recognise?
Date   Thu, 19 Mar 2009 21:58:10 +0000

Thank you very much for such detailed advice and for explaining where I made a mistake. I am really new to Stata, and am realising I am not using the available information resources very well. I will work on it. Thank you very much for the references to useful resources at the end of your email.
Sincerely yours,
Ekaterina

Michael Hanson wrote:
On Mar 19, 2009, at 4:13 PM, Ekaterina Hertog wrote:

I have got a dataset which contains dates of birth for individuals. These dates of birth look as follows: 19560413, and I am trying to turn them into date variables Stata can recognise.

To explore this issue, let's first create a simple toy dataset:

// Begin part 1 of example
// Store as long: the default float type cannot hold every 8-digit integer exactly
input long date_of_birth
19560413
19601223
19550721
19700105
end
list
// End part 1 of example
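A caution that applies to 8-digit dates like these: Stata's default storage type, float, cannot hold every integer above 16,777,216 exactly, so such values are safer stored as long or double. A minimal check, using only the built-in -float()- function to round a number to float precision:

```stata
// float() rounds its argument to float precision.
// A result of 0 means the value changed when rounded to float.
display float(19560413) == 19560413
// Show the rounded value in full, without scientific notation
display %12.0f float(19560413)
```

If the first line displays 0, a date of birth stored as float would no longer match the value that was typed in.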


It is a numeric variable and I have turned it into string.

OK, but we can roll that step into those listed below, rather than create an extra variable that you likely won't need later anyway.


The problem is that the following approach:

gen birth_date = date(strbirth_date, "DMY")
format birth_date %td

does not work: I just get missing values. Presumably that is because my date variable is not in the order day - month - year, but rather year - month - day.

So then do not tell Stata to use the wrong order!  Consider:

// Begin part 2 of example
gen birth_date = date(string(date_of_birth,"%8.0f"),"YMD")
format birth_date %td
list
// End part 2 of example

Notice the use of "YMD" -- the order in which the date elements appear -- rather than "DMY". This is alluded to in -help dates_and_times- when the "mask" of the -date()- function is mentioned; since only one example ("MDY") is given for -date()-, one might be forgiven for thinking that other masks are not possible. Yet your attempted mask doesn't match the example in the help file... nor is it appropriate for your data.
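To see how the mask has to follow the order of the elements in the string, here is a small sketch that can be typed at the command line. The literal strings are made up for illustration; only the first matches the format of the data in this thread.

```stata
// The same day, 13 April 1956, written in three different orders.
// In each call the mask names the order used in the string.
display %td date("19560413", "YMD")
display %td date("13/04/1956", "DMY")
display %td date("April 13, 1956", "MDY")
// A mask that does not match the string yields missing (.),
// which is what happened with "DMY" on year-month-day data
display date("19560413", "DMY")
```

The first three lines should each display the same date, while the last reproduces the missing values described above.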


I then thought I would redo the variable into a correct order and first tried to create 3 separate string variables out of each date: one for year, one for month and one for day.

I tried to do it as follows:
generate strbirth_date= string(date_of_birth, "%08.0f")
gen yob = substr(strbirth_date,1,4)
gen mob = substr(strbirth_date,5,6)
gen dob = substr(strbirth_date,7,8)

As a result 19560413 turned into:
yob=1956
mob=0413
dob=13

I do not understand why the month of birth (mob) did not transform correctly, or what I can do next.

Perhaps you thought Stata was Excel, or some other program(ming language) in which you specify the starting and ending characters for your substring extraction? But in -help string_functions-, it is clearly explained that the first number (n1) in -substr(s, n1, n2)- is the position from the start of the string, but the second number (n2) is the *length* of the substring. Hence, the correct way to extract the date elements you want is:

// Begin part 3 of example
gen yob = substr(string(date_of_birth,"%8.0f"),1,4)
gen mob = substr(string(date_of_birth,"%8.0f"),5,2)
gen dob = substr(string(date_of_birth,"%8.0f"),7,2)
list
// End part 3 of example
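The difference between an end position and a length can be checked directly on a literal string. The last line is an assumption about what might come next rather than something from the original question: if a date is still wanted from the separate pieces, -real()- converts each string to a number and -mdy()- assembles month, day and year into a Stata date.

```stata
// Start at position 5, length 6: runs to the end of the string, giving 0413
display substr("19560413", 5, 6)
// Start at position 5, length 2: the month, 04
display substr("19560413", 5, 2)
// Start at position 7, length 2: the day, 13
display substr("19560413", 7, 2)
// Assumed next step: build a date from the pieces with mdy(month, day, year)
display %td mdy(real("04"), real("13"), real("1956"))
```

For a whole variable, the single -date()- call in part 2 remains the shorter route; this only shows that the extracted pieces carry the same information.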


I would be very grateful for any advice as to how I can turn my date variable into a variable Stata10 can recognise,

The date (and string) functions in Stata are powerful, so they are worth learning. However, to use them correctly, there really is no substitute for reading the help files (or printed manuals) carefully.

Hope this helps,
Mike


P.S. The specification of the mask for the -date()- function has changed from lower case in Stata 9 and earlier to upper case in Stata 10 (and, I suspect, later). This can cause older programs that use the -date()- function, originally written for earlier versions of Stata, to misbehave or outright fail when run with Stata 10. A -version 9- command at the start of the program should remedy that situation, although I haven't checked.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


--
Ekaterina Hertog (née Korobtseva)
Career Development Fellow
Department of Sociology and Nissan Institute of Japanese Studies
University of Oxford

27 Winchester Road
Oxford
OX2 6NA
United Kingdom

