So it seems Ekaterina can employ several methods, all based on the -cond()-
function. She can read up on best practice with regard to this function in
Nick's column: http://www.stata-journal.com/sjpdf.html?articlenum=pr0016
If she is willing to make sure that all her two-digit years hail from the
20th century, she could
*************
clear*
input edu_start_date_1
197104
197504
196504
196904
8804
8404
6304
8304
end
compress
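* convert the numeric variable to a string
tostring edu_start_date_1, gen(stredu1st)
* leave yyyymm as is; split yymm into "yy mm"
gen mydate=cond(length(stredu1st)>4, ///
stredu1st, ///
substr(stredu1st,1,2)+" "+substr(stredu1st,3,2))
* the "19" in the mask makes date() read two-digit years as 19yy
gen edu1st = cond(length(stredu1st)>4, ///
date(mydate, "YM"), ///
date(mydate, "19YM"))
format edu1st %td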
list edu_start_date_1 mydate edu1st, noobs
*************
force a space into the string after the two year digits, so that the
-date()- function can parse it.
Alternatively, she could make the same assumption and prefix "19" to all
four-digit strings:
******************
clear*
input edu_start_date_1
197104
197504
196504
196904
8804
8404
6304
8304
end
compress
* convert the numeric variable to a string
tostring edu_start_date_1, gen(stredu1st)
* prefix "19" to the four-digit (yymm) strings only
gen mydate=cond(length(stredu1st)>4, ///
stredu1st, ///
"19"+stredu1st)
gen edu1st = date(mydate, "YM")
format edu1st %td
list edu_start_date_1 mydate edu1st, noobs
******************
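Both variants return a daily date, namely the first day of the month in
question. As her data carry only year and month, she might prefer a monthly
date. A minimal sketch, assuming -edu1st- was created as above (the name
-edu1stm- is merely my invention):
******************
* convert the daily date into a monthly date
gen edu1stm = mofd(edu1st)
format edu1stm %tm
list edu_start_date_1 edu1st edu1stm, noobs
******************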
She should check very carefully whether the results match her expectations
:-)
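A minimal sketch of such a check, assuming -edu1st- exists as created above
and that every date should indeed fall into the 20th century. The bounds and
the name -checkyear- are my assumptions and should be adapted to her data:
******************
* how many nonmissing inputs failed to convert?
count if missing(edu1st) & !missing(edu_start_date_1)
* stop with an error if any converted year lies outside 1900-1999
assert inrange(year(edu1st), 1900, 1999) if !missing(edu1st)
* inspect the distribution of the resulting years
gen checkyear = year(edu1st)
tabulate checkyear, missing
******************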
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Ekaterina Hertog
Sent: Sunday, 3 May 2009 15:48
To: [email protected]
Subject: st: turning numbers into dates
Dear all,
I have got a variable containing the month and year an individual started
his or her education. Only Stata thinks the values in this variable are
numbers and I want to turn them into dates.
If all the numbers followed the same pattern, that would not be a problem.
For example, I could do it like this:
tostring edu_start_date_1, gen(stredu1st)
gen edu1st = date(stredu1st, "YM")
My problem is that while most dates in my dataset come in the yyyymm
pattern:
e.g.
+----------+
| stredu~t |
|----------|
1. | . |
2. | 197104 |
3. | 197504 |
4. | 196504 |
5. | 196904 |
|----------|
several contain only yymm
e.g.
+-----------+
| edu_st~1 |
|-----------|
12338. | 8804 |
13265. | 8404 |
13666. | 6304 |
13831. | 8304 |
+-----------+
So when I run
gen edu1st = date(stredu1st, "YM")
all the yymm values in stredu1st are turned into missing values in edu1st.
I could of course edit the values containing only yymm into yyyymm pattern
manually, but this feels imprecise and prone to error and I would like to
automate the process if at all possible.
Is there a way to make the date command recognise alternating patterns?
I would be very grateful for any advice,
Sincerely yours,
Ekaterina
--
Ekaterina Hertog (nee Korobtseva)
Nissan Institute of Japanese Studies
27 Winchester Road, Oxford
OX2 6NA
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/