Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: imputing dates into a string date


From   Sergiy Radyakin <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: imputing dates into a string date
Date   Fri, 7 Jun 2013 15:47:36 -0400

The discussion so far sounds quite confusing, and I still see no
reason to use regular expressions instead of the much more intuitive
-replace-.

Here is my approach (type in Stata and not in the browser):

net from http://www.adeptanalytics.org/radyakin/stata/cleandate/

Best, Sergiy.

. clear

.
. input str10 dirty

          dirty
  1. "//"
  2. "14//"
  3. "01/xx/2001"
  4. "xx/01/2001"
  5. "01/01/xxxx"
  6. end

.
. list

     +------------+
     |      dirty |
     |------------|
  1. |         // |
  2. |       14// |
  3. | 01/xx/2001 |
  4. | xx/01/2001 |
  5. | 01/01/xxxx |
     +------------+

.   cleandate dirty, gen(clean) d(15) m(06) y(2012) sep("/")

. list

     +-------------------------+
     |      dirty        clean |
     |-------------------------|
  1. |         //   15/06/2012 |
  2. |       14//   14/06/2012 |
  3. | 01/xx/2001   01/06/2001 |
  4. | xx/01/2001   15/01/2001 |
  5. | 01/01/xxxx   01/01/2012 |
     +-------------------------+
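
For comparison, here is a minimal sketch of the same imputation using only built-in commands (-split- and -replace-), for anyone who cannot install -cleandate-. It is not how -cleandate- is implemented. The variable names p_1 to p_3 and the rule that "xx" or "xxxx" marks a missing element are assumptions made for this illustration; the defaults (15, 06, 2012) are the ones passed to -cleandate- above.

```stata
clear
input str10 dirty
"//"
"14//"
"01/xx/2001"
"xx/01/2001"
"01/01/xxxx"
end

* Break each string into day, month and year pieces at the "/" separator
* (p_ is a hypothetical stub chosen for this sketch)
split dirty, generate(p_) parse(/)

* Assumption: "xx" and "xxxx" are placeholders for a missing element,
* so blank them out and treat them like empty pieces
foreach v of varlist p_1 p_2 p_3 {
    replace `v' = "" if inlist(lower(`v'), "xx", "xxxx")
}

* Fill each missing element with its own default
replace p_1 = "15"   if missing(p_1)
replace p_2 = "06"   if missing(p_2)
replace p_3 = "2012" if missing(p_3)

* Re-assemble the cleaned string date
generate str10 clean = p_1 + "/" + p_2 + "/" + p_3
list dirty clean
```

With the five example values above, clean should come out the same as in the listing, but check it against your own data before relying on it.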






On Fri, Jun 7, 2013 at 8:48 AM, Tim Evans <[email protected]> wrote:
> Nick, Joseph,
>
> Thank you for the comments - and I readily acknowledge the potential bias of imputing dates. The Stata tip is potentially one avenue for me to explore - but unfortunately I don't have access to that.
>
> Best wishes
>
> Tim
>
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
> Sent: 07 June 2013 13:12
> To: [email protected]
> Subject: Re: st: imputing dates into a string date
>
> Just to spell out what will be obvious to Joseph and Tim: Sometimes other dates provide constraints on what the dates might be.
> Nick
> [email protected]
>
>
> On 7 June 2013 12:51, Joseph Coveney <[email protected]> wrote:
>> Tim Evans wrote:
>>
>> Some time ago I had a problem with imputing dates into a string
>> variable where the date took the form:
>>
>> XX/01/2012
>>
>> In the thread below a solution was provided which worked great,
>> however, I now have data that take the form:
>>
>> /01/2012
>>
>> To this, I would like to impute a day of "1", but having tried to
>> amend the original code below
>>
>> g dx_clean = subinstr(dx, "XX", "01", 1)
>>
>> to
>>
>> g dx_clean = subinstr(dx, "", "01", 1)
>>
>> The result is that I return the same value i.e.
>> XX/01/2012
>>
>> Does anyone have a suggestion of how I can handle this please?
>>
>> --------------------------------------------------------------------------------
>>
>> If you've got missing elements other than just the day, it might be
>> better to use -split-, and impute the days, months and years
>> separately with their different defaults.  You can then re-assemble
>> the elements with simple string concatenation (or convert the imputed dates to a Stata date).
>>
>> Joseph Coveney
>>
>> . version 12.1
>>
>> .
>> . clear *
>>
>> . set more off
>>
>> .
>> . input str10 dx
>>
>>              dx
>>   1. "01//2001"
>>   2. "/01/2001"
>>   3. "01/01/"
>>   4. end
>>
>> .
>> . split dx, generate(d_) parse(/)
>> variables created as string:
>> d_1  d_2  d_3
>>
>> . replace d_1 = "15" if missing(d_1) // Missing days as approx. midmonth
>> (1 real change made)
>>
>> . replace d_2 = "06" if missing(d_2) // Missing months as approx. midyear
>> (1 real change made)
>>
>> . replace d_3 = "2012" if missing(d_3) // Missing year as most recent full year
>> (1 real change made)
>>
>> .
>> . generate int imputed_dt = date(d_3 + d_2 + d_1, "YMD")
>>
>> . format imputed_dt %tdCCYY-NN-DD
>>
>> .
>> . generate str10 clean_dx = d_1 + "/" + d_2 + "/" + d_3
>>
>> . list dx clean_dx imputed_dt, noobs abbreviate(20)
>>
>>   +------------------------------------+
>>   |       dx     clean_dx   imputed_dt |
>>   |------------------------------------|
>>   | 01//2001   01/06/2001   2001-06-01 |
>>   | /01/2001   15/01/2001   2001-01-15 |
>>   |   01/01/   01/01/2012   2012-01-01 |
>>   +------------------------------------+
>>
>> .
>> . exit
>>
>> end of do-file
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/
