Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: Keep variables if a value is available for a specific date


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: Keep variables if a value is available for a specific date
Date   Mon, 14 May 2012 09:21:12 +0100

No idea what IRR is, but I think I see what you mean.

You just modify the approach to keep the date used. The principles are
exactly the same as spelled out in the FAQ. Consider this fake
example:

clear
set obs 2000
gen id = 1 + (_n > 1000)
bysort id : gen date = mdy(12,31,2008) + _n
format date %td
gen price = ceil(100 * runiform())
replace price = . if runiform() < 0.9

gsort id -date
gen date2 = date
format date2 %td
by id : replace date2 = date2[_n-1] if missing(price)
by id : replace price = price[_n-1] if missing(price)

bysort id (date): list if day(date[_n+1]) == 1 &
inlist(month(date[_n+1]), 1, 4, 7, 10)

---------------------------------------------------------------------------------------------------------------------------------
-> id = 1

      +------------------------------------+
      | id        date   price       date2 |
      |------------------------------------|
  90. |  1   31mar2009      43   22apr2009 |
 181. |  1   30jun2009      40   30jun2009 |
 273. |  1   30sep2009      14   15oct2009 |
 365. |  1   31dec2009      30   10jan2010 |
 455. |  1   31mar2010      80   10apr2010 |
      |------------------------------------|
 546. |  1   30jun2010      15   05jul2010 |
 638. |  1   30sep2010      93   16oct2010 |
 730. |  1   31dec2010      98   17jan2011 |
 820. |  1   31mar2011      52   31mar2011 |
 911. |  1   30jun2011      79   30jun2011 |
      +------------------------------------+

---------------------------------------------------------------------------------------------------------------------------------
-> id = 2

      +------------------------------------+
      | id        date   price       date2 |
      |------------------------------------|
  90. |  2   31mar2009      63   31mar2009 |
 181. |  2   30jun2009      72   27jul2009 |
 273. |  2   30sep2009      23   02oct2009 |
 365. |  2   31dec2009      77   11jan2010 |
 455. |  2   31mar2010      11   09apr2010 |
      |------------------------------------|
 546. |  2   30jun2010      51   02jul2010 |
 638. |  2   30sep2010      17   03oct2010 |
 730. |  2   31dec2010      76   24jan2011 |
 820. |  2   31mar2011       6   02apr2011 |
 911. |  2   30jun2011      98   04jul2011 |
      +------------------------------------+




On Mon, May 14, 2012 at 8:33 AM, Tanja Berg <[email protected]> wrote:
> Thanks, Nick.
>
> I think I can't do it that way, because the correct date is essential to my further work. I want to do an IRR calculation afterwards.
> So if the price for e.g. the 31.03. is missing and the next available price is for the 01.04. I need the price for that day AND the correct date, the 01.04. If I would copy the price from 01.04. upwards to the 31.03. my IRR calculation will mess up.
>
> Is there a solution for this problem?
>
> Best regards
> Tanja
> -------------------
>
>
> This is a three-stage operation. First, create a flag for missing data
> so that you can keep track of what you filled in.
>
> gen ismissing(price)
>
> Second, fill in missing prices with the next available price using the
> procedure described at
>
> FAQ     . . . . . . . . . . . . . . . . . . . . . . . Replacing missing values
>       . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>       2/03    How can I replace missing values with previous or
>               following nonmissing values?
>               http://www.stata.com/support/faqs/data/missing.html
>
> Third, keep only the dates of interest.
>
> Nick
>
> On Sun, May 13, 2012 at 4:44 PM, Tanja Berg <[email protected]> wrote:
>
>> My data set looks like the following:
>>
>> ID              Date            Price
>> 1               01.01.2005      100
>> 1               02.01.2005      100,9
>> 1               03.01.2005      100,5
>> 1               04.01.2005      100,3
>> ...             ....                    ....
>> 2               01.01.2005      100
>> 2               02.01.2005      100,3
>> 2               03.01.2005      a. (missing)
>> 2               04.01.2005      100,4
>> ...             ....                    ....
>>
>>
>> For each ID I only want to keep the observations for the 31.03., 30.06., 30.09. and 31.12. of each year in my dataset. But if e.g. the price on the 31.03. is missing I want to keep the next possible available price.
>> My dataset is so large I can't look at each observation separately. So I think I will need some kind of loop that checks if the price e.g. for the 31.03. is available and then keeps the observation or if the price is not available, checks the 01.04. if there is a price available and then keeps this observation etc.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index