Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: replacing missing time period data with next closest period


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: replacing missing time period data with next closest period
Date   Fri, 25 Mar 2011 17:12:03 +0000

I suggested a solution on 25 March for different hypothetical names. See
<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1103/date/article-1026.html>

bysort state (date) : ipolate policy_on_X date, gen(policy_on_X_2)
replace policy_on_X_2  = cond(policy_on_X_2 == 0.5, ., round(policy_on_X_2))

Converting to Jieyu's names, I find that

bysort state (month) : ipolate option month, gen(option2) epolate
replace option2  = cond(option2 == 0.5, ., round(option2))

gives the same result as the code here, apart from that advertised:
Jieyu does not replace 0.5 with missing. Note the extra tweak of the
-epolate- option to extrapolate as Jieyu does. Extrapolation was not
part of the original specification but it's there for free.

I'd also comment that it is always prudent to keep a copy of the
original data, and not to overwrite it.

Nick

On Fri, Mar 25, 2011 at 4:48 PM, Jieyu Wang <[email protected]> wrote:
> Here is an example you may want to look at:
>  (this example replace current observation with previous period if it
> is in the middle)
>
> **********************
> clear
>
> set obs 24
> generate str10 state="Texas"
> replace  state="New york" in 13/24
> bysort state: generate month=_n
> generate option=.
> replace option=1 in 1
> replace option=0 in 2
> replace option=0 in 5
> replace option=1 in 10
> replace option=1 in 14/15
> replace option=0 in 19/20
> //prepare the data
>
> list
>
> sort state month
>
> bysort state: generate nomissing=_n if ! missing(option)
>
> generate flagbefore=nomissing
> bysort state: replace flagbefore=flagbefore[_n-1] if missing(flagbefore)
>
>
> bysort state: generate flaginverse=13-month
> sort state flaginverse
> generate flagafter=nomissing
> bysort state: replace flagafter=flagafter[_n-1] if missing(flagafter)
>
> drop flaginverse
> sort state month
> list
>
> bysort state: egen min=min(flagbefore)
> bysort state: egen max=max(flagafter)
> replace flagbefore=min if missing(flagbefore)
> replace flagafter=max if missing(flagafter)
> //generate flagbefore and flagafter to indicate the nearest nonmissing index
>
> sort state month
> list
>
> bysort state: replace option=cond( _n-flagbefore[_n]>flagafter-_n  ,
> option[flagafter[_n]], option[flagbefore[_n]] )
>  list
>  //replace within closest period
>
> ***********************
>
> 2011/3/16 Doug Hess <[email protected]>:
>> Hi.
>>
>> I'm hoping there are a set of commands that will help me edit cells
>> for this missing data problem.
>>
>> Each row in my dataset is a month for each state (and DC) over five
>> years (i.e., 51x5x12= 3,060 rows). The columns are binary values for
>> the reported presence of a policy that some states in some months have
>> implemented and others have not. Unfortunately, the months in which
>> states report the existence (or not) of the policy option varies by
>> year, and only the months when the option is reported have been coded.
>>
>> Thus, a state that reports not having selected a policy option in
>> April 2001 and then reports adopting the policy in August 2001, with
>> no reports in between, would have the value of 0 for April and 1 for
>> August, but missing "." for the other 10 months of the year (i.e.,
>> missing until the next report).
>>
>> I would like to fill in the months in between reports with the value
>> of the next closest month that has a value, and default to zero if the
>>  missing month is equidistant to a 0 or 1. I.e., in the example above,
>> assuming the policy had not changed before or after these reported
>> months, the cells from January thru June = 0, July thru December= 1.
>>
>> I'm starting to think I'll just do this by hand, but I wonder if
>> there's some nifty "if then...do" routine that can be created as I
>> would like to add more years and do this for a large number of policy
>> variables.
>>
>> If it matters: I'm using Stata 11 and don't have Stata programming
>> language experience (as opposed to issuing commands). My computer
>> programming skills stopped circa 1982 when I learned Basic on an Apple
>> IIe. So, if there's a choice between clunky commands and elegant
>> programming with the same results, I might go for the clunky commands.
>> But I am happy to learn if there's a resource I need to consult to
>> understand any advice or tips that are provided.
>>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index