Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: replacing missing time period data with next closest period
From
Jieyu Wang <[email protected]>
To
[email protected]
Subject
Re: st: replacing missing time period data with next closest period
Date
Fri, 25 Mar 2011 11:48:10 -0500
Here is an example you may want to look at:
(this example replace current observation with previous period if it
is in the middle)
**********************
clear
set obs 24
generate str10 state="Texas"
replace state="New york" in 13/24
bysort state: generate month=_n
generate option=.
replace option=1 in 1
replace option=0 in 2
replace option=0 in 5
replace option=1 in 10
replace option=1 in 14/15
replace option=0 in 19/20
//prepare the data
list
sort state month
bysort state: generate nomissing=_n if ! missing(option)
generate flagbefore=nomissing
bysort state: replace flagbefore=flagbefore[_n-1] if missing(flagbefore)
bysort state: generate flaginverse=13-month
sort state flaginverse
generate flagafter=nomissing
bysort state: replace flagafter=flagafter[_n-1] if missing(flagafter)
drop flaginverse
sort state month
list
bysort state: egen min=min(flagbefore)
bysort state: egen max=max(flagafter)
replace flagbefore=min if missing(flagbefore)
replace flagafter=max if missing(flagafter)
//generate flagbefore and flagafter to indicate the nearest nonmissing index
sort state month
list
bysort state: replace option=cond( _n-flagbefore[_n]>flagafter-_n ,
option[flagafter[_n]], option[flagbefore[_n]] )
list
//replace within closest period
***********************
2011/3/16 Doug Hess <[email protected]>:
> Hi.
>
> I'm hoping there are a set of commands that will help me edit cells
> for this missing data problem.
>
> Each row in my dataset is a month for each state (and DC) over five
> years (i.e., 51x5x12= 3,060 rows). The columns are binary values for
> the reported presence of a policy that some states in some months have
> implemented and others have not. Unfortunately, the months in which
> states report the existence (or not) of the policy option varies by
> year, and only the months when the option is reported have been coded.
>
> Thus, a state that reports not having selected a policy option in
> April 2001 and then reports adopting the policy in August 2001, with
> no reports in between, would have the value of 0 for April and 1 for
> August, but missing "." for the other 10 months of the year (i.e.,
> missing until the next report).
>
> I would like to fill in the months in between reports with the value
> of the next closest month that has a value, and default to zero if the
> missing month is equidistant to a 0 or 1. I.e., in the example above,
> assuming the policy had not changed before or after these reported
> months, the cells from January thru June = 0, July thru December= 1.
>
> I'm starting to think I'll just do this by hand, but I wonder if
> there's some nifty "if then...do" routine that can be created as I
> would like to add more years and do this for a large number of policy
> variables.
>
> If it matters: I'm using Stata 11 and don't have Stata programming
> language experience (as opposed to issuing commands). My computer
> programming skills stopped circa 1982 when I learned Basic on an Apple
> IIe. So, if there's a choice between clunky commands and elegant
> programming with the same results, I might go for the clunky commands.
> But I am happy to learn if there's a resource I need to consult to
> understand any advice or tips that are provided.
>
> Thank you.
>
> Doug
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/