



Re: st: difficulty recoding variable by referring to prior and subsequent lines in panel data


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: difficulty recoding variable by referring to prior and subsequent lines in panel data
Date   Fri, 8 Mar 2013 11:46:28 +0000

The last code segment should have been

clear
set obs 50
egen id = seq(), block(25)
egen t = seq(), to(25)
gen y = 1 + mod(_n, 2)
tsset id t
gen y2 = y
replace y2 = L.y if (L.y == F.y) & (y != F.y) & !missing(L.y, F.y)
twoway connected y y2 t , c(L L) by(id, col(1))

I copied and pasted the wrong segment.
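For anyone who prefers a numerical check to the graph, here is a minimal sketch on the same concocted data (to be run from a do-file). It only counts how many values the rule changes. With a strictly alternating series, every interior value should flip, which is the instability in question.

```stata
* Same concocted data as above: 2 panels of 25 alternating values
clear
set obs 50
egen id = seq(), block(25)
egen t = seq(), to(25)
gen y = 1 + mod(_n, 2)
tsset id t
gen y2 = y
replace y2 = L.y if (L.y == F.y) & (y != F.y) & !missing(L.y, F.y)

* Only the first and last observation of each panel should be untouched
count if y2 != y
display "changed: " r(N) " of " _N
```

If the reasoning is right, the count is 46: all 50 observations except the two ends of each panel.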

Nick


On Fri, Mar 8, 2013 at 11:21 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> Working backwards:
>
> 1. References using subscripts such as [_n-1] and [_n+1] are
> completely independent of whether -tsset- has been used previously.
>
> If it were otherwise, then users -- at any level -- would need to keep
> track of whether the data were currently -tsset- (something that could
> have happened in a previous session with Stata) in using subscript
> references. Similarly, anyone reading a code segment using subscript
> references could not interpret those references correctly without
> knowing about a prior -tsset-. Finally, programmers would have to take
> account of whether a -tsset- was in force in using subscripts, as the
> effects might be different.
>
> The more you think about it, the clearer it is that this would be a
> bad idea. The good news is that it is not how Stata works.
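A small sketch of point 1, using made-up data (the variable names id, t, y, prev_sub and prev_lag are illustrative only). After -tsset-, the subscript [_n-1] still means "the previous observation in the dataset", so it reaches across the panel boundary, whereas L. stays within the panel.

```stata
* Two panels of three observations each
clear
set obs 6
egen id = seq(), block(3)
egen t = seq(), to(3)
gen y = _n
tsset id t

* Subscript: previous observation in current sort order, whatever the panel
gen prev_sub = y[_n-1]

* Time-series operator: previous time point within the same panel
gen prev_lag = L.y

list, sepby(id)
```

In observation 4 (the first observation of panel 2), prev_sub holds the last value of panel 1, while prev_lag is missing.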
>
> 2. The problem stated could be tackled with subscripts so long as -by
> <panel identifier> (<time identifier>):- were the framework, say
>
> bysort panelid (time) :
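A minimal sketch of what point 2 could look like in full, on concocted data (id, t, y and y2 are illustrative names, and this is one possible way to write it, not the only one). Under -by:-, _n and _N are counted within each panel, so the subscripts cannot reach into a neighbouring panel. The code is meant to be run from a do-file because of the /// line continuation.

```stata
clear
set obs 10
egen id = seq(), block(5)
egen t = seq(), to(5)
gen y = 1 + mod(_n, 2)
gen y2 = y

* Within each panel, sorted by time: replace a value that differs from
* two equal neighbours, skipping the first and last observation
bysort id (t): replace y2 = y[_n-1] ///
    if y[_n-1] == y[_n+1] & y != y[_n+1] & _n > 1 & _n < _N

list, sepby(id)
```

The explicit conditions on _n are a guard: under -by:- an out-of-range subscript already returns missing, but spelling the guard out makes the intent clear.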
>
> 3. But it is easiest just to exploit the scope for using time series
> operators that -tsset- implies. See below.
>
> 4. The effects of subscript references and time series operators will
> differ in time series with gaps.
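To illustrate point 4, here is a small sketch with a made-up single panel in which time 3 is absent (all names and values are hypothetical). The subscript picks up the previous observation that happens to be present, while L. looks for the previous time point and returns missing when there is none.

```stata
clear
input id t y
1 1 10
1 2 20
1 4 40
1 5 50
end
tsset id t

* Previous observation within the panel, regardless of the gap
bysort id (t): gen prev_sub = y[_n-1]

* Value at time t - 1, missing if that time is not in the data
gen prev_lag = L.y

list
```

At t = 4 the two disagree: prev_sub is 20 (the value at t = 2) and prev_lag is missing.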
>
> I haven't tried to follow this code, which seems rather tangled, but
> trust that this self-contained example will prove instructive
>
> . clear
>
> . set obs 10
> obs was 0, now 10
>
> . egen id = seq(), block(5)
>
> . egen t = seq(), to(5)
>
> . gen y = 1 + mod(_n, 2)
>
> . l
>
>      +------------+
>      | id   t   y |
>      |------------|
>   1. |  1   1   2 |
>   2. |  1   2   1 |
>   3. |  1   3   2 |
>   4. |  1   4   1 |
>   5. |  1   5   2 |
>      |------------|
>   6. |  2   1   1 |
>   7. |  2   2   2 |
>   8. |  2   3   1 |
>   9. |  2   4   2 |
>  10. |  2   5   1 |
>      +------------+
>
> . tsset id t
>        panel variable:  id (strongly balanced)
>         time variable:  t, 1 to 5
>                 delta:  1 unit
>
> . gen y2 = y
>
> . replace y2 = L.y if (L.y == F.y) & (y != F.y) & !missing(L.y, F.y)
> (6 real changes made)
>
> . l
>
>      +-----------------+
>      | id   t   y   y2 |
>      |-----------------|
>   1. |  1   1   2    2 |
>   2. |  1   2   1    2 |
>   3. |  1   3   2    1 |
>   4. |  1   4   1    2 |
>   5. |  1   5   2    2 |
>      |-----------------|
>   6. |  2   1   1    1 |
>   7. |  2   2   2    1 |
>   8. |  2   3   1    2 |
>   9. |  2   4   2    1 |
>  10. |  2   5   1    1 |
>      +-----------------+
>
> Here the principles suggested are
>
> 1. In general, always smooth a copy of the variable concerned.
>
> 2. Alison wants to test
>
> if previous and following values are the same: use L. and F. operators
>
> but different from the present value (this condition appears
> redundant, but does no harm)
>
> but to avoid the beginning and end of each panel (not using observations
> for which either or both of the previous and following values are
> missing appears sufficient to avoid this).
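The two claims in principle 2 can be checked with a short sketch on random binary data (the seed, the variable names and the data are all assumptions made for illustration). It compares the rule with and without the "different from the present value" condition, and confirms that panel ends are left alone.

```stata
clear
set obs 50
egen id = seq(), block(25)
egen t = seq(), to(25)
set seed 2013
gen y = runiform() < 0.5
tsset id t

* Rule as given, including the condition on the present value
gen y2 = y
replace y2 = L.y if (L.y == F.y) & (y != F.y) & !missing(L.y, F.y)

* Same rule without that condition
gen y3 = y
replace y3 = L.y if (L.y == F.y) & !missing(L.y, F.y)

* If the condition is redundant, both results agree everywhere
assert y2 == y3

* First and last observations of each panel should be unchanged
bysort id (t): assert y2 == y if _n == 1 | _n == _N
```

Both -assert- commands should pass silently. The condition is redundant because, when the present value already equals both neighbours, replacing it with L.y changes nothing.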
>
> 3. My concocted example also points up an instability in the method of
> "smoothing", perhaps better seen from
>
> clear
> set obs 10
> egen id = seq(), block(5)
> egen t = seq(), to(5)
> gen y = 1 + mod(_n, 2)
> l
> tsset id t
> gen y2 = y
> replace y2 = L.y if (L.y == F.y) & (y != F.y) & !missing(L.y, F.y)
> l
>
> On the other hand it may work well for smoothing isolated anomalies.
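A sketch of the favourable case, with a made-up panel that is constant apart from a single blip (values and names are illustrative):

```stata
clear
input id t y
1 1 0
1 2 0
1 3 1
1 4 0
1 5 0
end
tsset id t
gen y2 = y
replace y2 = L.y if (L.y == F.y) & (y != F.y) & !missing(L.y, F.y)
list
```

Only the value at t = 3 is changed, so y2 is 0 throughout. The neighbours of the blip are left alone because their own neighbours differ from each other.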
>
> Nick
>
> On Fri, Mar 8, 2013 at 3:27 AM, Alison El Ayadi <alisonelayadi@yahoo.com> wrote:
>
>> I am working with a binary variable in panel data and want to recode the variable at a particular time point to the opposite condition if the values at the prior and subsequent time points are equal to each other but not equal to the value at the reference time point.
>>
>> I initially used this code:
>>   *tsset data
>>   sort pt_id index_cl index_rh
>>   by pt_id: replace n_num = _n
>>   sort pt_id n_num
>>   tsset pt_id n_num
>>
>>   *take care of 'random' differences
>>   gen low_index_recode_ind = 1 if (low_index!=low_index[_n+1]) & (low_index!=low_index[_n-1]) & (low_index[_n-1]==low_index[_n+1]) & low_index!=. & low_index[_n-1]!=.
>>   gen low_index_new = low_index
>>   replace low_index_new = 1 if low_index_recode_ind==1 & low_index==0
>>   replace low_index_new = 0 if low_index_recode_ind==1 & low_index==1
>>
>>
>>
>> I found that this recoded a number of first and last lines. So I added two conditions on n_num at the end of the -if- qualifier:
>>   gen low_index_recode_ind = 1 if (low_index!=low_index[_n+1]) & (low_index!=low_index[_n-1]) & (low_index[_n-1]==low_index[_n+1]) & low_index!=. & low_index[_n-1]!=. & n_num!=1 & n_num!=_N
>> However, I am still finding that at least several final lines have been recoded.
>>
>> My understanding is that, once the data are tsset, Stata would not compare lines belonging to one pt_id with lines from the next pt_id in dataset sequence. Am I correct in this assumption? If so, does anyone see any mistakes in my code that prevent this change from operating as intended for all lines, not just middle lines?
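One possible reason the condition n_num!=_N still lets final lines through, sketched on made-up data (pt_id and n_num follow the names in the question; last_global and last_in_panel are hypothetical). Outside a -by:- prefix, _N is the number of observations in the whole dataset, not in the panel, so n_num!=_N is true for almost every line.

```stata
clear
set obs 6
egen pt_id = seq(), block(3)
bysort pt_id: gen n_num = _n

* Without by:, _N is 6 here, so no n_num (at most 3) ever equals it
gen last_global = (n_num == _N)

* With by:, _N is the panel size, so this flags the last line of each panel
bysort pt_id: gen last_in_panel = (_n == _N)

list, sepby(pt_id)
```

Here last_global is 0 everywhere, whereas last_in_panel is 1 in observations 3 and 6.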
