Re: st: Re: spells of missing values completely in between

From	Nick Cox <[email protected]>
To	[email protected]
Subject	Re: st: Re: spells of missing values completely in between
Date	Sun, 8 Apr 2012 00:08:57 +0100

I don't know about elegant, but here is one approach.

First identify spells of missing values:

. tsspell, cond(missing(var2))

. l

     +------------------------------------+
     | var1   var2   _seq   _spell   _end |
     |------------------------------------|
  1. |    1      .      1        1      0 |
  2. |    2      .      2        1      0 |
  3. |    3      .      3        1      1 |
  4. |    4     56      0        0      0 |
  5. |    5      .      1        2      0 |
     |------------------------------------|
  6. |    6      .      2        2      0 |
  7. |    7      .      3        2      1 |
  8. |    8     95      0        0      0 |
  9. |    9      .      1        3      1 |
 10. |   10     20      0        0      0 |
     |------------------------------------|
 11. |   11      .      1        4      0 |
 12. |   12      .      2        4      1 |
     +------------------------------------+

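Reclassify the first spell of missing values if it is at the start.
The running count sum(missing(var2)) equals _n only while every
observation so far has been missing.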

. replace _spell = 0 if sum(missing(var2)) == _n
(3 real changes made)
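
A minimal sketch of why that criterion picks out only a leading run of
missings. It rebuilds the example data by hand; the names runmiss and
leading are illustrative and are not created by tsspell.

```stata
* Rebuild the 12-observation example from the thread
clear
input var1 var2
 1  .
 2  .
 3  .
 4 56
 5  .
 6  .
 7  .
 8 95
 9  .
10 20
11  .
12  .
end
sort var1

* Running count of missing values up to and including this observation
* (runmiss is an illustrative name)
gen long runmiss = sum(missing(var2))

* The count can keep pace with _n only while every value so far is missing,
* so the comparison is true for observations 1 to 3 and false afterwards
gen byte leading = runmiss == _n

list var1 var2 runmiss leading, sep(0)
```

Once the first non-missing value is met, runmiss falls behind _n and can
never catch up, which is why later spells of missings are left alone.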

Reverse time and do the same with the last spell (now the first).

. gsort -var1

. replace _spell = 0 if sum(missing(var2)) == _n
(2 real changes made)

Now _spell is positive if and only if you have a spell of missing
values in the middle.

. sort var1

. l

     +------------------------------------+
     | var1   var2   _seq   _spell   _end |
     |------------------------------------|
  1. |    1      .      1        0      0 |
  2. |    2      .      2        0      0 |
  3. |    3      .      3        0      1 |
  4. |    4     56      0        0      0 |
  5. |    5      .      1        2      0 |
     |------------------------------------|
  6. |    6      .      2        2      0 |
  7. |    7      .      3        2      1 |
  8. |    8     95      0        0      0 |
  9. |    9      .      1        3      1 |
 10. |   10     20      0        0      0 |
     |------------------------------------|
 11. |   11      .      1        0      0 |
 12. |   12      .      2        0      1 |
     +------------------------------------+
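
If only an indicator is wanted, the same logic can be sketched without
tsspell and without re-sorting, by comparing the number of missings from
each observation to the end with the number of observations left. This is
an alternative sketch, not the approach shown above; miss, cummiss,
leading, trailing and interior are illustrative names.

```stata
* Rebuild the 12-observation example from the thread
clear
input var1 var2
 1  .
 2  .
 3  .
 4 56
 5  .
 6  .
 7  .
 8 95
 9  .
10 20
11  .
12  .
end
sort var1

gen byte miss    = missing(var2)
gen long cummiss = sum(miss)

* Leading run: every observation so far is missing
gen byte leading = cummiss == _n

* Trailing run: missings from here to the end equal observations from here
* to the end; cummiss[_N] is the total number of missings
gen byte trailing = (cummiss[_N] - cummiss + miss) == (_N - _n + 1)

* Interior missings are those in neither end run
gen byte interior = miss & !leading & !trailing

list var1 var2 leading trailing interior, sep(0)
```

With the example data, interior is 1 for observations 5, 6, 7 and 9 only,
matching the observations for which _spell is positive above.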

If it's important to you that the spells are numbered from 1 upwards,
you can re-number them.

. egen spell = group(_spell) if _spell
(8 missing values generated)

. l

     +--------------------------------------------+
     | var1   var2   _seq   _spell   _end   spell |
     |--------------------------------------------|
  1. |    1      .      1        0      0       . |
  2. |    2      .      2        0      0       . |
  3. |    3      .      3        0      1       . |
  4. |    4     56      0        0      0       . |
  5. |    5      .      1        2      0       1 |
     |--------------------------------------------|
  6. |    6      .      2        2      0       1 |
  7. |    7      .      3        2      1       1 |
  8. |    8     95      0        0      0       . |
  9. |    9      .      1        3      1       2 |
 10. |   10     20      0        0      0       . |
     |--------------------------------------------|
 11. |   11      .      1        0      0       . |
 12. |   12      .      2        0      1       . |
     +--------------------------------------------+
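
If the end goal is interpolation, as the original question mentions, note
that official Stata's -ipolate- fills only gaps between non-missing values
unless the epolate option is specified, so it may make a separate indicator
unnecessary. A hedged sketch on the same data follows; var2_ip is an
illustrative name, and linear interpolation is an assumption about what is
wanted.

```stata
* Rebuild the 12-observation example from the thread
clear
input var1 var2
 1  .
 2  .
 3  .
 4 56
 5  .
 6  .
 7  .
 8 95
 9  .
10 20
11  .
12  .
end
sort var1

* Linear interpolation of var2 on var1; without -epolate- the missings
* at the start and at the end of the series are left missing
ipolate var2 var1, generate(var2_ip)

list var1 var2 var2_ip, sep(0)
```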


See also http://www.stata.com/support/faqs/data/dropmiss.html
If memory serves me right, Gary Longton suggested the criterion
sum(missing(varname)) == _n for spells of missings at the beginning of
the data.
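
The same criterion carries over to panel data, because under by: both
sum() and _n are evaluated within each group. A minimal sketch, assuming a
panel identifier id and a time variable time (both hypothetical, as are the
made-up data and the names y, negtime and interior):

```stata
* Made-up two-panel example; none of these values come from the thread
clear
input id time y
1 1 .
1 2 5
1 3 .
1 4 7
1 5 .
2 1 3
2 2 .
2 3 .
2 4 4
end

* Start by flagging every missing value, then unflag the end runs
gen byte interior = missing(y)

* Leading missings within each panel
bysort id (time): replace interior = 0 if sum(missing(y)) == _n

* Trailing missings: reverse time within panel via a negated time variable,
* since -gsort- cannot be combined with by:
gen negtime = -time
bysort id (negtime): replace interior = 0 if sum(missing(y)) == _n

sort id time
drop negtime
list, sepby(id)
```

Here interior ends up 1 only for time 3 in panel 1 and times 2 and 3 in
panel 2.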


On Sat, Apr 7, 2012 at 12:01 PM, Abhimanyu Arora
<[email protected]> wrote:
> Somehow I feel the step in which the temp is generated can be omitted
> by a clever use of the -cond- option.
>
> On Sat, Apr 7, 2012 at 12:57 PM, Abhimanyu Arora
> <[email protected]> wrote:
>> Dear statalist
>> I was wondering if there is a more elegant solution than the one below,
>> involving SSC's tsspell by Nick Cox
>> . which tsspell
>> c:\ado\plus\t\tsspell.ado
>> *! 2.0.0 NJC 13 August 2002
>>
>> The aim is to create an indicator for spells of missing values
>> completely in between a series (excluding those at the beginning or
>> the end). (A part in the process of interpolating a time series,
>> basically)
>>
>> I used the following set of commands.
>>
>>
>> clear
>> set obs 12
>> gen var1=_n
>> tsset var1
>> input var2
>> .
>> .
>> .
>> 56
>> .
>> .
>> .
>> 95
>> .
>> 20
>> .
>> end
>> tsspell var2
>> egen temp=max( _spell)
>> gen ind=(var2==. & _spell==1)|(var2==. & temp==_spell)| (var2!=.)
>>
>> Cheers
>> Abhimanyu
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/


