



Re: st: capturing the sizes of the sequences of continuous (uninterrupted) values equal to 1


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: capturing the sizes of the sequences of continuous (uninterrupted) values equal to 1
Date   Wed, 30 Nov 2011 09:49:02 +0000

Toy example using -tsspell- (SSC).
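
If -tsspell- is not yet installed, it can be fetched from SSC before running the example (this assumes a working internet connection from within Stata):

```stata
* Install the user-written -tsspell- command from SSC (needed only once)
ssc install tsspell
```

No seed is set in the example below, so the random 0/1 values will differ from run to run.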

clear
set obs 10
gen id = _n
forval j = 1/5 {
    gen time`j' = runiform() < 0.7
}

. l

     +--------------------------------------------+
     | id   time1   time2   time3   time4   time5 |
     |--------------------------------------------|
  1. |  1       1       1       1       0       0 |
  2. |  2       1       1       1       1       1 |
  3. |  3       1       1       1       1       1 |
  4. |  4       1       0       0       1       1 |
  5. |  5       1       1       0       1       0 |
     |--------------------------------------------|
  6. |  6       1       0       1       1       1 |
  7. |  7       1       1       1       0       1 |
  8. |  8       1       1       0       1       0 |
  9. |  9       1       1       1       1       1 |
 10. | 10       0       0       1       1       0 |
     +--------------------------------------------+

. reshape long time , i(id)
(note: j = 1 2 3 4 5)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                       10   ->      50
Number of variables                   6   ->       3
j variable (5 values)                     ->   _j
xij variables:
                  time1 time2 ... time5   ->   time
-----------------------------------------------------------------------------

. rename time state

. d

Contains data
  obs:            50
 vars:             3
 size:           650 (99.9% of memory free)
--------------------------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
--------------------------------------------------------------------------------------------------
id              float  %9.0g
_j              byte   %9.0g
state           float  %9.0g
--------------------------------------------------------------------------------------------------
Sorted by:  id  _j
     Note:  dataset has changed since last saved

. rename _j time

. tsset id time
       panel variable:  id (strongly balanced)
        time variable:  time, 1 to 5
                delta:  1 unit

. l

     +-------------------+
     | id   time   state |
     |-------------------|
  1. |  1      1       1 |
  2. |  1      2       1 |
  3. |  1      3       1 |
  4. |  1      4       0 |
  5. |  1      5       0 |
     |-------------------|
  6. |  2      1       1 |
  7. |  2      2       1 |
  8. |  2      3       1 |
  9. |  2      4       1 |
 10. |  2      5       1 |
     |-------------------|
 11. |  3      1       1 |
 12. |  3      2       1 |
 13. |  3      3       1 |
 14. |  3      4       1 |
 15. |  3      5       1 |
     |-------------------|
 16. |  4      1       1 |
 17. |  4      2       0 |
 18. |  4      3       0 |
 19. |  4      4       1 |
 20. |  4      5       1 |
     |-------------------|
 21. |  5      1       1 |
 22. |  5      2       1 |
 23. |  5      3       0 |
 24. |  5      4       1 |
 25. |  5      5       0 |
     |-------------------|
 26. |  6      1       1 |
 27. |  6      2       0 |
 28. |  6      3       1 |
 29. |  6      4       1 |
 30. |  6      5       1 |
     |-------------------|
 31. |  7      1       1 |
 32. |  7      2       1 |
 33. |  7      3       1 |
 34. |  7      4       0 |
 35. |  7      5       1 |
     |-------------------|
 36. |  8      1       1 |
 37. |  8      2       1 |
 38. |  8      3       0 |
 39. |  8      4       1 |
 40. |  8      5       0 |
     |-------------------|
 41. |  9      1       1 |
 42. |  9      2       1 |
 43. |  9      3       1 |
 44. |  9      4       1 |
 45. |  9      5       1 |
     |-------------------|
 46. | 10      1       0 |
 47. | 10      2       0 |
 48. | 10      3       1 |
 49. | 10      4       1 |
 50. | 10      5       0 |
     +-------------------+

. tsspell, cond(state==1)

. l

     +------------------------------------------+
     | id   time   state   _seq   _spell   _end |
     |------------------------------------------|
  1. |  1      1       1      1        1      0 |
  2. |  1      2       1      2        1      0 |
  3. |  1      3       1      3        1      1 |
  4. |  1      4       0      0        0      0 |
  5. |  1      5       0      0        0      0 |
     |------------------------------------------|
  6. |  2      1       1      1        1      0 |
  7. |  2      2       1      2        1      0 |
  8. |  2      3       1      3        1      0 |
  9. |  2      4       1      4        1      0 |
 10. |  2      5       1      5        1      1 |
     |------------------------------------------|
 11. |  3      1       1      1        1      0 |
 12. |  3      2       1      2        1      0 |
 13. |  3      3       1      3        1      0 |
 14. |  3      4       1      4        1      0 |
 15. |  3      5       1      5        1      1 |
     |------------------------------------------|
 16. |  4      1       1      1        1      1 |
 17. |  4      2       0      0        0      0 |
 18. |  4      3       0      0        0      0 |
 19. |  4      4       1      1        2      0 |
 20. |  4      5       1      2        2      1 |
     |------------------------------------------|
 21. |  5      1       1      1        1      0 |
 22. |  5      2       1      2        1      1 |
 23. |  5      3       0      0        0      0 |
 24. |  5      4       1      1        2      1 |
 25. |  5      5       0      0        0      0 |
     |------------------------------------------|
 26. |  6      1       1      1        1      1 |
 27. |  6      2       0      0        0      0 |
 28. |  6      3       1      1        2      0 |
 29. |  6      4       1      2        2      0 |
 30. |  6      5       1      3        2      1 |
     |------------------------------------------|
 31. |  7      1       1      1        1      0 |
 32. |  7      2       1      2        1      0 |
 33. |  7      3       1      3        1      1 |
 34. |  7      4       0      0        0      0 |
 35. |  7      5       1      1        2      1 |
     |------------------------------------------|
 36. |  8      1       1      1        1      0 |
 37. |  8      2       1      2        1      1 |
 38. |  8      3       0      0        0      0 |
 39. |  8      4       1      1        2      1 |
 40. |  8      5       0      0        0      0 |
     |------------------------------------------|
 41. |  9      1       1      1        1      0 |
 42. |  9      2       1      2        1      0 |
 43. |  9      3       1      3        1      0 |
 44. |  9      4       1      4        1      0 |
 45. |  9      5       1      5        1      1 |
     |------------------------------------------|
 46. | 10      1       0      0        0      0 |
 47. | 10      2       0      0        0      0 |
 48. | 10      3       1      1        1      0 |
 49. | 10      4       1      2        1      1 |
 50. | 10      5       0      0        0      0 |
     +------------------------------------------+

.
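
In the listing above, _seq looks like the quantity asked for: at each time it counts the uninterrupted 1s up to and including that time, and it is 0 wherever state is 0. As a minimal sketch, the same count can also be built with official Stata only and checked against -tsspell-. The variable names length and length2, the seed, and the long-format toy data are assumptions made for illustration; the by-group code also assumes no gaps in time within id.

```stata
* Sketch: running count of consecutive 1s ending at each time, per id
clear
set seed 2011                  // arbitrary seed, for reproducibility only
set obs 10
gen id = _n
expand 5                       // 5 observations per id
bysort id: gen time = _n       // time = 1,...,5 within each id
gen state = runiform() < 0.7

* Route 1: -tsspell- (SSC); _seq is the position within each spell of 1s
tsset id time
tsspell, cond(state == 1)
gen length = _seq

* Route 2: official Stata only. -replace- works through observations in
* order, so length2[_n-1] already holds the updated running count.
bysort id (time): gen length2 = state
by id: replace length2 = length2[_n-1] + 1 if state == 1 & _n > 1

* The two routes should agree on every observation
assert length == length2
```

If the wide layout is needed afterwards, -reshape wide- on the long data would put the counts back into one column per time.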


On Wed, Nov 30, 2011 at 9:36 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> You can't get this information given your data structure into a single
> Stata variable. What you seek is a matrix.
>
> If w <= 244, you could try concatenating your variables into a string
> variable holding individuals' history.
>
> But I guess this would be easier after -reshape long-. Then a spell is
> defined as a sequence with all 1s for the same id. See then
>
> SJ-7-2  dm0029  . . . . . . . . . . . . . . Speaking Stata: Identifying spells
>        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>        Q2/07   SJ 7(2):249--265                                 (no commands)
>        shows how to handle spells with complete control over
>        spell specification
>
> tsspell from http://fmwww.bc.edu/RePEc/bocode/t
>    'TSSPELL': module for identification of spells or runs in time series /
>    tsspell examines the data, which must be tsset time series, to / identify
>    spells or runs, which are contiguous sequences defined / by some
>    condition. tsspell generates new variables indicating / distinct spells,
>
> Nick
>
> On Wed, Nov 30, 2011 at 9:24 AM, massimiliano stacchini
> <mastacchini@yahoo.it> wrote:
>
>> I have a huge dataset. The rows identify the person ID (i) (i=1,...,n), while the columns are the reference dates TIME(t) (t=1,...,w). Each cell contains the value 1 or 0.
>>
>> I need to create a variable (LENGTH) varying over both ID and TIME.
>> For each i of ID(i) and each t of TIME(t), LENGTH should capture the number of continuous (uninterrupted) values equal to 1 in the interval of cells starting from the reference date t of TIME and moving backwards through the previous reference dates.
>> In other words, for each (i) of ID and each (t) of TIME, LENGTH should capture the number of s in T (t-s) identifying cells having values equal to 1 (i.e., the size of the sequence of uninterrupted 1s moving backwards through the previous reference dates).
>>
