st: RE: Spell-data Problem

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: Spell-data Problem
Date	Fri, 29 Aug 2008 12:51:01 +0100
Joerg has gaps between his episodes.  

It is tempting to suggest that he needs to -reshape long-, fill in the
gaps, and -reshape- back again. But it can all be done in place. 

Here is one way to do it. It's a little fiddly and leans very heavily on
-by:-. 

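* remember the original number of observations
local N = _N 
* flag (with -1) each spell that does not start the month after the previous one ended
bysort persnum (xbrr) : gen byte toexpand = ///
- ((xerr[_n-1] != (xbrr - 1)) & xerr[_n-1] < .) 
* duplicate the flagged spells; the copies are appended after the original observations
expand 2 if toexpand 
replace toexpand = 0 if _n > `N' 
* -1 sorts before 0, so each flagged observation precedes its copy
sort persnum xbrr toexpand  
list 
* turn each flagged observation into the spell that fills the gap
by persnum : replace xbrr = xerr[_n-1] + 1 if toexpand
by persnum : replace xerr = xbrr[_n+1] - 1 if toexpand
replace epiid = 9 if toexpand 
list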

Here's how that works with Joerg's example data. 

. local N = _N 

. bysort persnum (xbrr) : gen byte toexpand = ///
> - ((xerr[_n-1] != (xbrr - 1)) & xerr[_n-1] < .) 

. expand 2 if toexpand 
(2 observations created)

. replace toexpand = 0 if _n > `N' 
(2 real changes made)

. sort persnum xbrr toexpand  

. list 

     +-----------------------------------------------------+
     | persnum   xt   epiid      xbrr      xerr   toexpand |
     |-----------------------------------------------------|
  1. |    1001    3       1    2004m9   2005m12          0 |
  2. |    1002    1       1    2004m7    2004m9          0 |
  3. |    1002    2       2   2004m12    2005m7         -1 |
  4. |    1002    2       2   2004m12    2005m7          0 |
  5. |    1002    2       3    2005m8   2005m12          0 |
     |-----------------------------------------------------|
  6. |    1005    3       1    2004m6    2004m8          0 |
  7. |    1005    2       2    2004m9   2005m12          0 |
  8. |    1007    1       1    2004m7    2004m7          0 |
  9. |    1007    3       2    2004m8   2004m10          0 |
 10. |    1007    1       3   2004m11   2004m12          0 |
     |-----------------------------------------------------|
 11. |    1007    3       4    2005m1    2005m8          0 |
 12. |    1007    2       5   2005m10   2005m12         -1 |
 13. |    1007    2       5   2005m10   2005m12          0 |
 14. |    1008    1       1    2004m9   2005m12          0 |
 15. |    1010    1       1    2004m7    2004m8          0 |
     |-----------------------------------------------------|
 16. |    1010    2       2    2004m9   2005m12          0 |
     +-----------------------------------------------------+

. by persnum : replace xbrr = xerr[_n-1] + 1 if toexpand
(2 real changes made)

. by persnum : replace xerr = xbrr[_n+1] - 1 if toexpand
(2 real changes made)

. replace epiid = 9 if toexpand 
(2 real changes made)

. list 

     +-----------------------------------------------------+
     | persnum   xt   epiid      xbrr      xerr   toexpand |
     |-----------------------------------------------------|
  1. |    1001    3       1    2004m9   2005m12          0 |
  2. |    1002    1       1    2004m7    2004m9          0 |
  3. |    1002    2       9   2004m10   2004m11         -1 |
  4. |    1002    2       2   2004m12    2005m7          0 |
  5. |    1002    2       3    2005m8   2005m12          0 |
     |-----------------------------------------------------|
  6. |    1005    3       1    2004m6    2004m8          0 |
  7. |    1005    2       2    2004m9   2005m12          0 |
  8. |    1007    1       1    2004m7    2004m7          0 |
  9. |    1007    3       2    2004m8   2004m10          0 |
 10. |    1007    1       3   2004m11   2004m12          0 |
     |-----------------------------------------------------|
 11. |    1007    3       4    2005m1    2005m8          0 |
 12. |    1007    2       9    2005m9    2005m9         -1 |
 13. |    1007    2       5   2005m10   2005m12          0 |
 14. |    1008    1       1    2004m9   2005m12          0 |
 15. |    1010    1       1    2004m7    2004m8          0 |
     |-----------------------------------------------------|
 16. |    1010    2       2    2004m9   2005m12          0 |
     +-----------------------------------------------------+

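The -toexpand- variable is dispensable once it has served its purpose.
I've left it in to underline that I assigned it values of -1 (not 1 as
might seem customary) for observations to change. That is to ensure the
right sort order: -1 sorts before 0, so within each pair of duplicates
the observation that becomes the gap spell precedes the spell it was
copied from. 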
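
The original question appears to ask for the new spells to carry status 9
in -xt- and for -epiid- to be renumbered in time order, whereas the code
above puts 9 in -epiid-. What follows is a minimal sketch of that variant,
not part of the tested session above. It assumes the data are exactly as
in the final listing, with toexpand == -1 marking the inserted gap spells.

```stata
* Sketch only: assumes the final listing above, toexpand == -1 for gap spells
replace xt = 9 if toexpand

* renumber spells within person in time order
bysort persnum (xbrr) : replace epiid = _n

* sanity check: every spell should now start the month after the previous one ended
by persnum (xbrr) : assert xbrr == xerr[_n-1] + 1 if _n > 1

drop toexpand
```

If the -assert- fails, some gap has been left unfilled or spells overlap,
and the data should be inspected before going further.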

Nick 
[email protected] 

P.S. The logic of handling spells is discussed in detail in 

SJ-7-2  dm0029  . . . . . . . . . . . .  Speaking Stata: Identifying spells
        . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q2/07   SJ 7(2):249--265                             (no commands)
        shows how to handle spells with complete control over
        spell specification

P.P.S. Longstanding readers of this list will notice how I have resisted
allusions to Harry Potter, Hogwarts or Dumbledore. 

P.P.P.S. Except in the previous footnote. 

Joerg Eulenberger

I have a spell dataset like the one in the table below. My problem is
that there are gaps between the episodes.
For instance, between the first and the second spell of person 1002, two
months are missing (2004m10-2004m11).
I want to create a separate spell for the missing time, like the other 
spells, with a new status 9 and a chronologically correct epiid.
For instance, after the new missing spell is inserted, it should have
epiid 2 and the following spells should have epiid 3 and 4.
persnum is the person ID. xt is the state, such as in a job or unemployed. 
xbrr is the start of the spell, xerr is the end of the spell, and epiid is 
the spell ID within persnum.

     | persnum   xt      xbrr      xerr   epiid |
     |------------------------------------------|
  1. |    1001    3    2004m9   2005m12       1 |
  2. |    1002    1    2004m7    2004m9       1 |
  3. |    1002    2   2004m12    2005m7       2 |
  4. |    1002    2    2005m8   2005m12       3 |
  5. |    1005    3    2004m6    2004m8       1 |
  6. |    1005    2    2004m9   2005m12       2 |
  7. |    1007    1    2004m7    2004m7       1 |
  8. |    1007    3    2004m8   2004m10       2 |
  9. |    1007    1   2004m11   2004m12       3 |
 10. |    1007    3    2005m1    2005m8       4 |
 11. |    1007    2   2005m10   2005m12       5 |
 12. |    1008    1    2004m9   2005m12       1 |
 13. |    1010    1    2004m7    2004m8       1 |
 14. |    1010    2    2004m9   2005m12       2 |

