# st: RE: Spell-data Problem

 From: "Nick Cox"; Subject: st: RE: Spell-data Problem; Date: Fri, 29 Aug 2008 12:51:01 +0100

```
Joerg has gaps between his episodes.

It is tempting to suggest that he needs to -reshape long-, fill in the
gaps, and -reshape- back again. But it can all be done in place.

Here is one way to do it. It's a little fiddly and leans very heavily on
-by:-.

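* remember the original number of observations
local N = _N
* flag (with -1) each spell that does not start in the month
* after the previous spell of the same person ended
bysort persnum (xbrr) : gen byte toexpand = ///
- ((xerr[_n-1] != (xbrr - 1)) & xerr[_n-1] < .)
* duplicate the flagged spells; the copies go to the end of the data
expand 2 if toexpand
* unflag the copies, so that each flagged original sorts before its copy
replace toexpand = 0 if _n > `N'
sort persnum xbrr toexpand
list
* turn each flagged observation into the spell that fills the gap
by persnum : replace xbrr = xerr[_n-1] + 1 if toexpand
by persnum : replace xerr = xbrr[_n+1] - 1 if toexpand
replace epiid = 9 if toexpand
list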

Here's how that works with Joerg's example data.

. local N = _N

. bysort persnum (xbrr) : gen byte toexpand = ///
> - ((xerr[_n-1] != (xbrr - 1)) & xerr[_n-1] < .)

. expand 2 if toexpand
(2 observations created)

. replace toexpand = 0 if _n > `N'

. sort persnum xbrr toexpand

. list

     +-----------------------------------------------------+
     | persnum   xt   epiid      xbrr      xerr   toexpand |
     |-----------------------------------------------------|
  1. |    1001    3       1    2004m9   2005m12          0 |
  2. |    1002    1       1    2004m7    2004m9          0 |
  3. |    1002    2       2   2004m12    2005m7         -1 |
  4. |    1002    2       2   2004m12    2005m7          0 |
  5. |    1002    2       3    2005m8   2005m12          0 |
     |-----------------------------------------------------|
  6. |    1005    3       1    2004m6    2004m8          0 |
  7. |    1005    2       2    2004m9   2005m12          0 |
  8. |    1007    1       1    2004m7    2004m7          0 |
  9. |    1007    3       2    2004m8   2004m10          0 |
 10. |    1007    1       3   2004m11   2004m12          0 |
     |-----------------------------------------------------|
 11. |    1007    3       4    2005m1    2005m8          0 |
 12. |    1007    2       5   2005m10   2005m12         -1 |
 13. |    1007    2       5   2005m10   2005m12          0 |
 14. |    1008    1       1    2004m9   2005m12          0 |
 15. |    1010    1       1    2004m7    2004m8          0 |
     |-----------------------------------------------------|
 16. |    1010    2       2    2004m9   2005m12          0 |
     +-----------------------------------------------------+

. by persnum : replace xbrr = xerr[_n-1] + 1 if toexpand

. by persnum : replace xerr = xbrr[_n+1] - 1 if toexpand

. replace epiid = 9 if toexpand

. list

     +-----------------------------------------------------+
     | persnum   xt   epiid      xbrr      xerr   toexpand |
     |-----------------------------------------------------|
  1. |    1001    3       1    2004m9   2005m12          0 |
  2. |    1002    1       1    2004m7    2004m9          0 |
  3. |    1002    2       9   2004m10   2004m11         -1 |
  4. |    1002    2       2   2004m12    2005m7          0 |
  5. |    1002    2       3    2005m8   2005m12          0 |
     |-----------------------------------------------------|
  6. |    1005    3       1    2004m6    2004m8          0 |
  7. |    1005    2       2    2004m9   2005m12          0 |
  8. |    1007    1       1    2004m7    2004m7          0 |
  9. |    1007    3       2    2004m8   2004m10          0 |
 10. |    1007    1       3   2004m11   2004m12          0 |
     |-----------------------------------------------------|
 11. |    1007    3       4    2005m1    2005m8          0 |
 12. |    1007    2       9    2005m9    2005m9         -1 |
 13. |    1007    2       5   2005m10   2005m12          0 |
 14. |    1008    1       1    2004m9   2005m12          0 |
 15. |    1010    1       1    2004m7    2004m8          0 |
     |-----------------------------------------------------|
 16. |    1010    2       2    2004m9   2005m12          0 |
     +-----------------------------------------------------+

The -toexpand- variable is dispensable once it has served its purpose.
I've left it in to underline that I assigned it values of -1 (not 1 as
might seem customary) for observations to change. That is to ensure the
right sort order.

Nick
n.j.cox@durham.ac.uk

P.S. the logic of dealing with spells is dealt with in detail in

SJ-7-2  dm0029  . . . . . . . . . . . .  Speaking Stata: Identifying spells
        . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q2/07   SJ 7(2):249--265                              (no commands)
        shows how to handle spells with complete control over
        spell specification

P.P.S. Longstanding readers of this list will notice how I have resisted
allusions to Harry Potter, Hogwarts or Dumbledore.

P.P.P.S. Except in the previous footnote.

Joerg Eulenberger

I have a spell dataset like the one in the table below. My problem is
that there are gaps between the episodes.
For instance, between the first and the second spell of person 1002,
two months are missing (2004m10-2004m11).
I want to create a separate spell for the missing time, like the other
spells, with a new status 9 and a chronologically correct epiid.
For instance, after inserting the new missing spell, the missing spell
should have the epiid number 2 and the following spells 3 and 4.
persnum is the ID of the person. xt is the state, such as in a job or
unemployed. xbrr is the beginning of the spell, xerr is the end of the
spell, and epiid is the ID of the spell within persnum.

     | persnum   xt      xbrr      xerr   epiid |
     |------------------------------------------|
  1. |    1001    3    2004m9   2005m12       1 |
  2. |    1002    1    2004m7    2004m9       1 |
  3. |    1002    2   2004m12    2005m7       2 |
  4. |    1002    2    2005m8   2005m12       3 |
  5. |    1005    3    2004m6    2004m8       1 |
  6. |    1005    2    2004m9   2005m12       2 |
  7. |    1007    1    2004m7    2004m7       1 |
  8. |    1007    3    2004m8   2004m10       2 |
  9. |    1007    1   2004m11   2004m12       3 |
 10. |    1007    3    2005m1    2005m8       4 |
 11. |    1007    2   2005m10   2005m12       5 |
 12. |    1008    1    2004m9   2005m12       1 |
 13. |    1010    1    2004m7    2004m8       1 |
 14. |    1010    2    2004m9   2005m12       2 |

```
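The steps in the reply can be collected into a single do-file. What follows is a minimal sketch, not part of the original thread. It re-enters Joerg's example data (the input columns `y0`, `m0`, `y1`, `m1` are names made up here for the start and end year and month) and applies the same in-place approach. It then departs from the reply in one respect, on the assumption that the question wants the gap spell to carry status 9 in `xt` and `epiid` to be renumbered within each person: it sets `xt` rather than `epiid` to 9 and renumbers the spells. The closing `assert` only checks that every spell now starts in the month after the previous one ended.

```stata
clear
* Joerg's example data; y0 m0 y1 m1 are hypothetical helper columns
input persnum xt y0 m0 y1 m1 epiid
1001 3 2004  9 2005 12 1
1002 1 2004  7 2004  9 1
1002 2 2004 12 2005  7 2
1002 2 2005  8 2005 12 3
1005 3 2004  6 2004  8 1
1005 2 2004  9 2005 12 2
1007 1 2004  7 2004  7 1
1007 3 2004  8 2004 10 2
1007 1 2004 11 2004 12 3
1007 3 2005  1 2005  8 4
1007 2 2005 10 2005 12 5
1008 1 2004  9 2005 12 1
1010 1 2004  7 2004  8 1
1010 2 2004  9 2005 12 2
end

* build monthly dates for spell start (xbrr) and spell end (xerr)
gen xbrr = ym(y0, m0)
gen xerr = ym(y1, m1)
format xbrr xerr %tm
drop y0 m0 y1 m1

* flag with -1 each spell that follows a gap, then duplicate it
local N = _N
bysort persnum (xbrr) : gen byte toexpand = ///
    -((xerr[_n-1] != (xbrr - 1)) & xerr[_n-1] < .)
expand 2 if toexpand

* the copies sit at the end of the data; unflag them so that
* each flagged original sorts before its copy
replace toexpand = 0 if _n > `N'
sort persnum xbrr toexpand

* turn each flagged observation into the spell that fills the gap
by persnum : replace xbrr = xerr[_n-1] + 1 if toexpand
by persnum : replace xerr = xbrr[_n+1] - 1 if toexpand

* assumption: status 9 goes into xt, and epiid is renumbered in time order
replace xt = 9 if toexpand
bysort persnum (xbrr) : replace epiid = _n
drop toexpand

* check: within each person, spells are now contiguous
bysort persnum (xbrr) : assert xbrr == xerr[_n-1] + 1 if _n > 1
list, sepby(persnum)
```

To reproduce the reply exactly instead, drop the two lines under the "assumption" comment and use `replace epiid = 9 if toexpand` in their place.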