# Thank you Nick! (re: st: preserving missing values in collapse (sum))

From: Melonie Sullivan <[email protected]>
To: [email protected]
Subject: Thank you Nick! (re: st: preserving missing values in collapse (sum))
Date: Thu, 25 Oct 2007 08:44:54 -0700 (PDT)

```Thank you so much for your assistance (and your
patience). With -fillin-, setting duration=0 when
_fillin=1, and then -reshape-, I got exactly what I
needed. Whew!

Melonie

--- n j cox <[email protected]> wrote:

> This response y has not so far appeared in the
> drama.
> Where does it come from? The same dataset?
>
> Either way, I think you might make progress by
> checking out -reshape-.
>
> Melonie Sullivan
>
> Okay, so far so good, thanks. But now how do I get
> that information into a matrix of this form - one line
> for each youth:
>
> youthid  y  x1  x2  x3  x4  x5  x6
> 11       0  15  41   0   0   .   0
> 12       1   0  13   0  42   0  55
>
> where y=dependent variable, x1=duration if group=1,
> x2=duration if group=2, etc. If I take your solution,
> then generate x1, x2...., and do a -list- I still get
> a 6x6 matrix of x for each youth that looks like this:
>
> youthid x1  x2  x3  x4  x5  x6
> 11      15   .   .   .   .   .
> 11      .   41   .   .   .   .
> 11      .    .   .   .   .   .
> 11      .    .   .   .   .   .
> 11      .    .   .   .   .   .
> 11      .    .   .   .   .   .
>
> < intermediate posts>
>
>  > >> I have data on history of placements into different
>  > >> -group-s by -youthid-: there are multiple placement
>  > >> records for each youth. I need to create a variable
>  > >> equal to the sum of -duration- of all placements into
>  > >> each -group- for each youth. -collapse (sum)- seems to be
>  > >> the appropriate procedure, but it treats missing
>  > >> values as zeroes. This causes a problem if a youth has
>  > >> only one placement in a given group with unknown
>  > >> duration. Example:
>  > >>
>  > >>
>  > >> youthid         group        duration
>  > >> 11                 1            15
>  > >> 11                 1             .
>  > >> 11                 2            31
>  > >> 11                 2            10
>  > >> 11                 5             .
>  > >> 12                 2             5
>  > >> 12                 2             8
>  > >> 12                 4            42
>  > >> 12                 6            55
>  > >>
>  > >> I create a duration variable for each group (-generate
>  > >> grp1dur = duration if group==1-, etc.) and -collapse
>  > >> (sum)- by -youthid- and I want to get this:
>  > >>
>  > >> youthid   11    12
>  > >> grp1dur   15     0
>  > >> grp2dur   41    13
>  > >> grp3dur    0     0
>  > >> grp4dur    0    42
>  > >> grp5dur    .     0
>  > >> grp6dur    0    55
>  > >>
>  > >> But collapse gives me a zero on grp5dur for youth #11,
>  > >> though youth #11 had placement in that group, albeit
>  > >> of an unknown duration. The other zeroes are correct;
>  > >> the youth had zero days in that placement group.
>  > >>
>  > >> The problem has been addressed here before, best in
>  > >> the following post by Nick Cox:
>  > >>
>  > >> http://www.stata.com/statalist/archive/2004-07/msg00783.html
>  > >>
>  > >> However, this is not solving my particular problem,
>  > >> because my data essentially looks like a big stack of
>  > >> Nick's "toy datasets" -- one for each of 1800 youth in
>  > >> my data. So collapsing by (youthid) gives the same
>  > >> value of Nick's allmissing for each youth, since the
>  > >> allmissing tags missing durations for groups within
>  > >> youths.
>
