
st: RE: Re: Collapse & Missing Values


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Re: Collapse & Missing Values
Date   Wed, 28 Sep 2005 22:38:22 +0100

I think this needs a tweak: 

bysort date (sum): keep if _n == 1 

will ensure that the first value of
-sum- in each group after sorting is missing 
if and only if -sum- is missing 
on all values in each group. With the 
code as it stands you could lose sums 
you want to keep. 
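
For reference, a minimal sketch of the whole sequence with this tweak applied, using the six observations from Eric's example quoted below. The -input- step and the string variable -sdate- are only there to make the example self-contained and are assumptions, as is the use of -egen, total()-, the current name for the -sum()- function used in the quoted code. Missing values sort after all nonmissing values, which is why sorting on -sum- within -date- puts any nonmissing sum first.

```stata
clear
* Rebuild Eric's example data (sdate is a hypothetical helper variable)
input str11 sdate amount
"10-Oct-1990" 200
"10-Oct-1990" -75
"10-Oct-1990" 64
"11-Oct-1990" .
"12-Oct-1990" 107
"12-Oct-1990" .
end
gen date = date(sdate, "DMY")
format date %td
drop sdate

* Group totals; an all-missing group gets 0 at this point
bysort date: egen sum = total(amount)

* Blank the total wherever the underlying amount is missing
replace sum = . if amount == .

* Sort on sum within date so nonmissing totals come first,
* then keep one observation per date
bysort date (sum): keep if _n == 1

drop amount
rename sum amount
list, noobs
```

With these data the result should be 189 for 10-Oct-1990, missing for 11-Oct-1990, and 107 for 12-Oct-1990. Without the sort on -sum-, which observation comes first within 12-Oct-1990 is not guaranteed, so the 107 could be dropped in favour of the missing value.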

Nick 
n.j.cox@durham.ac.uk 

Friedrich Huebler
 
> Eric,
> 
> Here is one way to preserve the missing value.
> 
> . bysort date: egen sum = sum(amount)
> . replace sum = . if amount==.
> (2 real changes made, 2 to missing)
> . bysort date: keep if _n==1
> (3 observations deleted)
> . drop amount
> . rename sum amount
> . clist, noobs
> 
>        date     amount
> 10-Oct-1990        189
> 11-Oct-1990          .
> 12-Oct-1990        107
> 
> Friedrich Huebler
> 
> --- "Eric G. Wruck" <ewruck@econalytics.com> wrote:
> > I just learned, rather inconveniently, that collapse doesn't work the
> > way I'd like when encountering missing values.  Here's an example:
> > . l
> > 
> >       +----------------------+
> >       |        date   amount |
> >       |----------------------|
> >    1. | 10-Oct-1990      200 |
> >    2. | 10-Oct-1990      -75 |
> >    3. | 10-Oct-1990       64 |
> >    4. | 11-Oct-1990        . |
> >    5. | 12-Oct-1990      107 |
> >       |----------------------|
> >    6. | 12-Oct-1990        . |
> >       +----------------------+
> > 
> > . collapse (sum) net_amt=amount, by(date)
> > 
> > . l
> > 
> >       +-----------------------+
> >       |        date   net_amt |
> >       |-----------------------|
> >    1. | 10-Oct-1990       189 |
> >    2. | 11-Oct-1990         0 |
> >    3. | 12-Oct-1990       107 |
> >       +-----------------------+
> > 
> > .
> > The problem is for the single 11-Oct-1990 observation.  After 
> > collapsing, the missing value becomes a zero; in this instance I 
> > would have preferred it remain missing.  The 12-Oct-1990 treatment is
> > fine & what I expected.  I suppose I could delete observations before
> > performing the collapse but it would be better if there was some 
> > other option.  Is there?



