Statalist The Stata Listserver



st: Re: RE: Re: eliminate negative values and their positive counterpart


From   "Sergiy Radyakin" <Radyakin@aoek.uni-hannover.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Re: RE: Re: eliminate negative values and their positive counterpart
Date   Tue, 6 Mar 2007 22:53:10 +0100

Sorry, my mistake: that line should simply be omitted. Both npos and nneg are defined later on.
I guess the solution proposed by Mr Blasnik will exclude observations if the sum
over a group is zero, but should that be a necessary condition? Consider:

30
-30
30

(with all other variables being the same) -- e.g. bought apples for 30, then returned them to the cashier, then decided to buy them again :)
This should simplify to 30, but the sum over the group is not 0.

It is a pity that I can't do what I wrote, i.e. -count if- after -egen-.
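
A minimal sketch of one legal way to get the conditional count that -count if- was aiming at, assuming the variables t and charge from the example in this thread. -egen-'s total() function accepts an expression, so summing a true/false condition counts the observations where it holds. The three rows are the apples example above; ncancel is an illustrative name, not part of the original code.

```stata
clear
input t charge
1  30
1 -30
1  30
end

gen abs_charge = abs(charge)

* total() sums the expression; a true condition is 1, a false one is 0.
* charge < . keeps missing charges out of the positive count.
egen npos = total(charge > 0 & charge < .), by(t abs_charge)
egen nneg = total(charge < 0), by(t abs_charge)

* Number of positive/negative pairs that cancel within the group
gen ncancel = min(npos, nneg)
list
```

Here the group sum is 30 rather than 0, yet one pair still cancels, which is why a zero group sum is too strict a condition.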

Sincerely yours,
Sergiy Radyakin



----- Original Message ----- From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Sent: Tuesday, March 06, 2007 10:20 PM
Subject: st: RE: Re: eliminate negative values and their positive counterpart



I stopped at this line. There is a typo there
somewhere, as the code is illegal:

by t abs_charge:egen npos=count if charge>0

Nick
n.j.cox@durham.ac.uk

Sergiy Radyakin

You will get a dozen ways to do this more efficiently within
a few minutes,
but here is a straightforward way:

Assume your data is:
     +------------+
     | t   charge |
     |------------|
  1. | 1       10 |
  2. | 2       20 |
  3. | 2       20 |
  4. | 3       30 |
  5. | 3       20 |
     |------------|
  6. | 3      -30 |
  7. | 3      -30 |
  8. | 4       10 |
  9. | 4       20 |
 10. | 4       30 |
     |------------|
 11. | 4       40 |
 12. | 5       10 |
 13. | 5       20 |
 14. | 5      -10 |
 15. | 5      -10 |
     |------------|
 16. | 5       10 |
 17. | 5       10 |
 18. | 5       20 |
 19. | 5       10 |
 20. | 5       10 |
     |------------|
 21. | 5       10 |
 22. | 5      -10 |
 23. | 5       20 |
 24. | 5      -20 |
     +------------+


Then



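* Absolute value: a charge and its reversal share the same abs_charge
gen abs_charge=abs(charge)

* (The illegal line "egen npos=count if charge>0" is omitted, per the correction above.)
* Flag positive and negative charges; a missing charge counts as neither
gen posit=1 if charge>0 & charge<.
gen negat=1 if charge<0

bysort t abs_charge:egen npos=count(posit)
bysort t abs_charge:egen nneg=count(negat)

* Number of positive/negative pairs that cancel in each group
gen ndel=min(npos,nneg)

* Within each group sort by charge (negatives first, positives last),
* then drop ndel observations from each end
bysort t abs_charge (charge):drop if _n<=ndel | _n>_N-ndel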


Or something similar? Change t as appropriate to define a group (for the data quoted below, that would be personid proc date).
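
An alternative sketch of the same pairing idea, offered as an assumption-laden illustration rather than as anyone's posted solution: number the positive and the negative charges separately within each group, so that the k-th positive is matched with the k-th negative, and drop both whenever a match exists. The names sgn, k and paired are illustrative; the six rows are taken from the example listing above.

```stata
clear
input t charge
3  30
3  20
3 -30
3 -30
5  20
5 -20
end

gen abs_charge = abs(charge)
gen byte sgn = sign(charge)          // -1, 0 or 1

* Number the observations 1, 2, ... separately for each sign within a group
bysort t abs_charge sgn: gen long k = _n

* A (t, abs_charge, k) cell holding two observations is one positive charge
* and its negative counterpart; zero charges never form such a cell
bysort t abs_charge k: gen byte paired = (_N == 2)

drop if paired
list t charge
```

In this small example the 30/-30 pair at t = 3 and the 20/-20 pair at t = 5 are dropped, leaving the unmatched 20 and -30 at t = 3. No group sum is involved, so 30, -30, 30 reduces to a single 30.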
sara borelli

> This is an extract of my data:
>
> personid   charge   proc    date
> 1000124    +13      80048   6/6/2001
> 1000124    +13      80076   6/6/2001
> ...
> 1000124    +13      80048   6/7/2001
> 1000124    +13      80076   6/7/2001
> ...
> 1000124    -13      80048   6/7/2001
> 1000124    -13      80076   6/7/2001
> ...
> 1000124    +13      80048   6/8/2001
> ...
> 1000124    +13      80048   6/9/2001
> ...
> 1000124    +13      81001   6/11/2001
> ...
> 1000124    +13      80048   6/11/2001
> 1000124    +13      80048   6/12/2001
>
> where the dots indicate that other values for the same
> personid are in between. I need to eliminate the
> negative charges AND their positive counterpart with
> the same proc and date. Thus, for example I need to
> eliminate the negative
> -13 80048    6/7/2001
> AND its positive counterpart with the same proc and
> date:
> +13   80048    6/7/2001,
> and so on.
> I should find a way to construct an algorithm that
> identifies and eliminates the negatives AND their
> positive counterpart with the same date and procedure,
> but I cannot figure that out.
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*