
# Re: st: RE: Drop Duplicates while simultaneously eliminating opposite positive and negative values

| From | Nick Cox |
|---|---|
| To | statalist@hsphsun2.harvard.edu |
| Subject | Re: st: RE: Drop Duplicates while simultaneously eliminating opposite positive and negative values |
| Date | Sun, 11 Nov 2012 10:24:43 +0000 |

```
I didn't try to understand all of what you want, but this might help.

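Within each block of observations sharing the same make and mpg,
positive and negative prices sort to opposite ends, so you can flag
and count them:

* pos = 1 where price is positive and equals the largest price in its
* make-mpg block (assuming no missing prices); npos counts those
bysort make mpg (price) : gen pos = price == price[_N] & price > 0
egen npos = total(pos), by(make mpg)

* neg = 1 where price is negative and equals the smallest price in its
* make-mpg block; nneg counts those
bysort make mpg (price) : gen neg = price == price[1] & price < 0
egen nneg = total(neg), by(make mpg)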

Nick

On Sun, Nov 11, 2012 at 9:38 AM, Beatrice Benavidez
<beatricestata1711@gmail.com> wrote:

> I have this interesting problem where I would have the following dataset -
>
> make          price    mpg
> VW Diesel      5397     41
> BMW 320i       9735     25
> Datsun 510     5079     24
> Audi 5000      9690     17
> BMW 320i      -9735     25
> BMW 320i       9375     25
> BMW 320i       9375     25
> BMW 320i       9735     25
> BMW 320i       9735     25
> VW Diesel     -5397     41
> BMW 320i       9735     25
>
> The dataset has opposite positive and negative price values for the
> common make and mpg (such as VW Diesel Price=5397 mpg=41 & VW Diesel
> Price=-5397 mpg=41) while at the same time there are duplicates for
> all make, price and mpg (BMW 320i Price=9375 mpg=25 appearing twice).
>
> The opposite positive and negative price values for the common make
> and mpg can also happen within duplicates based on all make, price and
> mpg (BMW 320i Price=9735 mpg=25 appearing 4 times & BMW 320i
> Price=-9735 mpg=25 appearing once).
>
> I know how to proceed with the identification and flagging of
> duplicate observations based on
> http://www.stata.com/support/faqs/data-management/duplicate-observations/
>
> I would like to be able to make a flag variable for both the opposite
> positive and negative price values for the common make and mpg, while
> only keeping one observation if there are duplicates for all make,
> price and mpg.
>
> At the same time, if there are 2 duplicated positive price values when
> there is one opposite negative price value for the common make and
> mpg, I would like to flag one positive price value observation and the
> opposite negative price value counterpart. Vice versa would apply if
> there are 2 duplicated negative price values and one opposite positive
> price value, I would want to flag one negative price value observation
> and the opposite positive price value observation.
>
> Expanding on this in the general case, if there are more duplicated
> positive price values than there are opposite negative price values
> for the common make and mpg (duplicated or not), I would like to flag
> all but one of the positive price value observations and (all) opposite
> negative price value observation(s) for the common make and mpg. Vice
> versa would apply if there are more duplicated negative price values
> than there are opposite positive price values for the common make and
> mpg.
>
> I would like to flag all but ONE of either positive or negative price
> value observations if the bigger number of duplicated sign groups are
> the positive or negative price values respectively.
>
> How should I proceed if I want to execute a flagging procedure for all
> these three different situations simultaneously without missing
> anything out?
>
> Any help will be appreciated!
>
> Thanks a lot!
>
>
> Beatrice
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
```
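The flag-and-count idea in the reply can be tried on the example data from the question. What follows is a minimal sketch, not the method from the thread: it groups on the absolute price so that a price and its opposite fall in the same block, which is an assumption about what "opposite" should mean here. The variable names `absprice`, `hasopp` and `dup` are made up for illustration, and the sketch only produces counts and flags. It does not implement every keep-or-flag rule described in the question.

```stata
* Example data typed in from the question
clear
input str12 make double price mpg
"VW Diesel"   5397 41
"BMW 320i"    9735 25
"Datsun 510"  5079 24
"Audi 5000"   9690 17
"BMW 320i"   -9735 25
"BMW 320i"    9375 25
"BMW 320i"    9375 25
"BMW 320i"    9735 25
"BMW 320i"    9735 25
"VW Diesel"  -5397 41
"BMW 320i"    9735 25
end

* Absolute price, so that +p and -p share one group (hypothetical name)
gen double absprice = abs(price)

* Number of positive and of negative prices per (make, mpg, |price|);
* "price < ." keeps missing values out of the positive count
egen npos = total(price > 0 & price < .), by(make mpg absprice)
egen nneg = total(price < 0), by(make mpg absprice)

* 1 if the group holds at least one positive and one negative price
gen byte hasopp = npos > 0 & nneg > 0

* 1 for every exact duplicate after the first within (make, mpg, price)
bysort make mpg price : gen byte dup = _n > 1

* Inspect the result, one block per (make, mpg, |price|)
sort make mpg absprice price
list make price mpg npos nneg hasopp dup, sepby(make mpg absprice)
```

Comparing `npos` with `nneg` within a group shows which sign is in surplus. That comparison might serve as a starting point for deciding how many observations of each sign to flag.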