

From: Beatrice Benavidez <beatricestata1711@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Drop Duplicates while simultaneously eliminating opposite positive and negative values
Date: Mon, 26 Nov 2012 14:16:36 +0400

I think I found it.

*** Number positive and negative duplicate values of make, mpg, price:
bys make mpg price: gen dupid = _n

*** Mark pairs of positive and negative price opposite/counterpart duplicate values by using absolute values of price:
gen absprice = abs(price)
egen dupid_g = group(dupid)
su dupid_g, meanonly

* Flag these pos & neg absolute price duplicates
* g states which pos/neg counterpart it is: the first, 2nd, 3rd, and so on
gen flag_posneg = .
forvalues g = 1/`r(max)' {
    duplicates tag make mpg absprice if dupid_g==`g', gen(posneg_`g')
    replace flag_posneg = 1 if posneg_`g'==1
}
drop posneg_* dupid_g dupid absprice

*** Mark positive price duplicates or negative price duplicates stripped of pos/neg price opposite/counterpart duplicate values
* Prices stripped of positive/negative price duplicates
gen price_dup = price if flag_posneg!=1
* Duplicates stripped of positive/negative price duplicates
bys make mpg price_dup: gen dupid = _n if (flag_posneg==.)
gen flag_dup = 1 if (dupid>1 & dupid!=.)
drop price_dup dupid

* Putting flag_posneg & flag_dup together
gen flag_posneg_dup = 1 if ( flag_posneg == 1 | flag_dup == 1 )

Beatrice

You could try something like this:

* Number positive and negative duplicate values of price independently:
bys make mpg price: gen dupid = _n

* Mark pairs by absolute value of price
gen absprice = abs(price)
duplicates tag make mpg absprice dupid, gen(dup_pair)

* Look for unpaired duplicates
duplicates tag make mpg price if dup_pair==0, gen(dup_nonpair)

I'm not sure which of these you want to keep/drop, but I think this would identify your three different groups:

1. unique: dupid==1 --or-- dup_pair==0 & dup_nonpair==0
2. pos/neg paired: dup_pair==1
3. additional pos or neg unpaired duplicates: dup_nonpair==1

Mike

On Sun, Nov 11, 2012 at 6:42 AM, daniel klein <klein.daniel.81@gmail.com> wrote:
>
> First off, I am sorry for reposting, but the last message got corrupted
> in the archive (broke into two pieces and omitting the middle part).
> Here is the second (and final) try:
>
> Beatrice,
>
> this is kind of confusing. You say, you want to
>
> "[...] keep[ing] one observation if there are duplicates for all make,
> price and mpg."
>
> You then go on, specifying rules for cases in which
>
> "there are 2 duplicated positive price values when there is one
> opposite negative price value for the common make and mpg"
>
> But this is impossible. Given the first step, which eliminates all
> but one positive (or negative) price value in the subgroup defined by
> make and mpg, there can no longer be any cases that have 2 (or more)
> duplicated positive (or negative) price values in terms of make and
> mpg.
>
> From your description it further seems to be arbitrary which
> observations with positive or negative price values to flag. But in
> this case, why worry about positive and negative price values at all,
> when the only difference in these observations seems to be the
> multiplier (-1)?
>
> It is not that I mind playing around with Stata -- on the contrary.
> But it might help us help you, if you could comment on these
> statements, elaborate a little bit on the sequence of steps you want to
> take here, and maybe be more specific about your ultimate goal. An
> example dataset containing all the possibilities you have in mind
> would also be nice (only if your first example lacks any possible
> situation you want to tackle).
>
> Best
> Daniel
>
> --
> Dear All,
>
> [...]
> I would like to be able to make a flag variable for both the opposite
> positive and negative price values for the common make and mpg, while
> only keeping one observation if there are duplicates for all make,
> price and mpg.
>
> At the same time, if there are 2 duplicated positive price values when
> there is one opposite negative price value for the common make and
> mpg, I would like to flag one positive price value observation and the
> opposite negative price value counterpart. Vice versa would apply if
> there are 2 duplicated negative price values and one opposite positive
> price value, I would want to flag one negative price value observation
> and the opposite positive price value observation.
>
> Expanding on this in the general case,
> [...]
>
> Thanks a lot!
>
> Beatrice
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/

--
Michael Barker
Department of Economics
Georgetown University
Washington, DC 20057
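
To see how the pairing approach in Mike's suggestion behaves, it may help to run it on a small dataset. The following is only a sketch: the six observations are invented for illustration and do not come from the thread, while the variable names (make, mpg, price, dupid, absprice, dup_pair, dup_nonpair) follow the messages above.

```stata
* Hypothetical toy data, invented for illustration (not from the thread)
clear
input str1 make mpg price
"A" 20  100
"A" 20  100
"A" 20 -100
"B" 30   50
"C" 25   70
"C" 25   70
end

* Number the copies of each (make, mpg, price) combination,
* so positive and negative prices are counted separately
bysort make mpg price: gen dupid = _n

* The k-th positive copy and the k-th negative copy share
* the same (make, mpg, absprice, dupid) and are tagged as a pair
gen absprice = abs(price)
duplicates tag make mpg absprice dupid, gen(dup_pair)

* Among observations that are not part of a pos/neg pair,
* tag those that still share make, mpg, and price
duplicates tag make mpg price if dup_pair==0, gen(dup_nonpair)

* Inspect the result group by group
list make mpg price dupid dup_pair dup_nonpair, sepby(make)
```

Listing the tag variables side by side is a quick way to check whether the three groups described above match what you want to keep or drop before deleting any observations.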
