Statalist



RE: RE: st: questions about duplicate observations


From   "Wen Xia Ge" <wenxia.ge@mail.mcgill.ca>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: RE: st: questions about duplicate observations
Date   Tue, 27 May 2008 22:14:31 -0400

Thanks a lot, Nick!  "keep if max == offering_amt" works. And you are right, I need to watch out for ties, because they do exist in the dataset. 
 
Your suggestion on how to sort in descending order is very useful! 

Wenxia

________________________________

From: owner-statalist@hsphsun2.harvard.edu on behalf of n j cox
Sent: Tue 5/27/2008 12:12 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: RE: st: questions about duplicate observations



You may not need -collapse- at all.

keep if max == offering_amt

may be sufficient once you have calculated all your new variables.

But watch out for ties.
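
As a rough sketch of how ties could be checked before dropping anything,
assuming the variable names used in this thread (yeara, cnum,
offering_amt, max); ntop is a hypothetical helper variable:

  * count how many issues share the maximum within each firm-year
  bysort yeara cnum : egen ntop = total(offering_amt == max)
  * inspect the firm-years where the maximum is not unique
  list yeara cnum offering_amt if ntop > 1 & offering_amt == max
  * ties survive this step, so such firm-years keep more than one issue
  keep if max == offering_amt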

You can always sort in descending order. Just negate the variable in
question first.

gen negfoo = -foo
sort negfoo
bysort frog negfoo : ...
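
Applied to the data in this thread, the negation trick might look like
the sketch below. The names yeara, cnum and offering_amt come from the
original question; neg_amt is a hypothetical helper variable.

  gen neg_amt = -offering_amt
  * the largest offering_amt now sorts first within each firm-year
  bysort yeara cnum (neg_amt) : keep if _n == 1
  drop neg_amt

Unlike -keep if max == offering_amt-, this keeps exactly one issue per
firm and year, so tied maxima are broken arbitrarily by sort order.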

Nick
n.j.cox@durham.ac.uk

"Wen Xia Ge" <wenxia.ge@mail.mcgill.ca>

Thanks for your suggestion. It works well for the second
approach. But I still cannot figure out how to use -collapse- to get the
dataset described in the first approach. That is, for firms with
multiple bond issues in a year, I just want to keep the issue with the
largest offering_amt (firms with a single bond issue will remain in the
dataset). I tried the following:

  bysort yeara cnum (offering_amt) : gen max = offering_amt[_N]
  collapse max bond_yield maturity (and some other variables which are
not listed here), by(yeara cnum)

It will give the means of the listed variables. In this case, max is OK
(it is the largest offering_amt), but I want to keep the original
bond_yield, maturity etc. associated with the issue with the largest
offering_amt. E.g., for issues 7, 8 and 9, issues 7 and 8 should be
removed, and only the variables associated with issue 9 should remain in
the dataset.

I tried to use -duplicates drop-, but I cannot sort data in descending
order, because the error message says -gsort- cannot be combined with
-by-.


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/





