Statalist



RE: st: AW: Area under a percentile point


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: AW: Area under a percentile point
Date   Tue, 3 Feb 2009 19:20:50 -0000

I'll echo Jeph's comment that this is all a bit strange, and add that without further justification it does not sound like good practice to me.

In addition to other problems, note that which of the various observations at the high end are kept is arbitrary and not reproducible without further devices. That may have consequences for other variables, as even though the variable in focus may have several tied values at the high end, other variables may well differ. 
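
One possible "further device" is to break ties explicitly, so that the order within tied values of myvar is the same on every run. This is only a sketch: -id- is a hypothetical variable assumed to identify observations uniquely and is not part of the original problem.

```stata
* -id- is a hypothetical unique identifier; -isid- stops with an error if it is not unique
isid id

* ties on myvar are now always broken the same way
sort myvar id
```

Alternatively, -sort myvar, stable- keeps tied observations in their existing relative order, which is reproducible only if that existing order is itself reproducible.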

Tagging what you want to work with by an indicator variable, rather than -keep-ing those observations only, would be a step in the right direction. 
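
As a rough illustration of the tagging idea, using the same rule as Jeph's code: -tokeep- and -othervar- are made-up names for this sketch, and -id- is again a hypothetical unique identifier used only to break ties.

```stata
* same cutoff as -keep if _n<0.75*_N-, but nothing is dropped
sort myvar id
generate byte tokeep = _n < 0.75*_N

* later commands are restricted with -if-; the full dataset stays in memory
summarize othervar if tokeep
```

The observations above the cutoff remain available, so the choice can be inspected or changed later.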

Nick 
n.j.cox@durham.ac.uk 

Antoine Terracol

Just make sure to drop all observations with missing values of myvar 
before applying Jeph's trick
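
The reason is that Stata sorts numeric missing values after all nonmissing values, so they sit at the high end and are also counted in _N. A minimal sketch of the combined steps:

```stata
* missing values of myvar sort last and would inflate _N
drop if missing(myvar)
sort myvar
keep if _n < 0.75*_N
```

If the observations with missing myvar must not be lost, one variation (an assumption, not something proposed in the thread) is to -count if !missing(myvar)- and compare _n with 0.75*r(N) instead of 0.75*_N.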


zeynep elitaş wrote:

> It looks like it'll work just fine
> 
> 2009/2/3 Jeph Herrin <junk@spandrel.net>:

>> What you want is a bit strange, but here is how I would do it:
>>
>>  sort myvar
>>  keep if _n<0.75*_N

>> zeynep elitaş wrote:

>>> Thanks for the help, but this does not solve my problem.  As a simple
>>> example, let's assume we have a data set containing five 6's and five
>>> 7's, so the sample size is 10. Now if I wanted to keep the data values
>>> below the 75th percentile, I would have roughly 7 observations
>>> remaining. However, following your suggestion I end up keeping all the
>>> observations.
