RE: st: Inverse Cummulative Variable

From: Nick Cox
Subject: RE: st: Inverse Cummulative Variable
Date: Mon, 12 Mar 2012 18:07:11 +0000

This can be shortened. Let me recap for the very small number of people still reading.

The original question can be rephrased: I want complementary cumulative frequencies, that is, for each value the number of observations greater than or equal to it, #(>= value).
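
To make the target quantity explicit, and to show why negating the variable does the work in both methods that follow, the definition can be written out. The notation is assumed for this note only: x_i are the observed values and C(v) is the complementary cumulative frequency at v.

```latex
C(v) \;=\; \#\{\, i : x_i \ge v \,\} \;=\; \#\{\, i : -x_i \le -v \,\}
```

So C(v) for x is just the ordinary cumulative frequency of -x, evaluated at -v.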

Using -mpg- from the auto dataset as an example:

. sysuse auto, clear 

(1) Adopting -cumul- gives 

gen work = -mpg 
cumul work , gen(cufreq) freq equal
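
As a sanity check on (1), the result can be compared with a brute-force count of #(>= value), one observation at a time. This is a sketch only: -check- is a name made up here, and the loop is written for clarity, not speed.

```stata
sysuse auto, clear
gen work = -mpg
cumul work, gen(cufreq) freq equal

* direct count of observations with mpg >= this observation's mpg
* -check- is a hypothetical variable name
* caution: missing values count as larger than any number, so this
* comparison assumes no missings (true of mpg in the auto data)
gen long check = .
quietly forvalues i = 1/`=_N' {
    count if mpg >= mpg[`i']
    replace check = r(N) in `i'
}

* stops with an error if the two ever disagree
assert cufreq == check
```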

(2) Adopting -egen-'s -rank()- gives

egen work = rank(-mpg), unique
egen cufreq = max(work), by(mpg)

The shortening is possible because the -rank()- function of -egen- can take expressions. 

After (1) or (2) (they are alternatives: both create -work- and -cufreq-, so run only one), a convenient tabulation is

tabdisp mpg, c(cufreq)
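
If a plain listing with one row per distinct value is wanted instead of a -tabdisp- table, one possibility is to tag a single observation in each group. This is only a sketch: -pickone- is a name made up here, and method (2) is repeated so that the block runs on its own.

```stata
sysuse auto, clear
egen work = rank(-mpg), unique
egen cufreq = max(work), by(mpg)

* -pickone- (hypothetical name) is 1 for exactly one observation
* per distinct value of mpg and 0 otherwise
egen pickone = tag(mpg)
sort mpg
list mpg cufreq if pickone, noobs
```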

So, what I said earlier in reply to David Hoaglin is not correct. -cumul- is no easier than using ranks. 

P.S. Some may want to do this from first principles. I know Stata programmers who would rather not spend any time looking up a higher-level command's syntax when they can crunch it out in this way (again starting from data without -work- or -cufreq-):

gen work = -mpg
bysort work : gen cufreq = _N if _n == 1 
replace cufreq = sum(cufreq) 
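
To see why those three lines work, it may help to trace them on a toy dataset. The four values of -x- below are invented for illustration. The first observation in each block of tied values carries the block size, and -sum()- treats the missings in between as zero, so the running sum is constant within each block.

```stata
clear
input x
3
1
3
2
end

gen work = -x
* after sorting on work, x runs 3 3 2 1 and cufreq is 2 . 1 1
bysort work : gen cufreq = _N if _n == 1
* the running sum turns that into 2 2 3 4
replace cufreq = sum(cufreq)

* #(>= 3) = 2, #(>= 2) = 3, #(>= 1) = 4
assert cufreq == 2 if x == 3
assert cufreq == 3 if x == 2
assert cufreq == 4 if x == 1
```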

Nick 


-----Original Message-----
From: Nick Cox
Sent: 12 March 2012 15:30
Subject: RE: st: Inverse Cummulative Variable

Here's a way of doing it with ranks: 

. sysuse auto, clear 
(1978 Automobile Data)

. gen work = -mpg

. egen work2 = rank(work), unique

. egen cufreq = max(work2), by(work)

. tabdisp mpg, c(cufreq)

----------------------
Mileage   |
(mpg)     |     cufreq
----------+-----------
       12 |         74
       14 |         72
       15 |         66
       16 |         64
       17 |         60
       18 |         56
       19 |         47
       20 |         39
       21 |         36
       22 |         31
       23 |         26
       24 |         23
       25 |         19
       26 |         14
       28 |         11
       29 |          8
       30 |          7
       31 |          5
       34 |          4
       35 |          3
       41 |          1
----------------------

If you were happy with a tabulation alone, -groups- (SSC) offers another way to do this. 

. groups mpg, show(rF) ge sep(0)

  +------------+
  | mpg   # >= |
  |------------|
  |  12     74 |
  |  14     72 |
  |  15     66 |
  |  16     64 |
  |  17     60 |
  |  18     56 |
  |  19     47 |
  |  20     39 |
  |  21     36 |
  |  22     31 |
  |  23     26 |
  |  24     23 |
  |  25     19 |
  |  26     14 |
  |  28     11 |
  |  29      8 |
  |  30      7 |
  |  31      5 |
  |  34      4 |
  |  35      3 |
  |  41      1 |
  +------------+


Nick 


-----Original Message-----
From: Nick Cox
Sent: 12 March 2012 14:33
Subject: RE: st: Inverse Cummulative Variable

That's clearly right in principle. In practice, -cumul- is I think easier for what was asked for here.

Nick 

-----Original Message-----
From: David Hoaglin

For the number of firms that have at least as many employees (rather
than the percentage of firms), it should be possible to work with
ranks.

David Hoaglin

On Mon, Mar 12, 2012 at 4:04 AM, Nick Cox wrote:
> Survivor, survival or reliability function I have often seen for the
> complementary probability. Is it used also for the corresponding
> complementary frequency, which is being sought here?
>
