Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# RE: st: Inverse Cummulative Variable

- From: Nick Cox
- To: "'statalist@hsphsun2.harvard.edu'"
- Subject: RE: st: Inverse Cummulative Variable
- Date: Mon, 12 Mar 2012 18:07:11 +0000

```
This can be shortened. Let me recap for the very small number of people still reading.

The original question can be rephrased: I want complementary cumulative frequencies, which are #(>= value).

Using -mpg- from the auto data:

. sysuse auto, clear

(1) With -cumul-:

gen work = -mpg
cumul work , gen(cufreq) freq equal

(2) With ranks:

egen work = rank(-mpg), unique
egen cufreq = max(work), by(mpg)

The shortening is possible because the -rank()- function of -egen- can take expressions.

After (1) or (2), a convenient tabulation is

tabdisp mpg, c(cufreq)

So, what I said earlier in reply to David Hoaglin is not correct. -cumul- is no easier than using ranks.

P.S. Some may want to do this from first principles. I know Stata programmers who would rather not spend any time looking up a higher-level command's syntax when they can crunch it out in this way:

gen work = -mpg
bysort work : gen cufreq = _N if _n == 1
replace cufreq = sum(cufreq)

Nick
n.j.cox@durham.ac.uk

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: 12 March 2012 15:30
To: 'statalist@hsphsun2.harvard.edu'
Subject: RE: st: Inverse Cummulative Variable

Here's a way of doing it with ranks:

. sysuse auto, clear
(1978 Automobile Data)

. gen work = -mpg

. egen work2 = rank(work), unique

. egen cufreq = max(work2), by(work)

. tabdisp mpg, c(cufreq)

----------------------
  Mileage |
    (mpg) |     cufreq
----------+-----------
       12 |         74
       14 |         72
       15 |         66
       16 |         64
       17 |         60
       18 |         56
       19 |         47
       20 |         39
       21 |         36
       22 |         31
       23 |         26
       24 |         23
       25 |         19
       26 |         14
       28 |         11
       29 |          8
       30 |          7
       31 |          5
       34 |          4
       35 |          3
       41 |          1
----------------------

If you were happy with a tabulation alone, -groups- (SSC) offers another way to do this.

. groups mpg, show(rF) ge sep(0)

+------------+
| mpg   # >= |
|------------|
|  12     74 |
|  14     72 |
|  15     66 |
|  16     64 |
|  17     60 |
|  18     56 |
|  19     47 |
|  20     39 |
|  21     36 |
|  22     31 |
|  23     26 |
|  24     23 |
|  25     19 |
|  26     14 |
|  28     11 |
|  29      8 |
|  30      7 |
|  31      5 |
|  34      4 |
|  35      3 |
|  41      1 |
+------------+

Nick
n.j.cox@durham.ac.uk

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: 12 March 2012 14:33
To: 'statalist@hsphsun2.harvard.edu'
Subject: RE: st: Inverse Cummulative Variable

That's clearly right in principle. In practice, I think -cumul- is easier for what was asked for here.

Nick
n.j.cox@durham.ac.uk

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of David Hoaglin

For the number of firms that have at least as many employees (rather
than the percentage of firms), it should be possible to work with
ranks.

David Hoaglin

On Mon, Mar 12, 2012 at 4:04 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> Survivor, survival or reliability function I have often seen for the
> complementary probability. Is it used also for the corresponding
> complementary frequency, which is being sought here?
>

```
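The three approaches in the thread (-cumul-, ranks, and the first-principles running sum) should all give the same complementary cumulative frequencies, #(>= value). The following is a minimal sketch for checking that in one run. It is not part of the original posts: the variable names (neg, cf_cumul, r, cf_rank, cf_sum) are chosen here purely for illustration so that the three results can coexist in one dataset, and it assumes the auto dataset shipped with Stata.

```stata
* Sketch: compute #(>= value) for mpg three ways and confirm they agree.
* All new variable names below are illustrative, not from the thread.
sysuse auto, clear

* (1) -cumul- on the negated variable:
*     freq gives counts rather than proportions,
*     equal gives tied values the same cumulative
gen neg = -mpg
cumul neg, gen(cf_cumul) freq equal

* (2) unique ranks of -mpg, then the largest rank within each mpg value
egen r = rank(-mpg), unique
egen cf_rank = max(r), by(mpg)

* (3) first principles: put each group's size on its first observation,
*     then take a running sum (sum() treats missing as zero)
bysort neg : gen cf_sum = _N if _n == 1
replace cf_sum = sum(cf_sum)

* the three results should be identical in every observation
assert cf_cumul == cf_rank
assert cf_rank == cf_sum

* spot checks against the tabulation shown in the thread
assert cf_sum == 74 if mpg == 12
assert cf_sum == 1 if mpg == 41

* side-by-side display, one row per distinct mpg
tabdisp mpg, c(cf_cumul cf_rank cf_sum)
```

If any -assert- fails, Stata stops with an error at that line, which makes it easy to see which method disagrees. Negating mpg is what turns the usual "<=" cumulative into the ">=" count asked for in the original question.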