Re: st: Count of unique cases by group


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: Count of unique cases by group
Date   Fri, 6 Jan 2012 15:46:45 +0000

By "unique" you evidently mean "distinct". (For more on that
distinction, see the 2008 paper below or the manual entry for
-duplicates-.)

Variants of this question have been discussed in

FAQ     . . . . . . . . . . . . . .  Calculating the number of distinct values
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        9/06    How do I calculate the number of distinct
                values seen so far?
                http://www.stata.com/support/faqs/data/distinctvalues.html

FAQ     . . . . . . . . .  Counting distinct strings across a set of variables
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        7/04    How do I count the number of distinct strings
                across a set of variables?
                http://www.stata.com/support/faqs/data/distinctstrings.html

FAQ     . . . . . . . . . . . . . . . . . . .  Number of distinct observations
        . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox and G. Longton
        4/02    How do I compute the number of distinct observations?
                http://www.stata.com/support/faqs/data/distinct.html

SJ-8-4  dm0042  . . . . . . . . . . . .  Speaking Stata: Distinct observations
        (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
        Q4/08   SJ 8(4):557--568
        shows how to answer questions about distinct observations
        from first principles; provides a convenience command
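
As a rough illustration of what counting "from first principles" can look
like (a sketch only, not quoted from the references above), distinct values
of mpg within rep78 can be counted with -by:- alone. The variable names
first and ndistinct are made up for this example. Note that here
observations with missing rep78 form a group of their own.

```stata
sysuse auto, clear
* flag the first observation of each distinct (rep78, mpg) pair
bysort rep78 mpg : gen byte first = _n == 1
* running count of flags within rep78, then copy the final total to every row
by rep78 : gen ndistinct = sum(first)
by rep78 : replace ndistinct = ndistinct[_N]
```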

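This may help:

sysuse auto, clear
* tag one observation for each distinct (rep78, mpg) pair
egen tag = tag(rep78 mpg)
* number of distinct mpg values within each rep78 group
egen distinct = total(tag), by(rep78)
* number of observations within each rep78 group
bysort rep78 : gen freq = _N
* distinct values as a fraction of observations
gen fraction = distinct / freq
tabdisp rep78, c(distinct freq fraction)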

----------------------------------------------
Repair    |
Record    |
1978      |   distinct        freq    fraction
----------+-----------------------------------
        1 |          2           2           1
        2 |          6           8         .75
        3 |         15          30          .5
        4 |         11          18    .6111111
        5 |          8          11    .7272727
        . |          0           5           0
----------------------------------------------
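
Applied to the example in the question quoted below, the same steps might
look like the following. This is a sketch only: make2, area and pctcount are
the names used in the question, tag, distinct and freq follow the code
above, and no output is shown because the commands have not been rerun here.

```stata
sysuse auto, clear
* set up the data as in the question
gen make2 = word(make, 1)
rename rep78 area
keep make2 area price
keep in 1/20
drop if missing(area)
* tag one observation for each distinct (area, make2) pair
egen tag = tag(area make2)
* distinct values of make2 and number of observations in each area
egen distinct = total(tag), by(area)
bysort area : gen freq = _N
gen pctcount = distinct / freq
tabdisp area, c(distinct freq pctcount)
```

-tabdisp- only displays the results. If a dataset with one observation per
area is wanted instead, something like -collapse (mean) pctcount, by(area)-
should also work, as pctcount is constant within each area.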

Nick

On Fri, Jan 6, 2012 at 3:30 PM, Ben Hoen <[email protected]> wrote:

> I want to count the number of unique cases within a group to generate a
> summary table.  -collapse- gets me part of the way there, but not all of the
> way.
>
> sysuse auto, clear
> g make2=word(make,1)
> rename rep78 area
> keep make2 area price
> keep in 1/20
> drop if area==.
> sort area make2
> order area make2
> list
>
> Using these records, I would like to produce a table with a statistic
> representing the count of the unique groups of make2 in each area, divided
> by the count of make2 in each area.
>
> i.e. pctcount=count of groups of make2 / count of make2
>
> So the output table would look like this ideally (minus the "underlying
> calculation" column):
>
> area    pctcount        <underlying calculation>
> 2       0.6666          2/3
> 3       0.3333          4/12
> 4       1               2/2
> 5       1               1/1


