Stata The Stata listserver
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

st: RE: tabulate all codes


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: tabulate all codes
Date   Sun, 3 Nov 2002 19:00:03 -0000

Kazi Ali Toufique

> I have a simple problem. I have a variable (say SEX) with
> codes 1 (say Female) and 2 (Male). When I tabulate for a dataset
that
> has all male then
> it does not show 1 (i. e. female) = 0 instead it shows 2 =
> 100%. How do I
> make a tabulate (or some other similar command) command
> show frequencies
> for all codes, not just only for the codes that exist in a
> particular database?

This is an example of a fairly common and more general
problem, which Kit Baum and I call the -tabcond- problem:
tabulate how many observations satisfy one or more
conditions, and show all counts explicitly, whether zero or more.
We wrote some programs without feeling that we had a good general
solution.

For this specific problem, however,
something cooked up ad hoc may help:

-tabcount- counts occurrences of specified values.

If the values are numeric, specify which you
want in the required -values()- option (numlists
are fine):

. tabcount rep78 , values(1/5)

----------------------------------------------
Repair    |
Record    |
1978      |      Freq.     Percent        Cum.
----------+-----------------------------------
        1 |          2        2.90        2.90
        2 |          8       11.59       14.49
        3 |         30       43.48       57.97
        4 |         18       26.09       84.06
        5 |         11       15.94      100.00
          |
    Total |         69      100.00
----------------------------------------------

. bysort foreign : tabcount rep78 , values(1/5)

______________________________________________________________________
_________
-> foreign = Domestic

----------------------------------------------
Repair    |
Record    |
1978      |      Freq.     Percent        Cum.
----------+-----------------------------------
        1 |          2        4.17        4.17
        2 |          8       16.67       20.83
        3 |         27       56.25       77.08
        4 |          9       18.75       95.83
        5 |          2        4.17      100.00
          |
    Total |         48      100.00
----------------------------------------------

______________________________________________________________________
_________
-> foreign = Foreign

----------------------------------------------
Repair    |
Record    |
1978      |      Freq.     Percent        Cum.
----------+-----------------------------------
        1 |          0        0.00        0.00
        2 |          0        0.00        0.00
        3 |          3       14.29       14.29
        4 |          9       42.86       57.14
        5 |          9       42.86      100.00
          |
    Total |         21      100.00
----------------------------------------------

If the values are string, again you need
to specify them. For this, -levels- from
SSC can save some typing. Here is
a silly example:

. egen Make = ends(make), head

. levels Make
`"AMC"' `"Audi"' `"BMW"' `"Buick"' `"Cad."' `"Chev."' `"Datsun"'
`"Dodge"' `"Fiat"' `"Ford"' `"Honda"' `"Linc
> ."' `"Mazda"' `"Merc."' `"Olds"' `"Peugeot"' `"Plym."' `"Pont."'
`"Renault"' `"Subaru"' `"Toyota"' `"VW"' `
> "Volvo"'

. bysort rep78 : tabcount Make , values(`r(levels)')

______________________________________________________________________
_________
-> rep78 = 1

----------------------------------------------
     Make |      Freq.     Percent        Cum.
----------+-----------------------------------
      AMC |          0        0.00        0.00
     Audi |          0        0.00        0.00
      BMW |          0        0.00        0.00
    Buick |          0        0.00        0.00
     Cad. |          0        0.00        0.00
    Chev. |          0        0.00        0.00
   Datsun |          0        0.00        0.00
    Dodge |          0        0.00        0.00
     Fiat |          0        0.00        0.00
     Ford |          0        0.00        0.00
    Honda |          0        0.00        0.00
    Linc. |          0        0.00        0.00
    Mazda |          0        0.00        0.00
    Merc. |          0        0.00        0.00
     Olds |          1       50.00       50.00
  Peugeot |          0        0.00       50.00
    Plym. |          0        0.00       50.00
    Pont. |          1       50.00      100.00
  Renault |          0        0.00      100.00
   Subaru |          0        0.00      100.00
   Toyota |          0        0.00      100.00
       VW |          0        0.00      100.00
    Volvo |          0        0.00      100.00
          |
    Total |          2      100.00
----------------------------------------------

______________________________________________________________________
_________
-> rep78 = 2

< snip >

Here is the code. I'll ask Kit to
put it up on SSC.

---------------------- begin tabcount.ado
*! NJC 1.0.0 3 Nov 2002
program def tabcount, byable(recall)
	version 7
	syntax varname [if] [in] , values(str asis)

	marksample touse, strok
	tempvar Values Freq SFreq Percent Cum
	qui gen long `Freq' = .
	qui gen str1 `SFreq' = ""

	capture numlist "`values'"

	* string values
	qui if _rc { /* */
		gen str1 `Values' = ""
		tokenize `"`values'"'
		local nvals : word count `values'

		forval i = 1 / `nvals' { /* */
			replace `Values' = `"``i''"' in `i'
			count if `touse' & `varlist' == `"``i''"'
			replace `Freq' = r(N) in `i'
		}
	}

	* numeric values
	else qui { /* */
		gen `Values' = .
		tokenize `r(numlist)'
		local nvals : word count `r(numlist)'
		forval i = 1 / `nvals' { /* */
			replace `Values' = ``i'' in `i'
			count if `touse' & `varlist' == ``i''
			replace `Freq' = r(N) in `i'
		}
	}

	qui { /* */
		su `Freq', meanonly
		gen `Percent' = 100 * `Freq' / r(sum)
		gen `Cum' = sum(`Percent') in 1 / `nvals'
		local nvalsp1 = `nvals' + 1
		replace `Freq' = r(sum) in `nvalsp1'
		replace `Percent' = 100 in `nvalsp1'
		replace `SFreq' = string(`Freq')
	}

	_crcslbl `Values' `varlist'
	local vallbl : value label `varlist'
	if "`vallbl'" != "" { /* */
		label val `Values' `vallbl'
	}

	label var `SFreq' "Freq."
	label var `Percent' "Percent"
	label var `Cum' "Cum."

	tabdisp `Values' in 1 / `nvalsp1', /*
	*/ cell(`SFreq' `Percent' `Cum') format(%3.2f) total

end
----------------------- end tabcount.ado



Nick
n.j.cox@durham.ac.uk


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index