The Stata listserver

st: RE: RE: generate new variable with frequencies


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: generate new variable with frequencies
Date   Tue, 4 Nov 2003 14:36:12 -0000

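Actually, this is wrong. Sorry. Scaling by r(sum) after

su typ_freq, meanonly

will almost always over-count, because typ_freq holds each
group's frequency in every observation of that group, so r(sum)
is the sum of the squared frequencies rather than the number of
observations. Instead, count the nonmissing observations once
and divide by that: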

count if !missing(type)
bysort type : gen typ_freq = _N / `r(N)' if !missing(type)

Nick
n.j.cox@durham.ac.uk
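
For readers following along, here is a minimal sketch of the correction above, run as a do-file on the ten values of type from the example table in the original question (quoted below). The variable names n, pct and n2 and the local macro N are made up for illustration, and egen's total() assumes Stata 9 or later, where it is the documented name for sum(). It is one way to check the arithmetic, not the only way to do it.

```stata
* Example data: the ten values of type from the table in the original question
clear
input type
1
1
2
2
2
3
4
4
5
6
end

* Absolute frequencies, repeated in every observation of each group
bysort type : gen n = _N if !missing(type)

* The over-count: each frequency is summed once per observation in its group
su n, meanonly
display r(sum)          // 20 here, although there are only 10 observations

* The correction: divide by the number of nonmissing observations
count if !missing(type)
local N = r(N)
bysort type : gen typ_freq = _N / `N' if !missing(type)
gen pct = 100 * typ_freq

* Same absolute frequencies via egen (hypothetical check variable n2)
egen n2 = total(1) if !missing(type), by(type)
assert n2 == n

list type n typ_freq pct, sepby(type)
```

Saving r(N) in a local macro before the second -bysort- keeps the count available even if a later command overwrites the r() results.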

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of Nick Cox
> Sent: 04 November 2003 12:56
> To: statalist@hsphsun2.harvard.edu
> Subject: st: RE: generate new variable with frequencies
>
>
> The frequencies of -type- can be put
> in -typ_freq- by
>
> bysort type : gen typ_freq = _N
>
> If you want to scale to relative frequencies
> (sum 1) then you can
>
> su typ_freq, meanonly
> replace typ_freq = typ_freq / r(sum)
>
> and percents naturally can be obtained
> with a factor of 100.
>
> Some care will often be needed over
> missing values:
>
> bysort type : gen typ_freq = _N if !missing(type)
> su typ_freq, meanonly
> replace typ_freq = typ_freq / r(sum)
>
> Another way to do it:
>
> egen typ_freq = sum(1) if !missing(type), by(type)
>
> Nick
> n.j.cox@durham.ac.uk
>
> Matteo Foschi
> >
> > We are having a little trouble with what we think is an easy task.
> > We want to generate a variable that contains the relative
> > frequency of
> > another variable's values.
> > We have a variable, say “type”, and want to build a new variable,
> > say “typ_freq”, which shows the relative frequency of
> > each value of “type”.
> >
> > We first tried the tablepc ado-file:
> > tablepc type, generate (typ_freq)
> > but that gives us only the relative frequency of each
> > observation.
> >
> > Alternatively, we can obtain the cumulative frequency
> > (variable freqcum)
> > with a little program:
> >
> > generate freqcum = .
> > ..
> > sort typ_freq
> > by typ_freq: gen groups = 1 if _n ==1
> > replace groups = sum(groups)
> > ..
> > local K = groups[_N]
> > local i 1
> > while `i' <= `K' {
> >  replace freqcum = sum(typ_freq)
> >  local i = `i' + 1
> > }
> > ..
> > However, we are not able to recode freq_cum or typ_freq
> > in order to obtain
> > the relative frequency of each value of “type”, as shown below:
> >
> > Obs  Type  obs_freq  freq_cum  typ_freq
> >   1     1         1         1         2
> >   2     1         1         2         2
> >   3     2         1         3         3
> >   4     2         1         4         3
> >   5     2         1         5         3
> >   6     3         1         6         1
> >   7     4         1         7         2
> >   8     4         1         8         2
> >   9     5         1         9         1
> >  10     6         1        10         1
> >
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/



