# st: RE: Proportions

 From "Nick Cox" <[email protected]> To <[email protected]> Subject st: RE: Proportions Date Tue, 25 Jan 2005 11:04:05 -0000

```As I understand it, your structure is

day year stock return

with one value of -return- for each -day- and
-stock-. -day- is naturally nested within -year-.

If so, the number of days with -return- less than 5 is

. bysort stock year : gen low_days = sum(return < 5)
. by stock year : replace low_days = low_days[_N]

and the total number of days for each combination
is

. by stock year : gen no_days = _N

and so

. gen prop_low_days = low_days / no_days

except that we should be able to telescope this to

. bysort stock year :
gen prop_low_days = sum(return < 5)
. by stock year :
replace prop_low_days = prop_low_days[_N] / _N

Note my continuation lines. Also, I cut down on
the number of variables, and the name doesn't
match the contents until I'm done.

If there are missing values of -return-
we would need to be more circumspect.

. bysort stock year : gen low_days = sum(return < 5)
. by stock year : gen prop_low_days = sum(return < .)
. by stock year :
replace prop_low_days = low_days[_N] / prop_low_days[_N]

Also, if you wanted to count proportions of high
values of -return- you would need to
watch that (e.g.) -sum(return > 10)- will catch
any missings as well.
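
* A minimal sketch of one way to guard against that, assuming the
* same variables as above. The names high_days and prop_high_days
* and the cutoff 10 are illustrative only. Stata treats missing as
* larger than any number, so missings are excluded explicitly.

. bysort stock year : gen high_days = sum(return > 10 & return < .)
. by stock year : gen prop_high_days = sum(return < .)
. by stock year : replace prop_high_days = high_days[_N] / prop_high_days[_N]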

What about -egen-? Clearly you can do it that way.
Sometimes, indeed often, drilling down one level
to get the elementary building blocks is in
fact easier. I know one extremely advanced
user of Stata who hates -egen-, I think because
by the time he has looked up the syntax he
could have ground it all out from first
principles with some -by:- footwork. But he
is very fast with Stata, having used it
since the beginning.

gen lo = 0
replace lo = 1 if ret < -5
egen temp = count(lo), by(stock year)
egen temp2 = sum(lo), by(stock year)

could be done this way:

egen temp = count(1), by(stock year)
egen temp2 = sum(ret < 5), by(stock year)

(I don't understand why you have -5.)

There was a tutorial on -by:- in Stata Journal
2(1) 2002.

Nick
[email protected]

Yvonne Capstick

> I have a hopefully simple question on calculating proportions.
>
> I have daily returns (ret) for different stocks (stock) and I
> would like to
> calculate the proportion of days for which a firm's daily
> stock return was
> below 5% over the last 3 calendar years.
>
> If all I needed was the proportion of trading days for which
> the return was
> below 5% over the last 1 calendar year, I could calculate this by the
> following long-winded method:
>
> gen lo = 0
> replace lo = 1 if ret < -5
> egen temp = count(lo), by (stock year)
> egen temp2 = sum(lo), by (stock year)
> gen prop = temp2/temp
> gen temp3 = prop[_n-1] if month == 1 & month[_n-1] == 12 & year ==
> year[_n-1]+1
> egen lastprop = sum(temp3), by (stock year)
>
> a) There must be a faster way of doing the above - I tried
> something like
> egen prop = count(lo)/sum(lo), by (stock year) but it said
> 'varlist not
> b) How do I modify the above to calculate the proportion of