Nick,

Thanks very much for that reply. I've now been able to calculate the proportion of days with returns below 5% efficiently, via your method.

My structure is now:

day year stock return prop_low_days

where prop_low_days is constructed via your method, and therefore gives the proportion of days in that year where the stock returned less than 5%. It is thus the same figure for every day in a particular year for a particular stock.

Now I would like to construct a very similar measure to the above, but giving the proportion of days in the past three years where the stock returned less than 5%. So for 24 Feb 02 I would like the newvar to give the proportion of days in 00-02 where the stock returned < 5%; for 7 Jul 01 I would like newvar to give the proportion of days in 99-01 where the stock returned < 5%.

I am not sure of a simple way to do this, because the "by year" structure doesn't work easily. I'm not sure how to "pick out" the 00, 01 and 02 values of prop_low_days to take a simple average of these (and I don't think a simple average would work because of the different number of days in each year).
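
To make the concern about unequal year lengths concrete: the three-year figure is the total number of low days over the three years divided by the total number of days, which is a day-weighted rather than a simple average of the three annual proportions. One possible route, offered only as a sketch, is to work from stock-year counts instead of the proportions. It assumes the variables day, year, stock and return described above, that stock is numeric, and a Stata version with -merge m:1- (11 or later); the names low, low3, n3 and prop_low_3yr are made up for illustration.

```stata
* Sketch only. Assumes: variables day, year, stock, return as above; stock is
* numeric (otherwise create an id with -egen id = group(stock)-).
preserve

* 1 if return is below 5, 0 otherwise, missing if return is missing
gen byte low = return < 5 if !missing(return)

* one row per stock-year: number of low days and number of nonmissing days
collapse (sum) low_days = low (count) no_days = low, by(stock year)

* declare the panel so that L. and L2. refer to the previous one and two years
tsset stock year

* three-year totals; a year absent from the data contributes zero
gen low3 = low_days + cond(missing(L.low_days), 0, L.low_days) ///
                    + cond(missing(L2.low_days), 0, L2.low_days)
gen n3   = no_days  + cond(missing(L.no_days), 0, L.no_days)   ///
                    + cond(missing(L2.no_days), 0, L2.no_days)
gen prop_low_3yr = low3 / n3

keep stock year prop_low_3yr
tempfile threeyr
save `threeyr'
restore

* attach the stock-year figure to every daily observation
merge m:1 stock year using `threeyr', nogenerate
```

As written, a stock with fewer than three years of history gets a proportion based on whatever years are available; whether that should instead be set to missing is a separate choice.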

Thanks,

Yvonne

_________________________________________________________________
From: "Nick Cox" <[email protected]>
Reply-To: [email protected]
To: <[email protected]>
Subject: st: RE: Proportions
Date: Tue, 25 Jan 2005 11:04:05 -0000

As I understand it, your structure is

day year stock return

with one value of -return- for each -day- and -stock-. -day- is naturally nested within -year-.

If so, the number of days with -return- less than 5 is

. bysort stock year : gen low_days = sum(return < 5)
. by stock year : replace low_days = low_days[_N]

and the total number of days for each combination is

. by stock year : gen no_days = _N

and so

. gen prop_low_days = low_days / no_days

except that we should be able to telescope this to

. bysort stock year : gen prop_low_days = sum(return < 5)
. by stock year : replace prop_low_days =
        prop_low_days[_N] / _N

Note my continuation lines. Also, I cut down on the number of variables, and the name doesn't match the contents until I'm done.

If there are missing values of -return- we would need to be more circumspect.

. bysort stock year : gen low_days = sum(return < 5)
. by stock year : gen prop_low_days = sum(return < .)
. by stock year : replace prop_low_days =
        low_days[_N] / prop_low_days[_N]

Also, if you wanted to count proportions of high values of -return- you would need to watch that (e.g.) -sum(return > 10)- will catch any missings as well.

What about -egen-? Clearly you can do it that way. Sometimes, indeed often, drilling down one level to get the elementary building blocks is in fact easier. I know one extremely advanced user of Stata who hates -egen-, I think because by the time he has looked up the syntax he could have ground it all out from first principles with some -by:- footwork. But he is very fast with Stata, having used it since the beginning.

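
The caution about missing values can be checked on a toy dataset. This is only an illustrative sketch with three made-up values of -return-; it relies on the fact that Stata treats numeric missing as larger than any number.

```stata
* Toy data, invented for illustration: one low value, one high value, one missing
clear
input return
2
12
.
end

* Missing is treated as larger than any number, so it satisfies return > 10
* and this counts 2 observations
count if return > 10

* Excluding missings explicitly counts 1 observation
count if return > 10 & return < .
```
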
Note that your

gen lo = 0
replace lo = 1 if ret < -5
egen temp = count(lo), by(stock year)
egen temp2 = sum(lo), by(stock year)

could be done this way:

egen temp = sum(1), by(stock year)
egen temp2 = sum(ret < 5), by(stock year)

(I don't understand why you have -5.)

There was a tutorial on -by:- in Stata Journal 2(1) 2002.

Nick
[email protected]

Yvonne Capstick

> I have a hopefully simple question on calculating proportions.
>
> I have daily returns (ret) for different stocks (stock) and I would like to
> calculate the proportion of days for which a firm's daily stock return was
> below 5% over the last 3 calendar years.
>
> If all I needed was the proportion of trading days for which the return was
> below 5% over the last 1 calendar year, I could calculate this by the
> following long-winded method:
>
> gen lo = 0
> replace lo = 1 if ret < -5
> egen temp = count(lo), by (stock year)
> egen temp2 = sum(lo), by (stock year)
> gen prop = temp2/temp
> gen temp3 = prop[_n-1] if month == 1 & month[_n-1] == 12 & year == year[_n-1]+1
> egen lastprop = sum(temp3), by (stock year)
>
> a) There must be a faster way of doing the above - I tried something like
> egen prop = count(lo)/sum(lo), by (stock year) but it said
> "varlist not allowed". Please could you advise me of any faster way?
> b) How do I modify the above to calculate the proportion of trading days
> where the return was < 5% over the last 3 calendar years?
>

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

