The Stata listserver

Re: st: help to find maximum drop in a variable

From   David Kantor
Subject   Re: st: help to find maximum drop in a variable
Date   Tue, 02 Mar 2004 09:43:03 -0500

At 07:59 AM 3/2/2004 -0500, Chris Wallace wrote:
My colleague has passed to me a query to which I can't find a
straightforward answer.  I am hoping some of the experts in this group
could help?

She has hourly barometric pressure data for many days.  So the dataset
contains the variables

year month day hour pressure

(with 24 hourly observations per day).  During any day the pressure may
rise and fall.  She wishes to generate a new variable containing the
maximum fall (constant within year-month-day groups).  That is, for a
fictitious series of pressure readings

90 100 90 80 70 80 65 70 90 80 70

the maximum fall is 100-70=30.  This is not simply the maximum - minimum
(100-65), since the pressure rises for one interval in that period.

She also makes the points:
- there may be more than one drop per day... I want the largest one
- if pressure has not dropped at all throughout the day, the value
returned should be zero or negative

Any suggestions?

Many thanks, Chris.

I started with the idea that I wouldn't write the whole solution, but I think I did, as it is a very interesting problem. Here is how I would attack it. This is untested.

isid year month day hour, sort

by year month day: gen byte s1 = sign(pressure - pressure[_n-1])
// note that s1 is . for the first observation in each group (day); else it is -1, 0, or +1.

by year month day: replace s1 = s1[_n-1] if s1==0 & _n>1
// -- the & _n>1 may be unnecessary

/* Now we have s1 = +1 or -1, indicating rising or dropping pressure. It is . for one or more
observations at the beginning of each group (day). It is never 0; each 0 has been changed to the
value of the preceding nonzero change. That is, if you level off after a drop, you are still
in a drop, and similarly for a rise. */
by year month day: gen int /* maybe byte */ runno = 0 if _n==1
by year month day: replace runno = cond(s1==s1[_n-1], runno[_n-1], runno[_n-1]+1) if _n>1

/* runno is the "run number" -- counting up, within each group, the runs of rising or falling pressures. */

sort year month day runno hour
by year month day runno: gen double extreme = pressure[_N]
// double rather than int, so that non-integer pressures are not truncated

/* extreme is the final value within each run -- the max for a rise, the min for a fall.
Actually, we now need to reduce to a set of runs. */
by year month day runno: keep if _n == _N
// (can drop pressure s1)

by year month day: gen double change = extreme - extreme[_n-1]
// -- change is . for first run of the day.

egen double maxfall = min(change), by(year month day)
egen double maxrise = max(change), by(year month day)

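/* Note that maxfall is negative when there is a fall -- a minimum of negatives; reverse its sign if desired. On a day with no fall at all it is instead positive (the size of the day's rise) or missing (constant pressure). */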
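
A quick way to check the steps above is to run them on the fictitious series from the question. The sketch below is an assumption-laden test, not part of the original solution: the date (2004-03-01) and the hour values are made up, the comments are closed so the do-file runs, and the storage types are left at their defaults. If the logic is right, both assert commands should pass silently.

```stata
clear
* Fictitious series from the question, placed on one made-up day
input year month day hour pressure
2004 3 1  0  90
2004 3 1  1 100
2004 3 1  2  90
2004 3 1  3  80
2004 3 1  4  70
2004 3 1  5  80
2004 3 1  6  65
2004 3 1  7  70
2004 3 1  8  90
2004 3 1  9  80
2004 3 1 10  70
end

isid year month day hour, sort

* Direction of each hourly change; a level hour inherits the previous direction
by year month day: gen byte s1 = sign(pressure - pressure[_n-1])
by year month day: replace s1 = s1[_n-1] if s1 == 0

* Number the runs of rising or falling pressure within each day
by year month day: gen int runno = 0 if _n == 1
by year month day: replace runno = cond(s1 == s1[_n-1], runno[_n-1], runno[_n-1] + 1) if _n > 1

* Keep the last observation of each run and difference the run endpoints
sort year month day runno hour
by year month day runno: gen extreme = pressure[_N]
by year month day runno: keep if _n == _N
by year month day: gen change = extreme - extreme[_n-1]

egen maxfall = min(change), by(year month day)
egen maxrise = max(change), by(year month day)

* Run endpoints are 90 100 70 80 65 90 70, so the largest fall is 100 -> 70
assert maxfall == -30
assert maxrise == 25
```

Note that after the keep step the data hold one observation per run, not one per hour.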

I hope this helps and that it works. I'd like to know if it works.
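
Since the original request was for a variable that is constant within each day of the hourly data, here is a sketch of a different technique (a running fall that resets on every rise, not the run-number method above) that keeps every hourly observation. The names fall and maxfall2 and the made-up date are assumptions for illustration, and the sketch assumes pressure is never missing. It returns the fall as a positive number, and 0 on a day with no fall.

```stata
clear
* Fictitious series from the question, placed on one made-up day
input year month day hour pressure
2004 3 1  0  90
2004 3 1  1 100
2004 3 1  2  90
2004 3 1  3  80
2004 3 1  4  70
2004 3 1  5  80
2004 3 1  6  65
2004 3 1  7  70
2004 3 1  8  90
2004 3 1  9  80
2004 3 1 10  70
end

isid year month day hour, sort

* fall = pressure lost so far in the current fall; it resets to 0 on any rise.
* A level hour (no change) does not reset it, matching the treatment above.
by year month day: gen double fall = 0 if _n == 1
by year month day: replace fall = cond(pressure <= pressure[_n-1], ///
    fall[_n-1] + pressure[_n-1] - pressure, 0) if _n > 1

* Largest running fall of the day, repeated on every hourly observation
egen double maxfall2 = max(fall), by(year month day)

assert maxfall2 == 30
assert _N == 11
```

This avoids the keep step, so no merge back into the hourly data is needed.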
-- David Kantor

Institute for Policy Studies
Johns Hopkins University
