
RE: st: help to find maximum drop in a variable


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: help to find maximum drop in a variable
Date   Fri, 5 Mar 2004 12:55:59 -0000

Thanks to Chris for very full feedback. 

The problem was 

Data: hourly barometric pressure data for many days

Variables: year month day hour pressure 
 
So during any day the pressure rises and falls.  

Aim: generate a new variable containing the maximum fall 
within a day, from a peak to a trough. This is not the
same as the daily range. 

This was my original code, which assumes that -tsspell- 
has been installed from SSC: 

egen panel = group(year month day) 
tsset panel hour
tsspell , cond(F.press < press) 
replace _spell = L._spell if L._end  
egen max = max(pressure) if _spell, by(panel _spell) 
egen min = min(pressure) if _spell, by(panel _spell) 
egen range = max(max - min), by(panel) 

-tsspell- can be installed by typing -ssc install tsspell-. 

The corrected code is 

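* one panel per day, with hour as the time variable
egen panel = group(year month day) 
tsset panel hour
* spells of falling pressure: the next hour's pressure is lower
tsspell , cond(F.press < press) 
* attach the trough (the observation just after each spell) to that spell
replace _spell = L._spell if L._end == 1  
* peak and trough of each fall, then the largest fall within each day
egen max = max(pressure) if _spell, by(panel _spell) 
egen min = min(pressure) if _spell, by(panel _spell) 
egen range = max(max - min), by(panel) 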
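
As a check, here is a minimal sketch on made-up data, not Chris's 
real data: a single invented day with six hourly readings. It assumes 
that -tsspell- is installed from SSC. By hand, the largest fall is 
from 1012 to 1008, that is 4, whereas the daily range is 
1015 - 1005 = 10, so -range- should come out as 4 if the code does 
what is intended. 

```stata
* invented data: one day, six hourly readings (illustration only)
clear
input year month day hour pressure
2004 3 1 1 1005
2004 3 1 2 1012
2004 3 1 3 1009
2004 3 1 4 1008
2004 3 1 5 1015
2004 3 1 6 1013
end

* the corrected code, unchanged apart from spelling out pressure
egen panel = group(year month day)
tsset panel hour
tsspell , cond(F.pressure < pressure)
replace _spell = L._spell if L._end == 1
egen max = max(pressure) if _spell, by(panel _spell)
egen min = min(pressure) if _spell, by(panel _spell)
egen range = max(max - min), by(panel)

* inspect the spells and the result
list hour pressure _spell max min range
```

The first fall (hours 2 to 4) and the second (hours 5 to 6) are 
separate spells, and each trough is included only because of the 
-replace- line. 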

The bug was in assuming that because _end is generated
with values 0 and 1, you can use the short-cut 

if L._end 

for 

if L._end == 1

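But the short-cut doesn't work, as L._end is missing 
for the first observation of each panel, and Stata 
treats missing as non-zero and so as true. For that 
observation _spell was therefore replaced by L._spell, 
which is also missing, and the observation dropped out 
of its spell. 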
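
A minimal sketch of the difference, using invented data in which 
_end is typed in by hand rather than produced by -tsspell-: 

```stata
* invented data: one panel, three hours; _end here is a hand-typed stand-in
clear
input panel hour _end
1 1 0
1 2 1
1 3 0
end
tsset panel hour

* L._end is missing in hour 1, 0 in hour 2 and 1 in hour 3
* missing counts as true, so this counts hours 1 and 3
count if L._end
* only the genuine case, hour 3
count if L._end == 1
```

The first -count- should report 2 and the second 1. 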

Nick 
n.j.cox@durham.ac.uk 

chris wallace
> 
> Many thanks to David, Nikos, Scott and Nick who all responded to my post
> about a colleague's request to find the maximum drop in a variable
> during a given time period.
> 
> Reading the answers was very instructive for me, especially as I tried
> and failed to solve the query before I posted.
> 
> Unfortunately, none worked "out of the box", but all were close enough that
> my colleague has now been able to work out her own solution.
> 
> In summary: 
> 
> - Nikos's appeared to work fine, but failed when there were two
> consecutive values that were equal.  This we fixed by replacing 
> by day:gen tri=-1*(change<0)+(change>0)
> with
> by day:gen tri=-1*(change<=0)+(change>0)
> 
> - Nick's was good on brevity, and neat enough not to take much figuring
> out.  But it failed when the drop began with the first observation in a
> day.  This my colleague says can be fixed by replacing
> 
> egen max = max(pressure) if _spell, by(panel _spell) 
> 
> with
> 
> gen max = pressure if _seq ==1
> 
> but I can't see why it wouldn't fail now if the maximum pressure wasn't
> the first observation of the day...
> 
> - David's answer we liked lots, particularly for all the helpful
> comments!  It appears to produce the right answer every time, but
> rounded to a whole number (although in my sample data in the email,
> pressure was all integer, in the real data it is float).  This we fixed
> by dropping the "int" from
> 
> by year month day runno: gen int extreme = pressure[_N]
> 
> - Scott's I'm afraid we got a bit lost on. It doesn't always work right,
> but we can't figure out why it fails.  (Sorry).
> 
> Thanks again to all of you for taking the time to help on this.  And
> apologies for listing the failures above like they were problems in your
> code.  In fact, all the code worked fine for the simplified data I sent
> in my original email, and the task I set was a little unfair - expecting
> you to write code in anticipation of idiosyncrasies in a real dataset I
> didn't show you!
