

From: Robert Picard <picard@netbox.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: How to define shortest possible period with 95% of observations
Date: Mon, 10 May 2010 16:28:43 -0400

Here is how I would approach this problem. I would do each year
separately; it could be done all at once, but that would complicate
the code unnecessarily. If the fire data is one observation per fire,
I would -collapse- it to one observation per day. Each observation
would then contain the number of fires that day.

The following code identifies the first instance of the shortest run
of days that includes 95% of the fires for the year. Note that the
code works even if there are days without fires (and thus no
observation for that day).

*--------------------------- begin example -----------------------
version 11

* daily fire counts; with some days without fires
clear all
set seed 123
set obs 365
gen day = _n
drop if uniform() < .1
gen nobs = _n
gen nfires = round(uniform() * 10)

* the target is a continuous run that includes 95% of all fires
sum nfires, meanonly
scalar target = .95 * r(sum)
dis target

scalar shortlen = .
gen arun = .
gen bestrun = .

* at each pass, create a run that starts at nobs == `i'
* and identify the nobs where the number of fires >= 95%
local more 1
local i 0
while `more' {
    local i = `i' + 1
    qui replace arun = sum(nfires * (nobs>=`i'))
    sum nobs if arun >= target, meanonly
    if r(N) == 0 local more 0
    else if (day[r(min)] - day[`i']) < shortlen {
        scalar shortlen = day[r(min)] - day[`i']
        qui replace bestrun = arun
        qui replace bestrun = . if nobs > r(min) | nobs < `i'
    }
}
*--------------------- end example --------------------------

Hope this helps,

Robert

On Mon, May 10, 2010 at 6:19 AM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> I don't think any trick is possible unless you know in advance the
> precise distribution, e.g. that it is Gaussian, or exponential, or
> whatever, which here is not the case.
>
> So, you need to look at all the possibilities from the interval starting
> at the minimum to the interval starting at the 5% point of the fire
> number distribution in each year.
>
> However, this may all be achievable using -shorth- (SSC). Look at the
> -proportion()- option, but you would need to -expand- first to get a
> separate observation for each fire. If that's not practicable, look
> inside the code of -shorth- to get ideas on how to proceed. Note that no
> looping is necessary: the whole problem will reduce to use of -by:- and
> subscripts.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Daniel Mueller
>
> I have a strongly unbalanced panel with 100,000 observations (=fire
> occurrences per day) that contain between none (no fire) and 3,000 fires
> per day for 8 years. The fire events peak in March and April with about
> 85-90% of the yearly total.
>
> My question is how I can define the shortest possible continuous period
> of days for each year that contains 95% of all yearly fires. The length
> and width of the periods may slightly differ across the years due to
> climate and other parameters.
>
> I am sure there is a neat trick in Stata for this, yet I have not
> spotted it. Any suggestions would be appreciated.
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
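
The -collapse- step mentioned at the top of the message is described but not shown in the example. What follows is only a minimal sketch of how it might look, assuming the raw data holds one observation per fire with a Stata daily date. The variable name firedate and the simulated 1,000 fires are assumptions made for illustration; they do not come from the original thread. The sketch produces the day, nobs, and nfires variables that the example expects, with nobs numbered within each year so that each year can be processed separately.

```stata
version 11

* HYPOTHETICAL input: one observation per fire, dated by firedate
* (assumed name; the simulated data stand in for the real fire records)
clear all
set seed 123
set obs 1000
gen firedate = mdy(1, 1, 2009) + floor(uniform() * 365)
format firedate %td

* one observation per calendar day, holding that day's fire count;
* days without fires simply have no observation
gen nfires = 1
collapse (sum) nfires, by(firedate)

* variables used by the run-finding example
gen year = year(firedate)
gen day  = doy(firedate)
bysort year (day): gen nobs = _n
```

With data in this shape, one possible way to handle several years is to -keep- a single year at a time (or loop over the values of year) before running the run-finding code.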

**Follow-Ups**:
- **Re: st: RE: How to define shortest possible period with 95% of observations** *From:* Daniel Mueller <mueller@iamo.de>

**References**:
- **st: How to define shortest possible period with 95% of observations** *From:* Daniel Mueller <mueller@iamo.de>
- **st: RE: How to define shortest possible period with 95% of observations** *From:* "Nick Cox" <n.j.cox@durham.ac.uk>
