Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Subject: st: RE: Binary time series
From: Nick Cox <email@example.com>
Date: Wed, 22 Sep 2010 15:45:04 +0100
Bob Yaffee did allude to some of the literature on irregular time series, and there's plenty more. For example, astronomers and others have a separate literature on getting spectra out of irregular series.
But if this were my problem I wouldn't go that way. My gut feeling is that with 30 data points a simple regression-like model could work quite well, and better than any time series model you care to name. Time series models seem more data-hungry even when they work.
The researcher's question appears to hinge on looking at seasonality. Month as such I imagine to be quite arbitrary and artificial for tadpoles (unless lunar cycles are important, and if they are, you would be modelling them directly). Also, if you have a parameter per month, you are spreading the information pretty thinly.
I would work with Fourier series picking up dependence on time of year and then check for error structure. There is Stata-based literature at
SJ-6-4 st0116 . . . . Speaking Stata: In praise of trigonometric predictors
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
Q4/06 SJ 6(4):561--579 (no commands)
discusses the use of sine and cosine as predictors in
modeling periodic time series and other kinds of periodic
SJ-6-3 gr0025 . . . . . . . . . . . . Speaking Stata: Graphs for all seasons
(help cycleplot, sliceplot if installed) . . . . . . . . . N. J. Cox
Q3/06 SJ 6(3):397--419
illustrates producing graphs showing time-series seasonality
which may help in one way or another. Both papers are accessible via the Stata Journal.
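As a rough illustration of the trigonometric predictors discussed in the first paper, here is a minimal sketch in Python rather than Stata, since only the arithmetic matters. It assumes each visit has a calendar date and that a year of 365.25 days is an adequate period. The function name `fourier_terms` and the visit dates are invented for illustration and are not from the data in question.

```python
import math
from datetime import date

def fourier_terms(d, harmonics=1, year_length=365.25):
    """Return [sin1, cos1, sin2, cos2, ...] for the time of year of date d."""
    day_of_year = d.timetuple().tm_yday  # 1 = 1 January
    angle = 2 * math.pi * day_of_year / year_length
    terms = []
    for k in range(1, harmonics + 1):
        # k-th harmonic: k complete cycles per year
        terms.extend([math.sin(k * angle), math.cos(k * angle)])
    return terms

# Hypothetical, irregularly spaced visit dates (not the real data)
visits = [date(2008, 1, 15), date(2008, 2, 20), date(2008, 4, 2)]
for d in visits:
    s, c = fourier_terms(d)
    print(d.isoformat(), round(s, 3), round(c, 3))
```

With one harmonic, the sine and cosine columns would enter a regression-like model as two predictors, against eleven indicators for month, which is the point about not spreading the information thinly. Irregular spacing of visits is no obstacle, because the predictors depend only on each visit's date.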
You have a response that is a proportion. For a review, see
SJ-8-2 st0147 . . . . . . . . . . . . . . Stata tip 63: Modeling proportions
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C. F. Baum
Q2/08 SJ 8(2):299--303 (no commands)
tip on how to model a response variable that appears
as a proportion or fraction
In addition, converting time of year to a circular scale might help. There is a bundle of circular statistics programs in -circstat- on SSC.
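To show what a circular scale buys you, here is a small sketch in Python (standard library only, not one of the SSC programs). It assumes dates are mapped to angles using a 365.25-day year. The function name `circular_summary` and the two example dates are hypothetical.

```python
import math
from datetime import date

def circular_summary(dates, year_length=365.25):
    """Return (mean day of year on the circle, mean resultant length)."""
    angles = [2 * math.pi * d.timetuple().tm_yday / year_length for d in dates]
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    mean_angle = math.atan2(s, c) % (2 * math.pi)  # mean direction in [0, 2*pi)
    mean_day = mean_angle * year_length / (2 * math.pi)
    resultant = math.hypot(s, c)  # near 1 = tightly clustered, near 0 = spread out
    return mean_day, resultant

# Two hypothetical dates either side of the New Year
mean_day, resultant = circular_summary([date(2008, 12, 27), date(2009, 1, 5)])
print(round(mean_day, 2), round(resultant, 3))
```

The ordinary mean of the two days of year (362 and 5) is 183.5, in the middle of the year, whereas the circular mean falls close to the turn of the year, which is where the dates actually cluster.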
At home we have tadpoles sometimes in a small pond in our garden, but I have no data to share.
I don't know what Baum 2006 is. (But then Bob Yaffee didn't even give years in his "references"....)
I am seeking advice on analysis of a time series dataset in Stata. The same
site was visited irregularly 30 times over 3 years (median interval between
visits 35 days, range 18 to 68 days). At each visit, usually 5 tadpoles (but
sometimes 6 or 9) were sampled (numbers were limited because this is an
endangered species). Different tadpoles were sampled at each visit. Each
tadpole was tested and categorised as test positive or test negative.
Apparent prevalences were 1.00 at about half of the visits and 0.00 at about
25% of visits.
The researcher's question is whether prevalence varies by month (ie Jan,
Feb, Mar etc) or by season.
The features of these data that seem important are that the errors would be
expected to be serially correlated over time, the dependent variable is
binary, prevalences of 0 and 1 were common, very few tadpoles were sampled
at each visit, and these are not panel data (ie different
tadpoles were sampled at each visit).
I have done some exploratory modelling treating prevalence as a continuous
dependent variable (using -regress-) after declaring the data to be
time-series data (with sequential visit number rather than day number as the
time variable, using -tsset-). With a null model, tests for serial
correlation (the Durbin-Watson test (-estat dwatson-), Durbin's alternative (h)
test (-estat durbinalt-), the Breusch-Godfrey test (-estat bgodfrey, lag(6)-),
the portmanteau (Q) test (-wntestq-) and the autocorrelogram (-ac-), all from Baum
2006) indicate serial correlation. In contrast, after fitting month as a
fixed effect, these tests do not support rejecting the null hypothesis that
no serial correlation exists. However, treating prevalence (a proportion) as
a continuous dependent variable (using -regress-) is inappropriate.