Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

st: RE: RE: Binary time series

From   "John Morton" <>
To   <>
Subject   st: RE: RE: Binary time series
Date   Thu, 30 Sep 2010 09:25:29 +1000

Many thanks to Robert (Yaffee) and Nick (Cox) for their excellent
suggestions on approaches to analysis of the binary time series data I
described. I now have plenty to look into and think about.

Nick, 'Baum 2006' is Baum CF (2006) An Introduction to Modern Econometrics
Using Stata, Stata Press, College Station. Apologies for not including these
details in my original posting.


Dr John Morton BVSc (Hons) PhD MACVSc (Veterinary Epidemiology)
Veterinary Epidemiological Consultant
Jemora Pty Ltd
PO Box 2277
Geelong 3220
Victoria Australia
Ph:  +61 (0)3 52 982 082
Mob: 0407 092 558

-----Original Message-----
[] On Behalf Of Nick Cox
Sent: Thursday, 23 September 2010 12:45 AM
To: ''
Subject: st: RE: Binary time series

Bob Yaffee did allude to some of the literature on irregular time series,
and there's plenty more. For example, astronomers and others have a separate
literature on getting spectra out of irregular series. 

But if this were my problem I wouldn't go that way. I've a gut feeling that
a simple regression-like model could work quite well for 30 data points, and
that any time series model you care to name would work less well. Time series
models seem more data-hungry even when they work. 

The researcher's question appears to hinge on looking at seasonality. Month
as such I imagine to be quite arbitrary and artificial for tadpoles (unless
lunar cycles are important, and if they are, you would be modelling them
directly). Also, if you have a parameter per month, you are spreading the
information pretty thinly. 

I would work with Fourier series picking up dependence on time of year and
then check for error structure. There is Stata-based literature at 

SJ-6-4  st0116  Speaking Stata: In praise of trigonometric predictors
        N. J. Cox
        Q4/06   SJ 6(4):561--579                        (no commands)
        discusses the use of sine and cosine as predictors in
        modeling periodic time series and other kinds of periodic
        responses

SJ-6-3  gr0025  Speaking Stata: Graphs for all seasons
        (help cycleplot, sliceplot if installed)
        N. J. Cox
        Q3/06   SJ 6(3):397--419
        illustrates producing graphs showing time-series seasonality

which may help in one way or another. Both papers are accessible via the
Stata Journal. 
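
As a rough sketch of the Fourier approach, assuming one observation per
visit, something like the following could be a starting point. The variable
names (visit, date, npos, ntested) are hypothetical and not from the original
posting, and a single sine-cosine pair is an assumption made because 30
visits will not support many harmonics.

```stata
* Assumed visit-level variables (hypothetical names):
*   visit    sequential visit number (1, 2, ..., 30)
*   date     Stata daily date of the visit
*   npos     number of tadpoles testing positive at the visit
*   ntested  number of tadpoles tested at the visit

* Time of year as an angle in radians (one full cycle per year)
gen double theta = 2 * _pi * doy(date) / 365.25

* First pair of Fourier terms
gen double sin1 = sin(theta)
gen double cos1 = cos(theta)

* Logit model for positives out of number tested
glm npos sin1 cos1, family(binomial ntested) link(logit)

* Joint test of the seasonal terms
test sin1 cos1

* Check for error structure: Pearson residuals in visit order
* (this ignores the irregular spacing between visits)
predict double rp, pearson
tsset visit
ac rp
```

Two parameters carry the seasonal pattern here, against eleven for month
indicators, which is the point about not spreading the information thinly.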

You have a response that is a proportion. For a review, see

SJ-8-2  st0147  Stata tip 63: Modeling proportions
        C. F. Baum
        Q2/08   SJ 8(2):299--303                        (no commands)
        tip on how to model a response variable that appears
        as a proportion or fraction
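
A minimal sketch of one way to model a proportion directly, using -glm- with
a binomial family, logit link and robust standard errors. Whether this suits
these data is a judgment call; the variable names are hypothetical, and the
sine and cosine predictors are just one possible choice of covariates.

```stata
* Assumed visit-level variables (hypothetical names):
*   date, npos, ntested  as one observation per visit

* Apparent prevalence as a proportion in [0, 1]
gen double prev = npos / ntested

* Sine and cosine of time of year as predictors
gen double sin1 = sin(2 * _pi * doy(date) / 365.25)
gen double cos1 = cos(2 * _pi * doy(date) / 365.25)

* Logit link keeps fitted values inside (0, 1); observed values of
* exactly 0 and 1 are allowed. Stata will note that the response
* has noninteger values.
glm prev sin1 cos1, family(binomial) link(logit) vce(robust)

* Fitted prevalence for each visit
predict double prevhat, mu
```

Since the number of tadpoles tested is known at each visit, specifying the
denominator with family(binomial ntested) and npos as the response is an
alternative that uses the sample sizes explicitly.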

In addition, converting time of year to a circular scale might help. There
is a bundle of circular statistics programs in -circular- on SSC. 
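
For illustration only, time of year can be put on a circular scale in degrees
and a mean direction of the positive tests calculated by hand. Variable names
are hypothetical. The result also reflects when the visits happened to fall,
so it is descriptive at best.

```stata
* Install the circular statistics programs from SSC
ssc install circular

* Assumed visit-level variables (hypothetical names): date, npos

* Time of year in degrees, 0 to 360
gen double theta = 360 * doy(date) / 365.25

* Sine and cosine components weighted by number of positives
gen double wsin = npos * sin(theta * _pi / 180)
gen double wcos = npos * cos(theta * _pi / 180)

quietly summarize wsin, meanonly
scalar S = r(sum)
quietly summarize wcos, meanonly
scalar C = r(sum)

* Mean direction in degrees, mapped to [0, 360)
scalar meandir = mod(atan2(S, C) * 180 / _pi, 360)
display "mean direction of positives (degrees): " meandir
```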

At home we have tadpoles sometimes in a small pond in our garden, but I have
no data to share. 

I don't know what Baum 2006 is. (But then Bob Yaffee didn't even give years
in his "references"....) 


John Morton

I am seeking advice on analysis of a time series dataset in Stata. The same
site was visited irregularly 30 times over 3 years (median interval between
visits 35 days, range 18 to 68 days). At each visit, usually 5 tadpoles (but
sometimes 6 or 9) were sampled (numbers were limited because this is an
endangered species). Different tadpoles were sampled at each visit. Each
tadpole was tested and categorised as test positive or test negative.
Apparent prevalences were 1.00 at about half of the visits and 0.00 at about
25% of visits. 
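
As a sketch of the data layout described above, assuming the data are held
with one observation per tadpole, they could be reduced to one observation
per visit as follows. The variable names are hypothetical.

```stata
* Assumed tadpole-level variables (hypothetical names):
*   visit     sequential visit number
*   date      Stata daily date of the visit
*   positive  1 if test positive, 0 if test negative

* One observation per visit: positives and number tested
collapse (sum) npos = positive (count) ntested = positive, by(visit date)

* Apparent prevalence at each visit
gen double prev = npos / ntested

* How often prevalence was 0, 1 or in between
tabulate prev
```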

The researcher's question is whether prevalence varies by month (i.e. Jan,
Feb, Mar etc.) or by season. 

The features of these data that seem important are that the errors would be
expected to be serially correlated over time, the dependent variable is
binary, prevalences of 0 and 1 were common, very few tadpoles were sampled
at each visit, and these are not panel data (i.e. different tadpoles were
sampled at each visit).

I have done some exploratory modelling treating prevalence as a continuous
dependent variable (using -regress-) after declaring the data to be
time-series data (with sequential visit number rather than day number as the
time variable, using -tsset-). With a null model, tests for serial
correlation (the Durbin-Watson test (-estat dwatson-), Durbin's alternative
(h) test (-estat durbinalt-), the Breusch-Godfrey test (-estat
bgodfrey,lag(6)-), the portmanteau (Q) test (-wntestq-) and the
autocorrelogram (-ac-), all from Baum 2006) indicate serial correlation. In
contrast, after fitting month as a fixed effect, these tests do not support
rejecting the null hypothesis that no serial correlation exists. However,
treating prevalence (a proportion) as a continuous dependent variable (using
-regress-) is inappropriate. 
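
The exploratory steps described above could be sketched roughly as follows,
assuming one observation per visit. The variable names are hypothetical, and
the residuals are saved first because -wntestq- and -ac- take a variable.

```stata
* Assumed visit-level variables (hypothetical names):
*   visit  sequential visit number
*   date   Stata daily date of the visit
*   prev   apparent prevalence at the visit

* Declare time series with visit number as the time variable
tsset visit

* Null model (constant only) and tests for serial correlation
regress prev
estat dwatson
estat durbinalt
estat bgodfrey, lags(6)
predict double e0, residuals
wntestq e0
ac e0

* Month as a fixed effect, then the same checks
gen byte mon = month(date)
regress prev i.mon
estat dwatson
estat durbinalt
estat bgodfrey, lags(6)
predict double e1, residuals
wntestq e1
ac e1
```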

*   For searches and help try:
