The Stata listserver

st: estimation of prevalence models with GLM

From   Steve <[email protected]>
To   [email protected]
Subject   st: estimation of prevalence models with GLM
Date   Fri, 10 Oct 2003 07:24:21 -0700

Dear StataFolk,

We are analyzing a short (36-month) time-series data set in which:

1. the dependent variable is the prevalence of a contaminant in foodstuffs over
time (the proportion of positive detections among all examinations each month),
and there is no significant autocorrelation evident in the series.
2. the independent variables are a mix of continuous and dummy variables
that also vary over the time series.

Following the Stata list thread "proportion as a dependent variable" from
July 2003, in which Roger Newson made some recommendations (July 14), we are
using -glm- with [family(binomial) link(identity) robust] to model the data.
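
As a rough sketch of the specification described above: the variable names (prev, x1, d1) are hypothetical placeholders, and entering the monthly proportion directly as the response is an assumption about how the data are set up.

```stata
* Hypothetical variables: prev = monthly proportion of positive detections,
* x1 = a continuous covariate, d1 = a dummy covariate
glm prev x1 d1, family(binomial) link(identity) robust

* Fitted means and the two residual types mentioned in question 2
predict mu_hat, mu
predict r_dev, deviance
predict r_pear, pearson
```

With the identity link, the linear predictor is the fitted proportion itself, so nothing in the model constrains mu_hat to lie between 0 and 1.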

Two questions:

1. How does one interpret the coefficients (one of the dummy variables has a
significant coefficient over 1.0)?
2. As the diagnostics with "deviance" residuals appear very strange (they
are basically clustered in three strata, with little variation within each
stratum), is this an indication of a poorly fitting model, or should we be using
Pearson or other residuals?

This, of course, opens the wider question of how to perform diagnostics when
forcing continuous data such as proportions into a binomial family model.

Would the experts be able to offer advice?

We're using Stata SE 8.1.

Many thanks,
Steve Rothenberg
