2009 Australian and New Zealand Stata Users Group meeting: Abstracts
Thursday, November 5, 2009
The impact of water supply and sanitation interventions on child health: Evidence from Demographic and Health Surveys
International Initiative for Impact Evaluation (3ie)
In this presentation, I examine how access to different types of water and
sanitation facilities, together with socioeconomic and child-specific
factors, affects the health of children living in households, using
diarrhea as the health outcome. Using multiyear cross-sectional DHS health
data, I employ quasi-experimental (matching) estimators to match
children belonging to different treatment groups, defined by water types and
sanitation facilities, with children in a control group. Quantile
regression models are used to benchmark results and to check for their
robustness. The empirical framework yields strong support that access to
improved sanitation has had a substantial impact on reducing the (predicted)
diarrhea outcomes. This is especially true among very young children (those
below 24 months of age), whose rates of diarrhea showed the largest
declines between 2001 and 2006. These estimates serve as an input into
cost-effectiveness analysis that compares the provision of increased access
to sanitation with other public health interventions in developing nations
(especially sub-Saharan Africa and South Asia) to underline its importance
in achieving Millennium Development Goals (MDGs).
The use of a quantile regression model in an occupational health and safety study
University of Newcastle
Texas A&M University
The ISO-7029: statistical distribution of hearing thresholds as a function
of age provides by gender the expected median value of hearing thresholds
relative to the median threshold at the age of 18 years and the statistical
distribution above and below the median value for the range of audiometric
frequencies from 125 Hz to 8000 Hz for populations of otologically normal
persons of given age between 18 and 70 years. Comparing hearing thresholds
of a study population has been problematic. Published studies have used a
series of t tests as the analysis method. In this paper, we aim to
present the hearing threshold data collected for the SHOAMP study and compare
the hearing thresholds with the reference population using quantile
regression.
Using data collected as part of the SHOAMP study, hearing thresholds were assessed
in both ears of 614 exposed personnel, 513 technical tradesmen, and 403 nontechnical
tradesmen using pure-tone audiometry (air conduction) at the frequencies of
0.1, 1, 2, 3, 4, 6, and 8 kHz. The results were compared with the
otologically normal population using a quantile regression model,
controlling for possible confounding variables.
The model estimates median hearing thresholds to be significantly lower than normal.
The extent of the hearing loss is substantial in that a 95% confidence band
for the median lies below the 30th percentile of the normal population for
most frequencies and ages. The largest loss occurs at 6 and 8 kHz for those
under 30 years of age.
Effects of lack of independence in meta-epidemiology
University of Otago
Meta-epidemiology is the use of the characteristics of individual reports of
randomized trials to examine the effects on meta-analyses. Traditionally
only one meta-analysis from a systematic review is included in these
studies, due to a fear of what would happen because of the lack of
independence if the same study was included in more than one meta-analysis.
We have some data on 64 meta-analyses but these come from only 18 systematic
reviews. Papers submitted from this data have been heavily criticized
because of the feared effects of this lack of independence. One suggestion
for a sensitivity analysis is to randomly select one meta-analysis from each
systematic review. As an extension to that, I chose to bootstrap the results
of interest to examine the effects of the lack of independence. Stata makes
these analyses trivial. This talk will present the results of two such
sensitivity analyses and show that the lack of independence appears to have
little effect on the interpretation of the results.
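The first sensitivity analysis described above, selecting one meta-analysis at random from each systematic review, can be sketched as follows. This is an illustration in Python rather than the Stata code used for the talk. The data values, the function names, and the choice to repeat the random selection many times and summarize each draw with a simple mean are all assumptions made for the example; the abstract does not say how the bootstrap was set up.

```python
import random
from collections import defaultdict
from statistics import mean

def one_per_review(records, rng):
    """Pick one meta-analysis at random from each systematic review."""
    by_review = defaultdict(list)
    for review_id, estimate in records:
        by_review[review_id].append(estimate)
    return {rid: rng.choice(ests) for rid, ests in by_review.items()}

def resampled_means(records, n_draws, seed=0):
    """Repeat the one-per-review selection and summarize each draw by its mean."""
    rng = random.Random(seed)
    return [mean(one_per_review(records, rng).values()) for _ in range(n_draws)]

# Hypothetical data: (review id, effect estimate from one meta-analysis)
records = [
    ("A", 0.10), ("A", 0.30),
    ("B", -0.20),
    ("C", 0.05), ("C", 0.15), ("C", 0.25),
]
draws = resampled_means(records, n_draws=200)
print(f"range of summary across draws: {min(draws):.3f} to {max(draws):.3f}")
```

If the spread of the summary across draws is small relative to its size, the choice of which meta-analysis represents each review matters little, which is the kind of conclusion the talk reports.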
Meta-analysis in animal health and reproduction: Methods and applications using Stata
Meta-analysis is a rapidly expanding area of research that has been
relatively underutilized in animal and veterinary science. It is a
quantitative, formal, epidemiological study design used to systematically
assess previous research studies to derive conclusions about that
body of research. Outcomes from a meta-analysis may include a more precise
estimate of the effect of treatment or risk factor for disease, or other
outcomes, than any individual study contributing to the pooled analysis.
The examination of variability or heterogeneity in study results is also a
critical outcome. Examples where meta-analyses have been repeated in animal
science or veterinary medicine show good consistency in estimates of effect.
Rigorously conducted meta-analyses are useful tools to improve animal
well-being and productivity. The need to integrate findings from many
studies ensures that meta-analytic research is desirable and the large body
of research now generated makes the conduct of this research feasible.
Many of the statistical methods to conduct meta-analysis are widely used. In
this presentation, we will demonstrate how Stata can provide a comprehensive
suite of programs that can be used in meta-analysis. Some detail on the
common statistical methods used, such as metan, is
presented and examples of when these have been used in studies using cattle
are provided. The post-hoc methods used to evaluate heterogeneity and
publication bias (metabias), which include the I² statistic, L’Abbé plots,
Galbraith plots, Rosenthal’s N, and influential study analysis, are
exclusively used in meta-analysis.
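As a rough illustration of the heterogeneity measures mentioned above, the sketch below computes an inverse-variance fixed-effect pooled estimate, Cochran's Q, and the I² statistic from their standard definitions. It is written in Python to be self-contained rather than in Stata, it is not the code behind metan, and the effect sizes and standard errors are invented for the example.

```python
def heterogeneity(effects, std_errors):
    """Return (pooled estimate, Cochran's Q, I-squared in percent)."""
    weights = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2 = max(0, (Q - df) / Q): share of variation attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2

# Hypothetical effect sizes and standard errors for three studies
pooled, q, i2 = heterogeneity([0.2, 0.5, 0.8], [0.1, 0.1, 0.1])
print(f"pooled={pooled:.3f}  Q={q:.2f}  I2={i2:.1f}%")
```

A large I² suggests that the studies disagree by more than sampling error alone would explain, which is why the examination of heterogeneity is treated as an outcome in its own right.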
Generating RTF files in Stata to create tables for inclusion in Word documents
University of Western Australia
When an analysis is completed, the final results need to be tabulated for
inclusion into a manuscript, and this can be a very time-consuming task.
Hence it is not surprising that in the past few years, a variety of
user-written ado-files have appeared that generate tables of results for
inclusion in Word, Excel, or LaTeX. While these may provide a solution for
many users, they can be difficult to use at first. There is no standard
format for a table of results, and consequently, any ado-file must provide
many options to give the user the ability to control the arrangement of
results in a table.
Perhaps an easier approach would be to simply generate the required table
directly from within a program. Because most users will be writing manuscripts
in Word, a table created as an RTF file can be included into a Word document
quite easily, and generating such a table from within a Stata program
requires knowledge of only a few RTF commands. While this is a “brute
force” method for generating a table, it is particularly useful if many
tables of the same layout have to be generated (in a thesis, for example),
or where an annual report has to be produced and the number and layout of
tables remains essentially the same from year to year.
In this talk, I will describe the RTF instructions required to produce a table,
and I will outline the approach that I take when generating an RTF file from
within a Stata program.
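To give a sense of how few RTF instructions a basic table needs, here is a minimal sketch. It is written in Python rather than Stata and is not the author's program; the function names, the column width, and the table contents are assumptions for illustration. It relies only on standard RTF control words: \trowd to begin a row definition, \cellx to set each cell's right edge, \intbl to mark table text, \cell to end a cell, and \row to end a row.

```python
def rtf_escape(text):
    """Escape the three characters that are special in RTF."""
    return text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")

def rtf_row(cells, col_width_twips=2000):
    """Build one table row; widths are in twips (1440 twips = 1 inch)."""
    # One \cellx per cell, each giving that cell's right-hand boundary.
    edges = "".join(f"\\cellx{col_width_twips * (i + 1)}" for i in range(len(cells)))
    # Each cell's text is flagged with \intbl and closed with \cell.
    body = "".join(f"\\intbl {rtf_escape(str(c))}\\cell " for c in cells)
    return f"\\trowd{edges}\n{body}\\row\n"

def rtf_table(rows):
    """Wrap the rows in a minimal RTF document that declares one font."""
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}\n"
    return header + "".join(rtf_row(r) for r in rows) + "}\n"

# Hypothetical results table (ASCII text only; other characters need RTF escapes)
document = rtf_table([["Variable", "Coef.", "SE"], ["age", 0.12, 0.03]])
print(document)
```

Writing the returned string to a file with an .rtf extension gives a document that a word processor can open or insert. Borders, alignment, and fonts would each need further control words.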
Automating reports with Mata and mail merge
Survey Design and Analysis
Mail merge is a convenient way of getting data into a report. It allows data
to be interspersed with ordinary report text. Graphs and tabular data can
also be inserted with mail merge.
Where standard reports are required on a regular basis or large numbers of
reports are required, automating the reporting process saves time and
reduces the chance of errors.
Stata and Mata can generate output that mail merge can read. This will be
demonstrated using a simple example, and the commands and process for doing this
with Mata will be explained.
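One common form of output that mail merge can read is a delimited text file whose first row names the merge fields. The sketch below shows that idea in Python rather than Mata; it is not the presenter's code, and the field names and records are hypothetical.

```python
import csv
import io

def merge_source(records, fieldnames):
    """Return CSV text with a header row; each column becomes a merge field."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

# Hypothetical per-report summary values, one row per report to be produced
records = [
    {"region": "North", "n": 120, "mean_score": 3.4},
    {"region": "South, coastal", "n": 95, "mean_score": 3.9},
]
text = merge_source(records, ["region", "n", "mean_score"])
print(text)
```

Values containing the delimiter are quoted automatically, so free-text fields do not break the column layout when the file is used as a merge data source.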
Complementing Stata with geovisualization
Philip S. Morrison
Victoria University of Wellington—New Zealand
Statistical agencies are increasingly recognizing the value of configuring
their data in formats that facilitate geovisualization—the representation
of data across the geographic domain. For Stata users this poses a challenge
because the present geovisualization capacity within the conventional
Stata product is quite limited.
This presentation reports on a project completed for Statistics New Zealand
on Geovisualization where graphical tools from Stata were complemented by
the geovisualization capacity afforded by GeoViz. The presentation
illustrates the returns to geovisualization via a specific case study and
considers the advantages that could potentially accrue to Stata users if
such a capacity were provided within the Stata system itself.
Working correlation structure and model selection in GEE analyses of longitudinal data
The GEE method is one of the most commonly used statistical methods in the
analysis of longitudinal data. A working correlation structure for the
repeated measures of the outcome variable of a subject needs to be specified
in this method. However, statistical criteria for selecting the best
correlation structure and the best subset of explanatory variables in GEE
have become available only recently. Maximum likelihood–based model selection
methods, such as AIC, are not applicable directly to GEE.
Based on the QIC method proposed by Pan (2001, Biometrics 57: 120–125),
we systematically developed a general
computing program to calculate the QIC value for a range of different
distributions, link functions, and correlation structures. The QIC value can
be used to select both the best correlation structure and the best subset
of explanatory variables. The program was written in Stata. In this
talk, I will introduce the QIC method and program, and I will demonstrate how to use
it to select the most parsimonious model in GEE analyses through several
examples.
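For reference, the criterion in Pan (2001) is usually written as below. The notation is an assumption of this note rather than something given in the abstract: Q is the quasi-likelihood evaluated under the independence working model at the GEE estimate obtained with working correlation R, \hat{\Omega}_I is the inverse of the model-based covariance estimate under independence, and \hat{V}_R is the robust (sandwich) covariance estimate under R. Smaller values indicate the preferred model.

```latex
\mathrm{QIC}(R) = -2\,Q\bigl(\hat{\beta}(R);\, I\bigr)
                  + 2\,\operatorname{trace}\bigl(\hat{\Omega}_I \hat{V}_R\bigr)
```

The first term plays the role of the log likelihood in AIC, and the trace term is the penalty. When the working correlation is correctly specified the trace is approximately the number of regression parameters, which is why the criterion can also be used for variable selection.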
Competing risks regression
Roberto G. Gutierrez