
# Epidemiology

Epidemiologists have relied on Stata for over 30 years because of its specialized epidemiologic commands, accuracy, and ease of use. Whether you are researching infectious diseases, investigating exposure to pathogens, or studying chronic diseases, Stata provides the data management and statistical tools to support your research. It also gives you the ability to make publication-quality graphics so you can clearly display your findings.

## Features for epidemiologists

Epidemiological tables
Want to analyze data from a prospective (incidence) study, cohort study, case–control study, or matched case–control study? Stata's tables for epidemiologists make it easy to summarize your data and compute statistics such as incidence-rate ratios, incidence-rate differences, risk ratios, risk differences, odds ratios, and attributable fractions. You can analyze stratified data too: compute Mantel–Haenszel combined estimates, perform tests of homogeneity, and standardize estimates. If you have an ordinal rather than binary exposure, you can perform a test for trend. And much more.
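To make the statistics named above concrete, here is a minimal plain-Python sketch of the arithmetic behind a risk ratio, risk difference, odds ratio, and Mantel–Haenszel combined odds ratio. It is not Stata code and not Stata's implementation; the function names and the cell layout `(a, b, c, d)` are assumptions chosen for illustration, and confidence intervals and tests are omitted.

```python
def two_by_two(a, b, c, d):
    """Measures of association from a cohort-style 2x2 table.

    Assumed (hypothetical) cell layout:
      a = exposed cases,    b = unexposed cases,
      c = exposed noncases, d = unexposed noncases.
    """
    risk_exposed = a / (a + c)
    risk_unexposed = b / (b + d)
    return {
        "risk_ratio": risk_exposed / risk_unexposed,
        "risk_difference": risk_exposed - risk_unexposed,
        "odds_ratio": (a * d) / (b * c),
    }


def mantel_haenszel_or(strata):
    """Mantel-Haenszel combined odds ratio over a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(two_by_two(30, 10, 70, 90))
    print(mantel_haenszel_or([(30, 10, 70, 90), (15, 5, 35, 45)]))
```

With a single stratum, the Mantel–Haenszel estimate reduces to the crude odds ratio, which is a quick sanity check on the weighting.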

Survival analysis
Analyze duration outcomes—outcomes measuring the time to an event such as failure or death—using Stata's specialized tools for survival analysis. Account for the complications inherent in survival data, such as sometimes not observing the event (censoring), individuals entering the study at differing times (delayed entry), and individuals who are not continuously observed throughout the study (gaps). You can estimate and plot the probability of survival over time. Or model survival as a function of covariates using Cox, Weibull, lognormal, and other regression models. Predict hazard ratios, mean survival time, and survival probabilities. Do you have groups of individuals in your study? Adjust for within-group correlation with a random-effects or shared frailty model.
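As an illustration of what estimating the probability of survival over time involves, the sketch below computes a Kaplan–Meier (product-limit) estimate in plain Python. It is an assumption-laden teaching example, not Stata's implementation: the function name is hypothetical, it handles right censoring only, and it ignores delayed entry and gaps.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate with right censoring.

    times:  follow-up time for each subject
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after t.
        at_risk -= removed
    return curve


if __name__ == "__main__":
    # Made-up data: the subject at time 2 is censored.
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Censored subjects contribute to the risk set up to their last observed time and then drop out without changing the survival estimate, which is how the estimator uses partial follow-up.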

Linear, binary, and count regressions
Fit classical ANOVA and linear regression models of the relationship between a continuous outcome, such as weight, and the determinants of weight, such as height, diet, and level of exercise. If your response is binary, ordinal, categorical, or count, don't worry. Stata has estimators for these types of outcomes too. Use logistic regression to adjust odds ratios for confounding variables. Estimate incidence rates using a Poisson model. Analyze matched case–control data with conditional logistic regression. A vast array of tools is available after fitting such models. Predict outcomes and their confidence intervals. Test equality of parameters. Compute linear and nonlinear combinations of parameters. And much more.

Survey methods
Whether your data require a simple weighted adjustment because of differential sampling rates or you have data from a complex multistage survey, Stata's survey features can provide you with correct standard errors and confidence intervals for your inferences. Simply specify the relevant characteristics of your sampling design, such as sampling weights (including weights at multiple stages), clustering (at one, two, or more stages), stratification, and poststratification. After that, most of Stata's estimation commands can adjust their estimates to correct for your sampling design.

Marginal means, contrasts, and interactions
Marginal means and contrasts let you analyze the relationships between your outcome variable and your predictors, even when your outcome is binary, count, ordinal, or categorical. For instance, after you fit a logistic regression of a disease on an exposure variable and other covariates, your marginal means may be population-averaged risks. Or you can set the covariates to interesting values to compute adjusted risks and then use contrasts to get adjusted risk differences. After fitting almost any model in Stata, you can analyze the effect of covariate interactions and easily create plots to visualize those interactions.

Power and sample size
Before you conduct your experiment, determine the sample size needed to detect meaningful effects without wasting resources. Do you intend to perform tests of means, variances, proportions, or correlations? Do you plan to fit a Cox proportional-hazards model or compare survivor functions using a log-rank test or exponential regression? Do you want to use a Cochran–Mantel–Haenszel test of association or a Cochran–Armitage trend test? Use Stata's power commands or interactive Control Panel to compute power and sample size, create customized tables, and automatically graph the relationships between power, sample size, and effect size for your planned study.
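For a sense of the calculation behind a sample-size determination, the following plain-Python sketch uses a common normal-approximation formula for a two-sided test comparing two independent proportions with equal group sizes. The function name and defaults are assumptions for illustration; Stata's power commands may use a different test or variance formula, so results need not match exactly.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate sample size per group to detect p1 versus p2.

    Uses the unpooled-variance normal approximation:
      n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance_sum / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Hypothetical planning values: 10% risk in controls, 20% in the exposed.
    for target_power in (0.8, 0.9):
        print(target_power, n_per_group(0.10, 0.20, power=target_power))
```

Looping over several power values, as in the usage lines, is a simple way to see how the required sample size grows as the target power increases.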

Causal inference
Estimate experimental-style causal effects from observational data. With Stata's treatment-effect estimators, you can use a potential-outcomes (counterfactuals) framework to estimate, for instance, the effect of a health education program in schools on teenage smoking. Fit models for continuous, binary, count, fractional, and survival outcomes with binary or multivalued treatments using inverse-probability weighting (IPW), propensity-score matching, nearest-neighbor matching, regression adjustment, or doubly robust estimators. If the assignment to a treatment is not independent of the outcome, you can use an endogenous treatment-effects estimator. And much more.

Multiple imputation
Account for missing data in your sample using multiple imputation. Choose from univariate and multivariate methods to impute missing values in continuous, censored, truncated, binary, ordinal, categorical, and count variables. Then, in a single step, estimate parameters using the imputed datasets, and combine results. Fit a linear model, logit model, Poisson model, multilevel model, survival model, or one of the many other supported models. Use the mi command, or let the Control Panel interface guide you through your entire MI analysis.
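The "combine results" step is commonly done with Rubin's rules, which pool the point estimates and add the between-imputation variability to the average within-imputation variance. The plain-Python sketch below shows that arithmetic for a single parameter; the function name is hypothetical, it assumes at least two imputations, and it is not a description of how Stata's mi command is implemented.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


if __name__ == "__main__":
    # Made-up estimates and variances from three imputations.
    print(pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))
```

The `(1 + 1 / m)` factor inflates the between-imputation variance to account for using a finite number of imputations.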

Multilevel mixed-effects models
Whether the groupings in your data arise in a nested fashion (patients nested in clinics and clinics nested in regions) or in a nonnested fashion (regions crossed with occupations), you can fit a multilevel model to account for the lack of independence within these groups. Fit models for continuous, binary, count, ordinal, and survival outcomes. Estimate variances of random intercepts and random coefficients. Compute intraclass correlations. Predict random effects. Estimate relationships that are population averaged over the random effects. And much more.

Bayesian analysis
Fit Bayesian regression models using one of the Markov chain Monte Carlo (MCMC) methods. You can choose from a variety of supported models or even program your own. Extensive graphical tools are available to check convergence visually. Compute posterior mean estimates and credible intervals for model parameters and functions of model parameters. You can perform both interval- and model-based hypothesis testing. Compare models using Bayes factors. And much more.

Dynamic documents
Stata is designed for reproducible research, including the ability to create dynamic documents incorporating your analysis results. Create Word or PDF files, populate Excel worksheets with results and format them to your liking, and mix Markdown, HTML, Stata results, and Stata graphs, all from within Stata. And much more.

There is a lot to like about Stata, but for an epidemiologist the ease of use of the svy commands is not matched in any other package.

— George Savva
School of Health Sciences, University of East Anglia

## Why Stata?

Intuitive and easy to use.
Once you learn the syntax of one estimator, graphics command, and data management tool, you will effortlessly understand the rest.

Accuracy and reliability.
Stata is extensively and continually tested. Stata's tests produce approximately 4 million lines of output.

One package. No modules.
When you buy Stata, you obtain everything for your statistical, graphical, and data analysis needs. You do not need to buy separate modules or import your data to specialized software.

You can easily write your own Stata programs and commands to share with others or to simplify your work using Stata's do-files, ado-files, and matrix programming language, Mata. Moreover, you can benefit from the thousands of Stata user-written programs.

Extensive documentation.
Stata offers 27 volumes with more than 14,000 pages of PDF documentation containing calculation formulas, detailed examples, references to the literature, and in-depth discussions. Stata's documentation is a great place to learn about Stata and the statistics, graphics, or data management tools you are using for your research.

Top-notch technical support.
Stata's technical support is known for its prompt, accurate, detailed, and clear responses. The people answering your questions have master's and PhD degrees in relevant areas of research.

## We can show you how

Stata's YouTube channel has over 100 videos, including a dedicated playlist on methodologies important to epidemiologists. The videos are also a convenient teaching aid in the classroom.

## NetCourses: Online training made simple

Learn how to perform rigorous panel-data analysis or univariate time-series analysis, all from the comfort of your home or office. NetCourses make it easy.

## For Stata users, by Stata users

Stata Press offers books with clear, step-by-step examples that make teaching easier and that enable students to learn and epidemiologists to implement the latest best practices in analysis.

Authors include Alan C. Acock; Nicholas J. Cox; Svend Juul and Morten Frydenberg; Ulrich Kohler and Frauke Kreuter; J. Scott Long and Jeremy Freese; Michael N. Mitchell; and Sophia Rabe-Hesketh and Anders Skrondal.