Medical research

Medical researchers rely on Stata for its range of biostatistical methods, reproducibility, and ease of use. Whether you are conducting basic medical research or carrying out a clinical trial, Stata provides the tools you need at every stage of your study, from power and sample-size calculations to data manipulation to analysis.

Features for medical researchers

General linear models
Fit one- and two-way models. Or fit models with three, four, or even more factors. Analyze data with nested factors, with fixed and random factors, or with repeated measures. Use ANCOVA models when you have continuous covariates and MANOVA models when you have multiple outcome variables. Further explore the relationships between your outcome and predictors by estimating effect sizes and computing least-squares and marginal means. Perform contrasts and pairwise comparisons. Analyze and plot interactions.

Linear, binary, and count regressions
Fit classical ANOVA and linear regression models of the relationship between a continuous outcome, such as weight, and the determinants of weight, such as height, diet, and level of exercise. If your response is binary, ordinal, categorical, or count, don't worry. Stata has estimators for these types of outcomes too. Use logistic regression to estimate odds ratios. Estimate incidence rates using a Poisson model. Analyze matched case–control data with conditional logistic regression. A vast array of tools is available after fitting such models. Predict outcomes and their confidence intervals. Test equality of parameters. Compute linear and nonlinear combinations of parameters.

Power, precision, and sample size
Before you conduct your experiment, determine the sample size needed to detect meaningful effects without wasting resources. Do you intend to compute CIs for means or variances or perform tests for proportions or correlations? Do you plan to fit a Cox proportional hazards model or compare survivor functions using a log-rank test? Do you want to use a Cochran–Mantel–Haenszel test of association or a Cochran–Armitage trend test? Use Stata's power command to compute power and sample size, create customized tables, and automatically graph the relationships between power, sample size, and effect size for your planned study. Or use the ciwidth command to do the same but for CIs instead of hypothesis tests by computing the required sample size for the desired CI precision. Or use gsdesign to compute stopping boundaries and the required sample sizes for group sequential designs. Instead of commands, use the interactive Control Panel to perform your analysis.
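To make the idea of a sample-size calculation concrete, here is a minimal sketch in Python (not Stata) of the textbook normal-approximation formula for comparing two means. It is an illustration under stated assumptions and not the algorithm used by Stata's power command: the function name `n_per_group` and its parameters are hypothetical, the two groups are assumed to have equal size and a common known standard deviation, and the test is two-sided. A calculation based on the t distribution would typically give a slightly larger answer.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided comparison of two means.

    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / delta)^2,
    assuming equal group sizes and a common, known standard deviation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Detect a difference of half a standard deviation with 80% power at the 5% level.
print(n_per_group(delta=0.5, sd=1.0))
```

Because the required sample size scales with the inverse square of the effect size, halving `delta` roughly quadruples the number of subjects needed per group.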

Marginal means, contrasts, and interactions
Marginal means and contrasts let you analyze the relationships between your outcome variable and your covariates, even when that outcome is binary, count, ordinal, categorical, or survival. Compute adjusted predictions with covariates set to interesting or representative values. Or compute marginal means for each level of a categorical covariate. Make comparisons of the adjusted predictions or marginal means using contrasts. If you have multilevel data and random effects, these effects are automatically integrated out to provide marginal (that is, population-averaged) estimates. After fitting almost any model in Stata, analyze the effect of covariate interactions, and easily create plots to visualize those interactions.

Multilevel mixed-effects models
Whether the groupings in your data arise in a nested fashion (patients nested in clinics and clinics nested in regions) or in a nonnested fashion (regions crossed with occupations), you can fit a multilevel model to account for the lack of independence within these groups. Fit models for continuous, binary, count, ordinal, and survival outcomes. Estimate variances of random intercepts and random coefficients. Compute intraclass correlations. Predict random effects. Estimate relationships that are population averaged over the random effects.

Meta-analysis
Combine results of multiple studies to estimate an overall effect. Use forest plots to visualize results. Use subgroup analysis and meta-regression to explore study heterogeneity. Use funnel plots and formal tests to explore publication bias and small-study effects. Use trim-and-fill analysis to assess the impact of publication bias on results. Perform cumulative and leave-one-out meta-analysis. Perform univariate, multilevel, and multivariate meta-analysis. Use the meta suite, or let the Control Panel interface guide you through your entire meta-analysis.
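As a rough illustration of what "combining results of multiple studies" means, the following Python sketch (not Stata) computes an inverse-variance weighted average, which is the standard common-effect pooled estimate. The function name `pooled_common_effect` and the example numbers are hypothetical. Random-effects and other models weight studies differently, so this should be read as one simple case, not as a description of how the meta suite works internally.

```python
from math import sqrt


def pooled_common_effect(effects, std_errors):
    """Inverse-variance (common-effect) pooled estimate and its standard error.

    Each study is weighted by 1 / se^2, so more precise studies count for more.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total_weight
    return estimate, sqrt(1.0 / total_weight)


# Two hypothetical studies reporting effect sizes with equal precision.
estimate, se = pooled_common_effect([0.2, 0.4], [0.1, 0.1])
print(estimate, se)
```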

Multiple imputation
Account for missing data in your sample using multiple imputation. Choose from univariate and multivariate methods to impute missing values in continuous, censored, truncated, binary, ordinal, categorical, and count variables. Then, in a single step, estimate parameters using the imputed datasets, and combine results. Fit a linear model, logit model, Poisson model, hierarchical model, survival model, or one of the many other supported models. Use the mi command, or let the Control Panel interface guide you through your entire MI analysis.
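The "combine results" step is conventionally done with Rubin's rules. The Python sketch below (not Stata) shows those rules for a single parameter, assuming you already have one estimate and one variance from each imputed dataset. The function name `rubin_combine` and the input values are hypothetical, and the sketch omits the degrees-of-freedom adjustment needed for confidence intervals.

```python
from statistics import mean, variance


def rubin_combine(estimates, variances):
    """Combine per-imputation estimates of one parameter using Rubin's rules.

    Returns the pooled estimate and its total variance:
    total = within + (1 + 1/m) * between, where m is the number of imputations.
    """
    m = len(estimates)
    pooled = mean(estimates)       # average of the m point estimates
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Three hypothetical imputed-data estimates of the same coefficient.
pooled, total = rubin_combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total)
```

The between-imputation term is what carries the extra uncertainty due to the missing data; ignoring it would understate the standard error.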

Survival analysis
Analyze duration outcomes—outcomes measuring the time to an event such as failure or death—using Stata's specialized tools for survival analysis. Account for the complications inherent in survival data, such as sometimes not observing the event (right-, left-, and interval-censoring), individuals entering the study at differing times (delayed entry), and individuals who are not continuously observed throughout the study (gaps). You can estimate and plot the probability of survival over time. Or model survival as a function of covariates using Cox, Weibull, lognormal, and other regression models. Predict hazard ratios, mean survival time, and survival probabilities. Do you have groups of individuals in your study? Adjust for within-group correlation with a random-effects or shared-frailty model. When you have interval-censored multiple-event data, you can fit a marginal Cox model. If you have many potential covariates, use lasso cox and elasticnet cox for model selection and prediction.

Epidemiological tables
Want to analyze data from a prospective (incidence) study, cohort study, case–control study, or matched case–control study? Stata's tables for epidemiologists make it easy to summarize your data and compute statistics such as incidence-rate ratios, incidence-rate differences, risk ratios, risk differences, odds ratios, and attributable fractions. You can analyze stratified data too—compute Mantel–Haenszel combined estimates, perform tests of homogeneity, and standardize estimates. If you have an ordinal rather than binary exposure, you can perform a test for a trend.
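For readers who want to see how these statistics are defined, here is a small Python sketch (not Stata) that computes a risk ratio, risk difference, and odds ratio from a single 2×2 table. The function name `two_by_two` and the counts are hypothetical, and the sketch returns point estimates only, without the confidence intervals or stratified analyses described above.

```python
def two_by_two(a, b, c, d):
    """Point estimates from a 2x2 table of counts.

    a: exposed cases      b: exposed noncases
    c: unexposed cases    d: unexposed noncases
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "risk_ratio": risk_exposed / risk_unexposed,
        "risk_difference": risk_exposed - risk_unexposed,
        "odds_ratio": (a * d) / (b * c),
    }


# Hypothetical cohort: 20 of 100 exposed and 10 of 100 unexposed subjects become cases.
print(two_by_two(20, 80, 10, 90))
```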

Additive models of relative risk
Determine how exposures interact to put subjects at a higher risk of experiencing an outcome of interest. For example, you might be investigating how exposure to cigarette smoke and asbestos interact to increase the risk of lung cancer. With Stata's reri command, you can measure two-way interactions in an additive model of relative risk, while accounting for other risk factors. Choose from various supported models, such as binomial generalized linear, Poisson, negative binomial, logistic, Cox, parametric survival, and interval-censored parametric and semiparametric survival models. Estimate the relative excess risk due to interaction (RERI), attributable proportion (AP), and synergy index (SI).
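The three measures named above have simple standard definitions in terms of relative risks. The Python sketch below (not Stata) applies those definitions to relative risks that are assumed to have been estimated already, each relative to subjects exposed to neither factor. The function name `additive_interaction` and the input values are hypothetical, and the sketch gives point estimates only; it does not fit a model or produce confidence intervals.

```python
def additive_interaction(rr10, rr01, rr11):
    """Additive-interaction measures from three relative risks.

    rr10: relative risk for exposure to the first factor only
    rr01: relative risk for exposure to the second factor only
    rr11: relative risk for exposure to both factors
    All are relative to subjects exposed to neither factor.
    """
    reri = rr11 - rr10 - rr01 + 1                # relative excess risk due to interaction
    ap = reri / rr11                             # attributable proportion
    si = (rr11 - 1) / ((rr10 - 1) + (rr01 - 1))  # synergy index
    return reri, ap, si


# Hypothetical relative risks for two exposures, separately and jointly.
reri, ap, si = additive_interaction(rr10=2.0, rr01=3.0, rr11=7.0)
print(reri, ap, si)
```

With no additive interaction, RERI and AP equal 0 and SI equals 1, so values above those benchmarks suggest that the joint effect exceeds the sum of the separate effects.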

Automated reporting and customizable tables
Stata is designed for reproducible research, including the ability to create dynamic documents incorporating your analysis results. Create Word or PDF files, populate Excel worksheets with results and format them to your liking, and mix Markdown, HTML, Stata results, and Stata graphs, all from within Stata. Create tables that compare regression results or summary statistics, use default styles or apply your own, and export your tables to Word, PDF, HTML, LaTeX, Excel, or Markdown and include them in your reports.

Machine learning
With machine learning via H2O, you can use ensemble decision trees—random forests and gradient boosting machines—for regression and classification. Or use lasso for sparse regression and classification. Or use Bayesian variable selection or Bayesian model averaging to select predictors in a linear model. For causal inference with machine learning, use double-selection lasso, partialing-out lasso, and double machine learning. You can use PCA or kmeans, kmedians, or hierarchical clustering for unsupervised learning. And use search to find community-contributed commands for neural networks, support vector machines, graphical lasso, text mining, and more.

Jupyter Notebook with Stata
Jupyter Notebook is widely used by researchers and scientists to share their ideas and results for collaboration and innovation. It is an easy-to-use web application that allows you to combine code, visualizations, mathematical formulas, narrative text, and other rich media in a single document (a "notebook") for interactive computing and developing. You can invoke Stata and Mata from Jupyter Notebook with the IPython (interactive Python) kernel. This means you can combine the capabilities of both Python and Stata in a single environment to make your work easily reproducible and shareable with others.

Reproducibility
Stata is the only software for data science and statistical analysis featuring comprehensive integrated versioning, which ensures that your code continues to run, unaltered, even after updates or new versions are released. There is no need to keep multiple legacy installations around to avoid breaking your system; Stata code from 40 years ago can still be run without modification. Datasets, graphs, scripts, programs, and more are 100% cross-platform and backward compatible.

I've used a lot of stat packages over the years, but I find that I'm using Stata 95% of the time now. It's wonderful! Its speed and power are much touted, but its simplicity for beginners is perhaps one of its best features.

— Rodney Hayward
University of Michigan's Schools of Medicine & Public Health, Ann Arbor VA's Center for Clinical Management Research

Check out Stata's full list of features, or see what's new in Stata 19.

Why Stata?

Intuitive and easy to use.
Once you learn the syntax of one estimator, graphics command, or data manipulation tool, you will effortlessly understand the rest.

Accuracy, reliability, and reproducibility.
Stata is extensively and continually tested. Stata's tests produce approximately 7.2 million lines of testing code. Each of those lines is compared against known-to-be-accurate results across editions of Stata and every operating system Stata supports to ensure accuracy and reproducibility, including integrated versioning for backwards compatibility.

One package. No modules.
When you buy Stata, you obtain everything for your statistical, graphical, and data analysis needs. You do not need to buy separate modules or import your data to specialized software.

Write your own Stata programs.
You can easily write your own Stata programs and commands. Share them with others or use them to simplify your work. Utilize Stata's do-files, ado-files, and Mata: Stata's own advanced programming language that adds direct support for matrix programming. You can also access and benefit from the thousands of existing Stata community-contributed programs.

Extensive documentation.
Stata offers 36 manuals with more than 19,000 pages of PDF documentation containing detailed examples, in-depth discussions, references to relevant literature, and methods and formulas. Stata's documentation is a great place to learn about Stata and the statistics, graphics, data manipulation, and data science tools you are using for your research.

Top-notch technical support.
Stata's technical support is known for its prompt, accurate, detailed, and clear responses. The people answering your questions have master's and PhD degrees in relevant areas of research.

Learn more

Would you like to see Stata in action?

Join us for one of our free live webinars. The "Ready. Set. Go Stata." webinar shows you how to quickly get started manipulating, graphing, and analyzing your data. Or go deeper in one of our special-topics webinars.

Would you like to see more?

Stata's YouTube channel has over 300 videos, including a dedicated playlist of methodologies important to medical researchers. The videos are also a convenient teaching aid in the classroom.

Visit our channel

NetCourses: Online training made simple

Quickly learn to use Stata effectively, or even learn how to perform rigorous time-series, panel-data, or survival analysis, all from the comfort of your home or office. NetCourses make it easy.

For Stata users, by Stata users

Stata Press offers books with clear, step-by-step examples that make teaching easier and that enable students to learn and medical researchers to implement the latest best practices in analysis.

Alan C. Acock

Franz Buscha

Nicholas J. Cox

Svend Juul and Morten Frydenberg

Ulrich Kohler and Frauke Kreuter

J. Scott Long and Jeremy Freese

Michael N. Mitchell

Sophia Rabe-Hesketh and Anders Skrondal

Tom M. Palmer and Jonathan A. C. Sterne (editors)
