Take full advantage of the extra information that panel data provide while simultaneously handling the peculiarities of panel data. Study the time-invariant features within each panel, the relationships across panels, and how outcomes of interest change over time. Fit linear models or nonlinear models for binary, count, ordinal, censored, or survival outcomes with fixed-effects, random-effects, or population-averaged estimators. Fit dynamic models or models with endogeneity. And much more.
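To give a feel for what a fixed-effects estimator does, here is a minimal sketch in Python of the textbook within transformation for a single regressor. It is an illustration only, not Stata's implementation; the function name `within_estimator` and the toy data are hypothetical.

```python
from collections import defaultdict


def within_estimator(panel_ids, x, y):
    """Slope of y on x after removing each panel's own mean (one regressor)."""
    # Collect the observations that belong to each panel.
    groups = defaultdict(list)
    for pid, xi, yi in zip(panel_ids, x, y):
        groups[pid].append((xi, yi))

    sxy = 0.0  # sum of (demeaned x) * (demeaned y)
    sxx = 0.0  # sum of (demeaned x) squared
    for obs in groups.values():
        x_bar = sum(xi for xi, _ in obs) / len(obs)
        y_bar = sum(yi for _, yi in obs) / len(obs)
        for xi, yi in obs:
            sxy += (xi - x_bar) * (yi - y_bar)
            sxx += (xi - x_bar) ** 2
    if sxx == 0:
        raise ValueError("x does not vary within any panel")
    return sxy / sxx


# Made-up data: two panels with different intercepts (10 and 50)
# and a common slope of 2. Panel "b" also has larger x values.
ids = ["a", "a", "a", "b", "b", "b"]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [12.0, 14.0, 16.0, 58.0, 60.0, 62.0]
slope = within_estimator(ids, x, y)
print(slope)
```

Because each panel's mean is subtracted first, the panel-specific intercepts drop out and the sketch recovers the common slope of 2. A pooled regression on the same toy data would confuse the intercept difference with the effect of x and report a slope of about 12.3.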
Handle the statistical challenges inherent to time-series data—autocorrelations, common factors, autoregressive conditional heteroskedasticity, unit roots, cointegration, and much more. Analyze univariate time series using ARIMA, ARFIMA, Markov-switching models, ARCH and GARCH models, and unobserved-components models. Analyze multivariate time series using VAR, structural VAR, VEC, multivariate GARCH, dynamic-factor models, and state-space models. Compute and graph impulse responses. Test for unit roots. And much more.
Fit classical linear models of the relationship between a continuous outcome, such as wage, and the determinants of wage, such as education level, age, experience, and economic sector. If your response is binary (for example, employed or unemployed), ordinal (education level), count (number of children), or censored (ticket sales in an existing venue), don't worry. Stata has maximum likelihood estimators—probit, ordered probit, Poisson, tobit, and many others—that estimate the relationship between such outcomes and their determinants. A vast array of tools is available to analyze such models. Predict outcomes and their confidence intervals. Test equality of parameters, or any linear or nonlinear combination of parameters. And much more.
Endogeneity and selection
When explanatory variables are related to omitted observable variables or to unobservable variables, or when there is selection bias, causal relationships are confounded and standard estimators produce inconsistent estimates of the true relationships. Stata can fit models that yield consistent estimates when there is such endogeneity or selection, whether your outcome variable is continuous, binary, count, or ordinal and whether your data are cross-sectional or panel. And much more.
Estimate experimental-style causal effects from observational data, such as the effect of a job training program on employment or the effect of a subsidy on production. Fit models for continuous, binary, count, fractional, and survival outcomes with binary or multivalued treatments using inverse-probability weighting (IPW), propensity-score matching, nearest-neighbor matching, regression adjustment, or doubly robust estimators. Fit models with exogenous or endogenous treatments. After estimation, test the overlap assumption and covariate balance. And much more.
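As a rough sketch of the idea behind inverse-probability weighting, the Python example below estimates an average treatment effect when treatment depends on one discrete covariate. It is an illustration under simplifying assumptions, not Stata's estimator: the propensity score is simply the share of treated units at each covariate level, and the function name `ipw_ate` and the data are hypothetical.

```python
from collections import defaultdict


def ipw_ate(treated, covariate, outcome):
    """Average treatment effect by inverse-probability weighting.

    The propensity score is estimated as the share of treated units
    within each level of a single discrete covariate.
    """
    counts = defaultdict(lambda: [0, 0])  # level -> [n_treated, n_total]
    for t, c in zip(treated, covariate):
        counts[c][0] += t
        counts[c][1] += 1

    total = 0.0
    for t, c, yv in zip(treated, covariate, outcome):
        p = counts[c][0] / counts[c][1]
        # Overlap: every level needs both treated and untreated units.
        if not 0.0 < p < 1.0:
            raise ValueError("overlap violated for covariate level %r" % (c,))
        total += t * yv / p - (1 - t) * yv / (1 - p)
    return total / len(outcome)


# Made-up data: outcome = 1 * treated + 2 * covariate, so the true effect is 1.
# Units with covariate == 1 are more likely to be treated.
treated   = [1, 0, 0, 0, 1, 1, 1, 0]
covariate = [0, 0, 0, 0, 1, 1, 1, 1]
outcome   = [1.0, 0.0, 0.0, 0.0, 3.0, 3.0, 3.0, 2.0]

ate = ipw_ate(treated, covariate, outcome)
naive = (sum(y for t, y in zip(treated, outcome) if t) / sum(treated)
         - sum(y for t, y in zip(treated, outcome) if not t)
         / (len(treated) - sum(treated)))
print(ate, naive)
```

In this toy dataset the naive difference in means is 2, because treated units tend to have the higher covariate value. Reweighting by the estimated propensity recovers the true effect of 1. The check on `p` is a crude version of the overlap assumption mentioned above.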
Marginal effects and marginal means
Marginal effects and marginal means let you analyze and visualize the relationships between your outcome variable and your covariates, even when that outcome is binary, count, ordinal, categorical, or censored (tobit). Estimate population-averaged marginal effects or evaluate marginal effects at interesting or representative values of the covariates. Analyze the effect of interactions. You can even trace out the marginal effect over a range of interesting covariate values or covariate interactions. You can do all of this with marginal means (sometimes called potential outcome means), even when your “mean” is a probability of a positive outcome or a count from a Poisson model. If you have panel data and random effects, the random effects are automatically integrated out to provide marginal (that is, population-averaged) effects. And much more.
GMM (generalized method of moments) can be used to fit almost any statistical model, including both exactly identified and overidentified estimation problems. Overidentified problems arise when you have endogeneity, correlation in dynamic panels, sample selection, and many other situations. With Stata, you estimate these models by simply writing your moments and enclosing the parameters in curly braces. You can easily fit cross-sectional, time-series, panel-data, or survival-data models and test your overidentifying restrictions. And much more.
Programming and matrix programming
Want to program your own commands to perform estimation, perform data management, or implement other new features? Stata is so programmable that thousands of Stata users have implemented and published thousands of user-written commands. These commands look and act just like official Stata commands. A unique feature of Stata's programming environment is Mata, a fast, compiled matrix programming language. Of course, it has all the advanced matrix operations you need. It also has access to the power of LAPACK. What's more, it has built-in solvers and optimizers to make implementing your own maximum likelihood, GMM, or other estimators easier. And you can leverage all of Stata's estimation and other features from within Mata. Many of Stata's official commands are themselves implemented in Mata. And much more.
Build multiequation models, and produce forecasts of levels, trends, rates, etc. Whether you have a small model with a few equations or a complete model of the economy with thousands of equations, Stata can help you build that model and produce forecasts. Your model can include both estimated relationships and known identities. You can easily create and compare forecasts under different scenarios, create static and dynamic forecasts, and even estimate stochastic confidence intervals. You can create your model by using an intuitive command syntax or by using the interactive forecasting control panel. And much more.
Analyze duration outcomes—outcomes measuring the time to an event such as failure or death—using Stata's specialized tools for survival analysis. Account for the complications inherent in survival data, such as sometimes not observing the event (censoring), individuals entering the study at differing times (delayed entry), and individuals who are not continuously observed throughout the study (gaps). You can estimate and plot the probability of survival over time. Or model survival as a function of covariates using Cox, Weibull, lognormal, and other regression models. Predict hazard ratios, mean survival time, and survival probabilities. Do you have groups of individuals in your study? Adjust for within-group correlation with a random-effects or shared frailty model. And much more.
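To make the role of censoring concrete, here is a minimal Python sketch of the Kaplan-Meier estimate of the survival probability over time. It is an illustration, not Stata's implementation; the function name `kaplan_meier` and the follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  : observed follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects that share the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the share of at-risk subjects who survive past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without changing S(t).
        at_risk -= leaving
    return curve


# Made-up data: five subjects, two of them censored (event == 0).
times = [2, 3, 3, 5, 8]
events = [1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored subjects count toward the number at risk until they drop out, but they never count as events. That is how the estimate uses the partial information they provide instead of discarding them.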
Fit Bayesian regression models using a Metropolis–Hastings Markov chain Monte Carlo (MCMC) method. You can choose from a variety of supported models or even program your own. Extensive graphical tools are available to check convergence visually. Compute posterior mean estimates and credible intervals for model parameters and functions of model parameters. You can perform both interval- and model-based hypothesis testing. Compare models using Bayes factors. And much more.
Whether your data require a simple weighted adjustment because of differential sampling rates or you have data from a complex multistage survey, Stata's survey features can provide you with correct standard errors and confidence intervals for your inferences. Simply specify the relevant characteristics of your sampling design, such as sampling weights (including weights at multiple stages), clustering (at one, two, or more stages), stratification, and poststratification. After that, most of Stata's estimation commands can adjust their estimates to correct for your sampling design. And much more.
Over many years, Stata has been the one constant in a perpetually changing software toolbox. For me, it remains the fastest and most thorough tool for fully understanding a complex dataset. Plus it’s the easiest tool to extend and customize. I can’t imagine working without it.
— Sean Becketti
Financial industry veteran with three decades of experience
in academics, government, and private industry
Intuitive and easy to use.
Once you learn the syntax of one estimator, graphics command, or data management tool, you will effortlessly understand the rest.
Accuracy and reliability.
Stata is extensively and continually tested. Stata's tests produce approximately 4 million lines of output.
One package. No modules.
When you buy Stata, you obtain everything for your statistical, graphical, and data analysis needs. You do not need to buy separate modules or import your data into specialized software.
Write your own Stata programs.
Using Stata's do-files, ado-files, and matrix programming language, Mata, you can easily write your own programs and commands to share with others or to simplify your work. Moreover, you can benefit from the thousands of user-written Stata programs.
Stata offers 22 volumes with more than 12,000 pages of PDF documentation containing calculation formulas, detailed examples, references to the literature, and in-depth discussions. Stata's documentation is a great place to learn about Stata and the statistics, graphics, or data management tools you are using for your research.
Top-notch technical support.
Stata's technical support is known for its prompt, accurate, detailed, and clear responses. The people answering your questions have master's and PhD degrees in relevant areas of research.
Stata's YouTube channel has over 100 videos, including a dedicated playlist on methodologies important to economists. The videos are also a convenient teaching aid in the classroom.
Stata Press offers books with clear, step-by-step examples that make teaching easier and that enable students to learn and economists to implement the latest best practices in analysis.