Data science

Data scientists rely on Stata because of its strong programming capabilities, reproducibility, extensibility, and interoperability. From data wrangling to reporting, Stata provides the tools you need to accomplish your analyses. Permissive licensing allows you to easily integrate it into your proprietary workflow.




Features for data science

Data wrangling
Scrape data from the web, import it from standard formats, or pull it in via ODBC and SQL. Match-merge, append, reshape, transpose, sort, and filter your data. Stata handles Unicode, BLOBs, regular expressions, and more, whether you are working with hundreds of thousands or even billions of data points.
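
For readers unfamiliar with the terms, two of the operations named above, a match-merge and a wide-to-long reshape, can be sketched conceptually as follows. This is a minimal illustration in plain Python, not Stata syntax; the function names, variable names, and data are all hypothetical.

```python
# Conceptual sketch only: plain Python, not Stata syntax.
# All function names, variable names, and data are made up for illustration.

def match_merge(master, using, key):
    """One-to-one match-merge: combine rows that share the same key value."""
    lookup = {row[key]: row for row in using}
    merged = []
    for row in master:
        match = lookup.get(row[key], {})  # unmatched rows keep only their own columns
        merged.append({**row, **match})
    return merged


def reshape_long(rows, key, stub, new_var):
    """Wide-to-long reshape: one output row per (key, column-suffix) pair."""
    long_rows = []
    for row in rows:
        for name, value in row.items():
            if name != key and name.startswith(stub):
                # The suffix after the stub (e.g. "2016") becomes the new variable.
                long_rows.append({key: row[key], new_var: name[len(stub):], stub: value})
    return long_rows


people = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bo"}]
income = [
    {"id": 1, "inc2016": 50, "inc2017": 55},
    {"id": 2, "inc2016": 40, "inc2017": 42},
]

merged = match_merge(people, income, "id")
long_income = reshape_long(income, "id", "inc", "year")

print(merged[0])
print(long_income[:2])
```

The merge joins each person to the income row with the same `id`; the reshape turns the two income columns into one row per person and year.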

Dynamic document generation
Use Markdown to create HTML files with embedded Stata code, output, and graphs. Automate Word, PDF, or Excel reports with both high-level export capabilities and low-level, fine-grained programmatic access, so you can produce the documents your team needs.

Visualization
Create graphs and customize them programmatically or interactively with the Graph Editor. Edits can even be recorded and "replayed" on other graphs for reproducibility. Export to industry-standard formats suitable for the web (SVG, PNG) or print (PDF, EPS, PS).

Programming
Automate your entire workflow with both scripts and full-blown programming features like classes, structures, and pointers. A unique feature of Stata's programming environment is Mata, a fast, compiled matrix programming language. It has all the advanced matrix operations you need, along with access to the power of LAPACK. It also has built-in solvers and optimizers that make implementing your own estimator easier, and you can use all of Stata's estimation features and other features from within Mata.

Interoperability
Connect to external code via Java and C++ plugins. Control Stata via OLE Automation or call it in batch mode. Write custom SQL statements to extract from or populate databases.

Statistics and modeling
Incorporate state-of-the-art statistical models and results in your workflow. Find groups in your data using unsupervised techniques including cluster analysis, principal components, factor analysis, multidimensional scaling, and correspondence analysis. Understand your groups even better using latent class analysis. When your analysis calls for supervised techniques, Stata has flexible nonparametric methods and an array of regression models from linear and logistic models to mixture models.

Stata keeps up when your data call for special techniques. You have access to methods that understand and take advantage of the structure in time series, panel data, survival data, complex survey data, spatial data, and multilevel data. Stata provides the most approachable implementations of Bayesian methods and structural equation modeling available anywhere. You can request bootstrap methods for virtually any estimator. When your analysis calls for it, Stata automates other replication methods and simulations. And much, much more.
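
As a conceptual illustration of why bootstrap methods apply to "virtually any estimator", the sketch below resamples the data with replacement, re-applies an arbitrary estimator to each resample, and reports the standard deviation of the results as a standard error. It is written in plain Python rather than Stata and makes no claim about how Stata implements its bootstrap; the function name `bootstrap_se`, the seed, and the sample data are assumptions made for the example.

```python
# Conceptual sketch only: plain Python, not Stata syntax or Stata's implementation.
# The function name, default arguments, and data are hypothetical.
import random
import statistics


def bootstrap_se(data, estimator, reps=1000, seed=12345):
    """Bootstrap standard error of `estimator` applied to `data`."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(data)
    estimates = []
    for _ in range(reps):
        # Draw n observations with replacement, then re-estimate.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return statistics.stdev(estimates)


sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 4.4]

# The same routine works for any function of the data.
se_mean = bootstrap_se(sample, statistics.mean)
se_median = bootstrap_se(sample, statistics.median)

print(f"bootstrap SE of the mean:   {se_mean:.3f}")
print(f"bootstrap SE of the median: {se_median:.3f}")
```

Because the estimator is passed in as a function, nothing in the resampling loop depends on what is being estimated, which is the property that makes the method so general.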

Reproducibility
Stata is the only software for data science and statistical analysis featuring a comprehensive version control system that ensures your code continues to run, unaltered, even after updates or new versions are released. There is no need to keep multiple legacy installations around to avoid breaking your system; Stata code from 25 years ago can still be run without modification. Datasets, graphs, scripts, programs, and more are 100% cross-platform and backward compatible.

Check out Stata's full list of features, or see what's new in Stata 15.

Why Stata?

Intuitive and easy to use.
Once you learn the syntax of one estimator, graphics command, or data management tool, you will effortlessly understand the rest.

Accuracy and reliability.
Stata is extensively and continually tested. Stata's tests produce approximately 4 million lines of output.

One package. No modules.
When you buy Stata, you obtain everything for your statistical, graphical, and data analysis needs. You do not need to buy separate modules or import your data to specialized software.

Write your own Stata programs.
You can easily write your own Stata programs and commands, to share with others or to simplify your work, using Stata's do-files, ado-files, and matrix programming language, Mata. Moreover, you can benefit from the thousands of Stata community-contributed programs.

Extensive documentation.
Stata offers 27 volumes with more than 14,000 pages of PDF documentation containing calculation formulas, detailed examples, references to the literature, and in-depth discussions. Stata's documentation is a great place to learn about Stata and the statistics, graphics, or data management tools you are using for your research.

Top-notch technical support.
Stata's technical support is known for its prompt, accurate, detailed, and clear responses. The people answering your questions have master's and PhD degrees in relevant areas of research.

Would you like to see Stata in action?

Join us for one of our free live webinars. "Ready. Set. Go Stata." shows you how to quickly get started manipulating, graphing, and analyzing your data. Or go deeper in one of our special-topics webinars.

Would you like to see more?

Stata's YouTube channel has over 250 videos, with playlists for a variety of methodologies important to data scientists. The videos are also a convenient teaching aid in the classroom.

NetCourses: Online training made simple

Quickly get started using Stata effectively, or even learn how to perform rigorous time-series, panel-data, or survival analysis, all from the comfort of your home or office. NetCourses make it easy.

For Stata users, by Stata users

Stata Press offers books with clear, step-by-step examples that make teaching easier and that enable students to learn and data scientists to implement the latest best practices in analysis.

Authors include Alan C. Acock; Nicholas J. Cox; James W. Hardin and Joseph M. Hilbe; Ulrich Kohler and Frauke Kreuter; J. Scott Long and Jeremy Freese; Michael N. Mitchell; and Sophia Rabe-Hesketh and Anders Skrondal.

© Copyright 1996–2018 StataCorp LLC