Integrated version control

Stata takes reproducibility seriously.

Stata is the only statistical software package, commercial or open source, with integrated version control that allows scripts and programs written years ago to continue to work in modern versions of the software. If you wrote a script to perform an analysis in 1985, that same script will still run and still produce the same results today. Any dataset you created in 1985, you can read today. And the same will be true in the future. Stata will be able to run anything you do today.

Unlike with other software, you do not have to keep multiple installations of old versions of Stata, hoping they will still run on a modern operating system, just to run code written years or decades ago. You can simply use modern Stata, and it will understand any old code or dataset.

Version control in Stata is seamless. Simply include a version statement at the beginning of your script or program, or prefix an individual command with version #: (where # is the Stata version the code was written for), and you will be able to run it, without modification, in any future version of Stata.

For instance, in Stata 13 (released in 2013), to compute confidence intervals (CIs) for means of normally distributed variables y1 and y2, we used to type

. ci y1 y2

and to compute CIs for proportions of binary variables z1 and z2, we used to type

. ci z1 z2, binomial

In modern Stata, we would instead type

. ci means y1 y2

and

. ci proportions z1 z2

But rest assured that the old syntax still works. All you need to do is prefix the old commands with version and the appropriate version number. In our example, we could type

. version 13: ci y1 y2
. version 13: ci z1 z2, binomial

to run the old commands, as they are, in modern Stata.
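
One way to convince yourself that the old and new spellings agree is to run both on the same data and compare the stored results. The following is only a sketch, not taken from the Stata documentation: it assumes the auto example dataset that ships with Stata (loaded with sysuse) and assumes that both forms of ci leave the interval bounds in r(lb) and r(ub).

```stata
* Sketch: compare old (Stata 13) and modern ci syntax on the same data.
* Assumes the auto example dataset and the stored results r(lb) and r(ub).
sysuse auto, clear

* Old syntax for a mean, run under version control
version 13: ci mpg
scalar lb_old = r(lb)
scalar ub_old = r(ub)

* Modern syntax for the same interval
ci means mpg

* The bounds should agree up to rounding error
assert reldif(r(lb), lb_old) < 1e-12
assert reldif(r(ub), ub_old) < 1e-12

* Same comparison for a proportion; foreign is a 0/1 variable in auto
version 13: ci foreign, binomial
scalar plb_old = r(lb)
ci proportions foreign
assert reldif(r(lb), plb_old) < 1e-12
```

If any assert fails, Stata stops with an error, so a silent run means the two syntaxes produced matching bounds.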

This works whether you type the commands interactively or use them within a script or program:

program myci
     ...
     version 13: ci y1 y2
     ...
     version 13: ci z1 z2, binomial
     ...
end

You can version-control an entire program or script by simply including the appropriate version statement at the beginning:

program myci
     version 13
     ...
     ci y1 y2
     ...
     ci z1 z2, binomial
     ...
end
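
The same idea applies to a script (do-file). The following is a minimal sketch, assuming a hypothetical file named analysis.do and the auto example dataset that ships with Stata; c(version) and c(stata_version) are system values that report the version the interpreter is currently set to and the version of Stata actually running.

```stata
* analysis.do (hypothetical file name)
version 13                  // interpret everything below as Stata 13 would

sysuse auto, clear          // example dataset shipped with Stata

display c(version)          // version the interpreter is set to (13 here)
display c(stata_version)    // version of the Stata that is actually running

ci price mpg                // Stata 13 syntax, no prefix needed
ci foreign, binomial        // foreign is a 0/1 variable in auto
```

Running the file with do analysis in a newer Stata executes both ci commands with their Stata 13 syntax. A version set inside a do-file or program is restored when that do-file or program ends, so the rest of your session is unaffected.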

No broken scripts. No broken programs. No additional effort. Stata was designed from its very first version with reproducible research in mind. We want users to be confident that years down the road, the files they used to produce a particular analysis will continue to work even if they change operating systems or computer architecture and move to a much newer version of Stata.

For more information, see The Stata Blog: Compatibility and reproducibility and [P] version.