The Stata listserver

Re: st: RE: Re: RE: Problem areas (R vs Stata)

From   James Muller
Subject   Re: st: RE: Re: RE: Problem areas (R vs Stata)
Date   Sat, 27 Aug 2005 11:37:58 +1000

Some rambling views on R vs. Stata. Read on if you're patient.

Here's why I think Stata is used widely: It has really tidy data manipulation features and is used widely (tautology, no?). Wide use implies people can help each other out, which reduces costs. Good easy-to-use flexible data manipulation features reduce the need to learn to handle a database server or use other software. Remove these things and you start using R.

R is something you have to study to learn. Bigger overhead than getting started with Stata. That said, Stata has its own `hump' in the learning curve right about where you want to start doing involved programming - this is what net courses are good for. Point is, if you want to take advantage of _either_ Stata or R, or any other decent data analysis environment, you have to spend time studying how to use it.

Following the R introductory guide will get you up and running with just as many of the beginner/intermediate functions as you would use in Stata, but the syntax gets people, I think. Compare three pairs of equivalent statements:

Stata: replace x = . if (x==0)
R: mydata$x[which(mydata$x==0)] <- NA

Stata: do "mydo"
R: source("myRbatch.R", echo=TRUE)

Stata: regress y x1 x2
R: fitted.model <- lm(blah$y ~ blah$x1 + blah$x2)
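
For anyone who wants to try the R side of these pairs, here is a minimal sketch on a made-up data frame. The values in mydata and the throwaway script file are assumptions for illustration only; nothing here comes from a real dataset.

```r
# Toy data frame standing in for the post's `mydata` / `blah` (made-up values).
mydata <- data.frame(
  y  = c(1.0, 2.1, 2.9, 4.2, 5.1, 5.8),
  x  = c(0, 3, 0, 7, 2, 5),
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(2, 1, 4, 3, 6, 5)
)

# Stata: replace x = . if (x==0)
# R is case-sensitive: the missing-value constant is upper-case NA.
mydata$x[which(mydata$x == 0)] <- NA

# Stata: do "mydo"
# Write a one-line script to a temporary file (hypothetical content), then run it.
script <- tempfile(fileext = ".R")
writeLines("z <- 1 + 1", script)
source(script, echo = TRUE)

# Stata: regress y x1 x2
# Passing data= avoids repeating the data frame name inside the formula.
fitted.model <- lm(y ~ x1 + x2, data = mydata)
print(coef(fitted.model))
```

The data= argument to lm() brings the regression call fairly close to the Stata one-liner; the remaining difference is that R returns an object you then inspect, for instance with summary(fitted.model).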

And don't even ask about importing big CSV files. Note though that there are places where R is many, many times more concise than using regular Stata language - and also that this is a completely different story now that Mata exists.

As for the Stata GUI vs. the R GUI debate, I've never really got this. What Stata GUI? A few menus, a history and a variable list - that's it. The results menu and command input certainly don't count. R handles graphical output at least as well as Stata. I personally use ESS under Emacs to use R, which is a slightly different kettle, but even using the stock GUI I don't see a major difference. I think it's a good thing that neither Stata nor R has much in the way of a GUI - people should use scripts, programs and logged input for the body of their work, avoiding pointing and clicking wherever possible.

Finally, R has very minimal data manipulation abilities. The standard recommendation on the R-help list is that users with data import problems and data manipulation problems should probably set up a database server and put all their data in there, doing the manipulation of data mostly with the server rather than with R. Now, really... Very impractical for so many people for so many reasons.

Use R if you're poor or don't care about the differences. Use Stata if everyone else does or you don't want to learn to run a database server. Use either if there aren't any issues any way. If you like programming languages learn R's; if you have Stata learn Mata. If you like challenges, do everything in C.

One thing about R is that it's free, which means that you can try it yourself with just the cost of some hair and coffee. Give it a go, really, but don't confuse the myths about R for its truths.

And finally, I have to say the R support community is bloody fantastic. Stata's is very good, but R's is really something. So never let anyone tell you R is poorly supported - that is absolutely false. Probably more netiquette involved on R-help, but that's fine.

Righto, hope that wasn't too much random walking


Nick Cox wrote:

Without wanting to be insensitive to those with incomes that mean Stata is too expensive -- which include many people in _all_ countries -- this is just not going to happen. Or so I surmise.

If you want open source Stata, you will have to recapitulate the development of Stata from scratch, which means getting a team up to speed on C programming, operating systems, low-level stuff, numerical analysis, etc., etc. And then you have to watch copyright issues.

This is because StataCorp is a company based almost totally on Stata, and they are just not going to throw their intellectual capital out into the world.

In principle, what you want could be done, as shown by the history of S-Plus and R. But that history has certain unique features, unlikely to be matched in the case of Stata.

And it would be interesting and indeed exciting if someone did it, but I doubt it.

Turn and turn about, why is the whole statistical world not using R if it is free?

I guess there are several main reasons. Here are a few:

1. The way R is set up is congenial to its developers but not to all possible users. This creates a feedback loop, as new code has to fit in with existing code. R still shows its origin in S, which is a programming language first and foremost.

2. Many users want GUIs as well. The GUI of R, as I understand it, is minor.

3. Many users want and indeed need technical support. Somewhere I saw a comment from an R developer "Our idea of technical support is that you support us", and that's fair enough. Naturally there are email lists etc. for R and people do help each other. But no one has the _duty_ to help you. For many users, that's crucial.

4. The inertia that comes from pre-existing investment in people, locally-written programs, documentation etc. to do with a particular program that is already in use in a particular

Manuel Chávez wrote:

I agree that Stata is the leader among statistical packages. I also understand
that it has to find a way to support itself. Nevertheless, for some of us in
developing countries it is often a burden not to be able to get the upgrades
and latest versions. The truth is, I believe, that Stata would be of much
greater benefit than it is now if it were open source. I'm sure that some
financial mechanism could be found to support the infrastructure, and I am
almost sure that its development is not due to copyrights; maybe a
faster and broader development could be reached in an open source format. Just
think of Stata as a vaccine: it is in some way a need.
