st: Re: R for Stata Users

From   Joseph Hilbe
Subject   st: Re: R for Stata Users
Date   Sun, 28 Feb 2010 09:05:30 -0700


I sent this message on Saturday, but it did not go through. I believe I must
have been using a rich text editor, which Statalist does not accept. I tried
again this morning, but recalled that I have Gmail, which is pure text. I am
sending it through Gmail in case AOL still does not go through. If you get
this twice, my apologies.

I am not on the Statalist, but do take the Digest, so do not get the
listings until the following day. Most of the time I try to see what has
been discussed, but sometimes I just don't have the time. Fortunately I looked
this morning.

Bob Muenchen of the Univ of Tennessee wrote a book a couple of years ago
titled "R for SAS and SPSS Users." The folks at both SPSS and SAS have
seemed to love it, once they realized that the book was aimed to help
SAS/SPSS users who also wanted to learn R. It was not written to convert
anyone from SAS/SPSS to R. Bob is a SAS user and has no intention of

The statistics editor at Springer contacted me about working with Bob for a
book to be titled "R for Stata Users." He knew that I added R code at the end
of the chapters of my then recently published "Logistic Regression Models"
(May 2009, Chapman & Hall/CRC) which - insofar as it was possible - was
aimed to produce output corresponding to the Stata examples I use throughout
the text. I initially did this to assist members of my classes with I teach Logistic Regression and Advanced Logistic
Regression, as well as a couple of other courses for them. Nearly all
"students" are professors who teach statistics courses in some discipline,
or active researchers wanting to update their knowledge of certain areas of
statistics. Many -- perhaps even most -- of these students use R, with SAS
as the next most common software of preference. Very few come to class as
Stata users. From the feedback I get, however, many of these students are so
impressed with what Stata can do that they end up as Stata users after the
class is over. They most definitely end up respecting Stata for its scope of
capabilities and ease of use. I have a 30 page tutorial on Stata as Appendix
A to help these students, and provide references to other places where they
can learn Stata, including the suite of Stata Press books. Many times I have
to tell them that there simply is no corresponding SAS, SPSS, or R function
available for some procedure we are discussing.

It is clear in Logistic Regression Models that Stata has far more modeling
capabilities in this area than are available in R. A couple of my
later chapters have no R examples at the end of the chapter, e.g., the
chapter on exact logistic regression. But there are areas in which someone
has posted a library of functions to CRAN that is not available in Stata, e.g.,
wavelets. I needed to write a NB2-NB1 hurdle model for a project a week ago.
Stata does not have a command for it, and neither I nor anyone I know has
written one, but it is available using the flexmix function in R. This does
not happen much, but it can happen to any of us.

Now - to address the questions raised. I joined the project with Bob because
it is clear from email I get, from students and other profs I relate with,
and from what I see myself, that many - perhaps most - textbooks now being
published use R for examples. R is free and is not a commercial package.
Many university stat departments are now requiring that their students learn
R. And, from what I see as a member of the editorial boards of 7 journals now,
most examples used in journals employ R.

What does this mean? Well, as a long-time committed Stata user (some 22
years now) it means that if I am going to get the most from textbooks using
R for examples, and if I am to better understand articles using R for
examples, then I want to understand the basics of R.  If I am to better help
my R-using students to understand the Stata code and examples I use in my
books, I should know R so that I can use it to teach them how to understand
Stata and the examples. In addition, there are some models that are not yet
available in Stata, but are available in R.

I didn't write the cover -- but the purpose of the book is to help Stata
users learn enough R to
1) better understand texts and journal articles employing R for examples, and
2) be better able to use R for the estimation of statistical procedures
that are currently unavailable in Stata. This includes how to set up
variables/observations, deal with missing values, and so forth.

I can't imagine anyone actually switching from Stata to R, unless they
simply have no money to purchase the software and do not have access to a
university site license. Nowhere does the book advocate such a
change. In fact, for portions of the book that I wrote, I compare Stata code
with R code for doing some operation or functions. Mostly Stata is easier  -
but sometimes not.

I myself find it much easier to use Stata than R for most commands and
operations. I too had trouble with the R "if" operator - because there isn't
any. This was difficult for me at first, but there are ways to perform the
operation that end up not so bad at all. However, Stata is more direct.

The foremost area of instruction in "R for Stata Users" is perhaps data
management. This is the area that is most difficult for Stata users trying
to interpret R code that is presented in a text or article. There are two
chapters on graphics and one on basic statistical commands, but nothing
beyond linear regression and ANOVA.

The book is NOT for Stata users who have no reason to learn R. If it were
not for me having so many students who are R users and having to present
materials aimed to teach various statistical methods, and if I did not want
to better understand texts and journal articles that use R, I would have no
reason at all for learning it. Also, I referee more articles than I have
time for, in addition to my AE responsibilities, and find that the majority
of manuscripts I get use R for their examples. In order to do a more
responsible job as referee I felt that I needed to learn R.  But that has no
bearing on what my preferred statistical package is for my own work. It is
clearly Stata - for a host of reasons. But I still find it useful to know R
as well. And that is the point of the book. The book was written for those
wanting to augment Stata, or to better understand sources that use R for
examples. I have found learning R useful at times; you may as well.

Joseph Hilbe
