


st: RE; R for Stata Users

Subject   st: RE; R for Stata Users
Date   Sun, 28 Feb 2010 10:52:51 -0500

I am not on the Statalist, but do take the Digest, so I do not get the listings until the following day. Most of the time I try to see what has been discussed; sometimes I just don't have the time. Fortunately I looked this morning.

Bob Muenchen of the Univ of Tennessee wrote a book a couple of years ago titled "R for SAS and SPSS Users". The folks at both SPSS and SAS have seemed to love it, once they realized that the book was aimed at helping SAS/SPSS users who also wanted to learn R. It was not written to convert anyone from SAS/SPSS to R. Bob is a SAS user and has no intention of changing.

The statistics editor at Springer contacted me about working with Bob on a book to be titled "R for Stata Users". He knew that I had added R code at the end of the chapters of my then recently published "Logistic Regression Models" (May 2009, Chapman & Hall/CRC), code which - insofar as it was possible - was aimed at producing output corresponding to the Stata examples I use throughout the text. I initially did this to assist members of the classes in which I teach Logistic Regression and Advanced Logistic Regression, as well as a couple of other courses. Nearly all "students" are professors who teach statistics courses in some discipline, or active researchers wanting to update their knowledge of certain areas of statistics. Many -- perhaps even most -- of these students use R, with SAS as the next most common software of preference. Very few come to class as Stata users. From the feedback I get, however, many of these students are so impressed with what Stata can do that they end up as Stata users after the class is over. They most definitely end up respecting Stata for its scope of capabilities and ease of use. I have a 30 page tutorial on Stata as Appendix A to help these students, and provide references to other places where they can learn Stata, including the suite of Stata Press books. Many times I have to tell them that there simply is no corresponding SAS, SPSS, or R function available for some procedure we are discussing.

It is clear in Logistic Regression Models that Stata has far more modeling capabilities in this area than are available in R. A couple of my later chapters have no R examples at the end of the chapter, e.g. the chapter on exact logistic regression. But there are areas in which someone has posted a library of functions to CRAN that is not available in Stata, e.g. wavelets. I needed to write a NB2-NB1 hurdle model for a project a week ago. Stata does not have a command for it, and neither I nor anyone else I know has written one, but it is available using the flexmix package in R. This does not happen much, but it can happen to any of us.

Now - to address the questions raised. I joined the project with Bob because it is clear from email I get, from students and other profs I relate with, and from what I see myself, that many - perhaps most - textbooks now being published use R for examples. R is free and is not a commercial package. Many university stat departments are now requiring that their students learn R. And, from what I see being on the editorial boards of 7 journals now, most examples used in journals employ R.

What does this mean? Well, as a long time committed Stata user (some 22 years now) it means that if I am going to get the most from textbooks using R for examples, and if I am to better understand articles using R for examples, then I want to understand the basics of R. If I am to better help my R-using students to understand the Stata code and examples I use in my books, I should know R so that I can use it to teach them how to understand Stata and the examples. In addition, there are some models that are not yet available in Stata but are available in R.

I didn't write the cover -- but the purpose of the book is to help Stata users learn enough R to 1) better understand texts and journal articles employing R for examples, and 2) to better be able to use R for the estimation of statistical procedures that are currently unavailable in Stata. This includes how to set up variables/observations, deal with missing values, and so forth.

I can't imagine anyone actually switching from Stata to R, unless they simply have no money to purchase the software and do not have access to a university site license. Nowhere does the book advocate such a change. In fact, for the portions of the book that I wrote, I compare Stata code with R code for doing some operation or function. Mostly Stata is easier - but sometimes not.

I myself find it much easier to use Stata than R for most commands and operations. I too had trouble with the R "if" qualifier - because there isn't one in the Stata sense. This was difficult for me at first, but there are ways to perform the operation that end up not so bad at all. However, Stata is more direct.
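As a rough illustration of what I mean by "ways to perform the operation" - this is only a sketch, not an excerpt from the book, and the data frame d and the variables age and income are made up for the example - logical subsetting in R can stand in for the Stata "if" qualifier:

```r
# Made-up data for illustration only; one age is missing (NA).
d <- data.frame(age    = c(25, 40, NA, 61),
                income = c(30, 52, 47, 75))

# Stata: summarize income if age < 50
# subset() keeps rows where the condition is TRUE and drops rows where it is NA.
young <- subset(d, age < 50)
mean_young <- mean(young$income)

# Stata: replace income = 0 if age < 30
# which() returns the positions where the condition is TRUE, skipping NA.
d$income[which(d$age < 30)] <- 0

print(mean_young)
print(d)
```

One caution when translating in either direction: Stata treats a missing value as larger than any number, so a condition such as "if age > 50" would include the observation with missing age in Stata, while the R code above would leave it out.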

The foremost area of instruction in "R for Stata Users" is perhaps data management. This is the area that is most difficult for Stata users trying to interpret R code that is presented in a text or article. There are two chapters on graphics and one on basic statistical commands, but nothing beyond linear regression and ANOVA.

The book is NOT for Stata users who have no reason to learn R. If it were not for me having so many students who are R users and having to present materials aimed to teach various statistical methods, and if I did not want to better understand texts and journal articles that use R, I would have no reason at all for learning it. Also, I referee more articles than I have time for, in addition to my AE responsibilities, and find that the majority of manuscripts I get use R for their examples. In order to do a more responsible job as referee I felt that I needed to learn R. But that has no bearing on what my preferred statistical package is for my own work. It is clearly Stata - for a host of reasons. But I still find it useful to know R as well. And that is the point of the book. The book was written for those wanting to augment Stata, or to better understand sources that use R for examples.

Joseph Hilbe
