Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: "Muenchen, Robert A (Bob)" <muenchen@utk.edu>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: R for Stata Users
Date: Sun, 28 Feb 2010 13:30:13 -0500

Dear Statalisters,

As Joe mentioned, our goal for this book was to help make it as painless as possible (no small feat!) for Stata users to learn to use enough R to enable them to do things that Stata doesn't yet do. When I learned R, the toughest time I had was with its terminology and translating what I knew about other stat packages into the rather alien world of R. We hope that by explaining R using Stata terminology, it will be easier to learn.

For example, R has several data structures, one of which is a "data frame." For a beginning understanding of that, it's essentially a Stata data set. However, eventually we need a more precise R-based definition: it's a list of vectors of the same length. But what's a list? How does R define length? With or without missing values? Without a doubt, the hardest thing about writing this book was choosing the order in which to introduce R's terminology; it is hard to define one term without using a couple of others!

David Airey raised an excellent question: if you need R to do the odd selection of things Stata does not yet do, why bother to write a book that focuses on the basics (data management, graphics, stats)? You already know how to do those things in Stata, and you're likely to continue to do so! The answer is that it's hard to know where to draw the line for a particular task. For example, to do a model in R, you might do something like this:

    library("Hmisc")                   # Contains stata.get function.
    library("OtherLibrariesYouNeed")   # If you need any.
    mydata <- stata.get("mydata.dta")  # imports your Stata file
    mymodel <- TheFunctionYouNeed( y ~ x1+x2, data=mydata )
    summary(mymodel)

Now that you've done your model, you're likely to want to do some diagnostic plots. All Stata users know that it offers superb publication-quality plots, so you could pop the data back to Stata and do them there.
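The data frame description earlier in this message (a list of vectors of the same length) can be sketched in a few lines of base R. This is only an illustration: the data set and its variable names are made up, and nothing beyond base R is assumed.

```r
# Hypothetical data; the variable names are made up for illustration.
mydata <- data.frame(
  id     = c(1, 2, 3, 4),
  income = c(52000, NA, 61000, 47000)
)

is.list(mydata)             # TRUE: a data frame is a list underneath
length(mydata)              # 2: the number of variables (list elements), not rows
nrow(mydata)                # 4: the number of observations
sapply(mydata, length)      # each variable is a vector of length 4

# length() counts missing values; counting non-missing values is a separate step.
length(mydata$income)       # 4
sum(!is.na(mydata$income))  # 3
```

So "length" applied to the data frame counts variables, while "length" applied to one variable counts observations, missing or not.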
Or perhaps you might want to stay in R and run this:

    plot(mymodel)

That will get you several nice diagnostic plots, but it brings up a whole new area. How much do you want to do in R before returning to Stata? Now, when you're done modeling, you'll probably want to do the plots you want to publish in Stata. That'll save you the effort of learning to control titles, etc. in R. There are similar decisions to make in every analysis. There are simple things R will do for you to save you hopping back and forth.

I was greatly amused by the R quote from Elan Cohen, "It's easy to do difficult things but hard to do simple things." There are cases where that is indeed true. We actually spend an entire chapter on the simple act of selecting variables! Some R users would argue that, since you use the same methods to select variables as you use to select observations, it's not much added effort for a more flexible approach. While we mention that in the book, I still prefer Stata's much simpler approach. How much flexibility do you need for this simple task?

On the other hand, it is often R's help files that make it appear that simple things are difficult. For example, the command to print a file is described in the help file as: "print prints its argument and returns it invisibly (via invisible(x)). It is a generic function which means that new printing methods can be easily added for new classes." What's that invisible thing about? Does it display the data or not? However, using it is as simple as:

    print(mydata)

The R help file on the simple act of sorting is, to me, fairly incomprehensible. But to do it is simple using R's "order" function. There is a "sort" function, but it doesn't do what a Stata user would think. The differences in terminology can be very confusing at first!

It has been a pleasure working on this book with a pro like Joe Hilbe. I hope that we have succeeded in our goal of making R easier to learn for Stata users.
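The points above about sorting, selecting, and printing can be sketched in base R as follows. This is a minimal illustration, not code from the book: the data set and all variable names are made up.

```r
# Hypothetical data; the variable names are made up for illustration.
mydata <- data.frame(
  id    = c(3, 1, 2),
  score = c(70, 90, 80)
)

# sort() takes one vector and returns only its sorted values.
sort(mydata$score)                    # 70 80 90

# order() returns the row positions that would sort the vector. Using them
# as a row index sorts the whole data set, which is what a Stata user
# usually means by sorting.
sorted <- mydata[order(mydata$id), ]

# The same bracket syntax selects observations (before the comma)
# and variables (after the comma).
high     <- mydata[mydata$score >= 80, ]   # observations with score >= 80
id_score <- mydata[, c("id", "score")]     # variables id and score

# print() does display its argument. "Invisibly" only means the value it
# returns is not displayed a second time.
returned <- print(sorted)
identical(returned, sorted)           # TRUE
```

Seen this way, sort() is closer to sorting a single column in isolation, while indexing with order() keeps each observation's values together.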
If you're curious, the Table of Contents and general coverage are very similar to those of "R for SAS and SPSS Users," which you can see at:

http://www.amazon.com/SAS-SPSS-Users-Statistics-Computing/dp/0387094172/ref=sr_1_1?ie=UTF8&s=books&qid=1267377061&sr=8-1

You can also download all our R and Stata programs from http://r4stats.com and see how they compare. Keep in mind though that we often do far more in the R programs than we do in Stata, so when the programs are longer, that doesn't mean R was more complex on that subject. But for selecting variables, the Stata program was shorter for a very good reason: it's much easier.

Cheers,
Bob

=========================================================
Bob Muenchen (pronounced Min'-chen), Manager
Research Computing Support
Voice: (865) 974-5230
Email: muenchen@utk.edu
Web: http://oit.utk.edu/research,
News: http://oit.utk.edu/research/news.php
Feedback: http://oit.utk.edu/feedback/
=========================================================

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Airey, David C
> Sent: Friday, February 26, 2010 9:43 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: re: st: R for Stata Users
>
> .
>
> Scott Merryman posted 1/14/10 with no replies:
>
> > R for Stata Users
> > Series: Statistics and Computing
> > Muenchen, Robert A., Hilbe, Joseph M.
> >
> > 2010, Approx. 565 p., Hardcover
> >
> > ISBN: 978-1-4419-1317-3
> >
> > Not yet published. Available: March 21, 2010
> >
> >
> > http://www.springer.com/statistics/computational/book/978-1-4419-1317-3
>
> Apparently Stata has "jargon", and R has "formal" terminology. R also has "publication quality graphs". Why does everyone keep saying this as if these are somehow unavailable in other packages? It is easier to make a crappy graph in R than Stata in my hands at least.
> Also, if I were a professional Stata user, why the hell would I want to know how to do basic statistics and basic graphs in R? There are now 900 books describing such. Rather I might want to know how to use R to extend Stata with packages unavailable in Stata and outside my expertise to do so, period.
>
> Maybe one of the authors can comment on the title.
>
> --snip--
> * Read data from various types of text files and Stata data sets
> * Manage your data through transformations, recodes, and combining data sets from both the add-cases and add-variables approaches and restructuring data from wide to long formats and vice versa
> * Create publication quality graphs including bar, histogram, pie, line, scatter, regression, box, error bar, and interaction plots
> * Perform the basic types of analyses to measure strength of association and group differences and be able to know where to turn to cover much more complex methods
>
> Stata is the most flexible and extensible data analysis package available from a commercial vendor. R is a similarly flexible free and open source package for data analysis, with over 3,000 add-on packages available. This book shows you how to extend the power of Stata through the use of R. It introduces R using Stata terminology with which you are already familiar. It steps through more than 30 programs written in both languages, comparing and contrasting the two packages' different approaches. When finished, you will be able to use R in conjunction with Stata, or separately, to import data, manage and transform it, create publication quality graphics, and perform basic statistical analyses.
>
> A glossary defines over 50 R terms using Stata jargon and again using more formal R terminology. The table of contents and index allow you to find equivalent R functions by looking up Stata commands and vice versa.
> The example programs and practice datasets for both R and Stata are available for download.
>
> Robert A. Muenchen is the author of the book, R for SAS and SPSS Users, and is a consulting statistician with 29 years of experience. He has served on the advisory boards of SAS Institute, SPSS Inc., and the Statistical Graphics Corporation. He currently manages Research Computing Support at The University of Tennessee.
>
> Joseph M. Hilbe is Solar System Ambassador with NASA/Jet Propulsion Laboratory, California Institute of Technology, an adjunct professor of statistics at Arizona State, and emeritus professor at the University of Hawaii. He is a Fellow of the American Statistical Association and elected member of the International Statistical Institute. Hilbe was the first editor of the Stata Technical Bulletin, (later named the Stata Journal) and is author of a number of textbooks, including Logistic Regression Models and Negative Binomial Regression.
>
> Written for
> Professional/practitioner
> --snip--
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/

References: re: st: R for Stata Users, from "Airey, David C" <david.airey@Vanderbilt.Edu>
