Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

st: RE: Can you confirm these Stata limitations?

From   Nick Cox
Date   Sun, 5 Sep 2010 19:40:47 +0100

I won't comment on speed of Mac graphs, as many know more about that than I do. 

Q1. A common step early in our analysis is to collapse single-trial data into mean values per condition. Is it indeed true that the collapse command can only handle 8 variables at a time? This means that instead of being able to create a single table with all the variables' means, I'd have to create multiple tables, 8 variables at a time, and then join them.

NJC >>> Absolutely not. I don't think there is any upper limit. If there is, it is enormously more than 8. 
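
As a quick check, here is a minimal sketch that collapses nine variables in a single call. It uses the auto dataset that ships with Stata as a stand-in for the trial data, so the variable names and the by() grouping are illustrative only, not the poster's.

```stata
* Stand-in data: the auto dataset that ships with Stata
sysuse auto, clear

* Means of nine variables at once, one row per level of foreign
collapse (mean) price mpg headroom trunk weight length turn displacement gear_ratio, by(foreign)

list
```

With real data, by() would presumably name the condition, session and subject identifiers, which gives the table of means in one step.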

Q2. We often calculate derived variables based on a formula. However, the formula can differ slightly for different groups or conditions. Can a complex "if" statement be used at the end of a formula, or is there the equivalent of a "case..." command that could be part of a formula? For example:

gen int outcome = [ 1 if (peakVel > 10 & condition == "easy"); 2 if (peakVel > 50 & condition == "difficult") ]

NJC >>> Various ways to do it. Here's one: 

gen int outcome = (peakVel > 10 & condition == "easy") + 2 * (peakVel > 50 & condition == "difficult") 
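
The one-liner above yields 0 when neither condition holds. Two other ways are sketched below on a tiny made-up dataset (the four observations are invented for illustration; peakVel and condition are the names from the question). One caveat: Stata treats numeric missing values as larger than any number, so peakVel > 10 is true when peakVel is missing. The !missing() guard is an addition to handle that.

```stata
* Invented example data, using the variable names from the question
clear
input peakVel str9 condition
12 "easy"
5  "easy"
60 "difficult"
20 "difficult"
end

* Way 1: start from a default and replace case by case
gen int outcome = 0
replace outcome = 1 if peakVel > 10 & condition == "easy" & !missing(peakVel)
replace outcome = 2 if peakVel > 50 & condition == "difficult" & !missing(peakVel)

* Way 2: nested cond(), which plays the role of a "case" expression
gen int outcome2 = cond(peakVel > 10 & condition == "easy" & !missing(peakVel), 1, ///
                   cond(peakVel > 50 & condition == "difficult" & !missing(peakVel), 2, 0))

list
```

Both variables come out as 1, 0, 2, 0 for the four observations. Note that /// continues a line in a do-file; typed interactively, the command would go on one line.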

Q3. Is it true that there is no option for error bars in a continuous vs. categorical plot, i.e. it is necessary to convert the category into a numerical variable in order to plot error bars? I found the command serrbar, which removes some of the steps required in the usual workarounds (calculating error value is enough; high and low values are no longer explicitly needed), but it only works with a continuous-variable x axis. This is a minor inconvenience (the much more lamented inconvenience, for users of JMP, is not being able to have the graph show standard error on a mean plot without having to explicitly calculate it).

NJC >>> -serrbar-'s just an old command still hanging around from early days. You can plot error bars how you like using -twoway-. There are several alternative commands. By the way, Stata doesn't really distinguish categorical and continuous variables, except for factor variable notation, which came in Stata 11. 
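
Here is a rough sketch of one -twoway- approach, again with the auto dataset as a stand-in: compute the mean and standard error per category, then overlay -rcap- and -scatter-. The names mean_mpg, sd_mpg, se_mpg, upper and lower are made up for the example. A string category such as condition would first need something like -encode condition, gen(cond_num)- so that the x axis is a labelled numeric variable.

```stata
* Stand-in data: the auto dataset that ships with Stata
sysuse auto, clear

* One row per category, with mean, SD and count of mpg
collapse (mean) mean_mpg = mpg (sd) sd_mpg = mpg (count) n = mpg, by(foreign)

* Standard error of the mean and the bar limits
gen se_mpg = sd_mpg / sqrt(n)
gen upper  = mean_mpg + se_mpg
gen lower  = mean_mpg - se_mpg

* Capped error bars with the means on top; value labels on the x axis
twoway (rcap upper lower foreign) (scatter mean_mpg foreign), ///
    xlabel(0 1, valuelabel) xscale(range(-0.5 1.5))            ///
    ytitle("Mean mpg +/- 1 SE") legend(off)
```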

Q4. Is there any way to create a graph that shows individual values overlaid on top of mean values? For example, one that has the following two elements on a single graph, for variables vel, subj, group (one value of vel for each subj; several subj per group; several groups):

NJC >>> Sounds easy enough. More than one way to do it. 
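
One possible way, sketched with the auto dataset as a stand-in (mpg playing the role of vel, foreign the role of group, one car per row like one subj per row): put the group mean alongside the individual values with -egen-, then overlay two scatters. The names mean_mpg and tag are made up for the example.

```stata
* Stand-in data: the auto dataset that ships with Stata
sysuse auto, clear

* Group mean repeated on every row of its group
egen mean_mpg = mean(mpg), by(foreign)

* Flag one observation per group so each mean is drawn once
egen tag = tag(foreign)

* Individual values as hollow circles, group means as large diamonds
twoway (scatter mpg foreign, msymbol(Oh))                         ///
       (scatter mean_mpg foreign if tag, msymbol(D) msize(large)), ///
    xlabel(0 1, valuelabel) xscale(range(-0.5 1.5))                ///
    legend(order(1 "Individual values" 2 "Group mean"))
```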


Pietro Mazzoni

It's my first time writing to this list, which is a very helpful forum. I could use some help in clarifying certain features of Stata that seem to be limitations, at least in the style of analysis I would like to implement. I am deciding whether to have my whole lab switch from JMP to Stata for our data analysis due to the high cost of JMP. I will avoid the debate about virtues and faults of an interactive, GUI-based program like JMP. I see the value and power of Stata's approach, and cost considerations may be forcing us to switch regardless. But the following limitations are making it difficult to convince my postdocs (and, frankly, me) that using Stata will not, even after a transition period, take orders of magnitude more time for visualization of our data. I apologize if these questions are elementary. I read through "A Gentle Introduction to Stata" and searched the Web, but could not find these answers.

I am using Stata 10 for Mac. The data is usually a table of movement variables (duration, peak velocity, etc) for each of several trials in motor control experiments. There are several trials per condition, several conditions per session, a few sessions per subject, and several subjects per group. We usually work with two main tables: the one just described (to look at trial-to-trial time course of variables within conditions) and a calculated table that has the mean values of all these variables for each condition.

5. Stata 10 takes an inordinately long time (several seconds) to generate a graph on a Mac, even on a late-model machine, and the time grows quickly (to tens of seconds) with the number of elements that are added to the graph. Does anyone know if Stata 11 is faster at drawing graphs on a Mac?
