

From: wgould@stata.com (William Gould, StataCorp LP)
To: statalist@hsphsun2.harvard.edu
Subject: Re: RE: st: Converting a SAS datastep to Stata
Date: Thu, 16 Dec 2010 16:10:36 -0600

I wrote,

WG> [...]that is what I would do, probably. With Mata, I can go
WG> through the observations one at a time just as SAS does.

Daniel Feenberg <feenberg@nber.org> replied,

DF> Do you mean a "for" loop over observations?
DF> [...]
DF> Wouldn't that structure be subject to the complaint you voiced
DF> about explicitly looping over observations? [...] If that
DF> doesn't apply to Mata (perhaps because Mata is pseudo-compiled)
DF> it would be very attractive.

The stricture does not apply to Mata.

More correctly, I never recommend explicitly looping over observations if you can avoid it. That applies to Mata, and it applies to languages other than Stata and Mata, too, if the language provides an alternative method. Mata, however, is faster than Stata, and explicitly looping over the observations in Mata often produces acceptable performance.

If you were going to use Mata and explicitly loop over observations, I would recommend against using views.

In this case, however, I can think of a way to write the procedure without looping over the data:

1. Put the data in year order, so all 1973 observations are together, all 1974 observations are together, etc. Do that in Stata.

2. In Mata, construct a view onto the data.

3. Use function [M-5] panelsetup() to obtain the beginning and ending indices of each year.

4. For each value of year,

   a. Extract from the view matrix the submatrix for the year using range subscripts [|#,# \ #,#|]; see [M-2] subscripts. Store the result in a regular matrix.

   b. Pass said matrix to the year-specific Mata subroutine you write to make the calculation.

   c. In the year-specific subroutine, do not loop through the observations; instead use the appropriate colon operators; see [M-2] op_colon.

5. Now slam in one swoop the newly replaced values of the variables back into the view using the same range subscripts [|#,# \ #,#|] you used when extracting the submatrix. This time, the range subscripts will appear to the left of the equal-sign assignment operator.
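The five steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the original poster's actual calculation: the toy dataset, the variable names year, income, and tax, the function names year_calc() and apply_by_year(), and the tax rule itself are all hypothetical. The Mata functions st_view(), panelsetup(), tokens(), rows(), and cols() are used as documented in [M-5].

```stata
* Toy data (hypothetical); replace with the real dataset
clear
input year income
1973  8000
1973 12000
1974 20000
1974  9000
end
generate double tax = .

* Step 1: put the data in year order, in Stata
sort year

mata:
// Hypothetical year-specific subroutine (steps 4b and 4c).
// X holds one year's rows: column 1 = income, column 2 = tax.
// No loop over observations; colon operators work on whole columns.
real matrix year_calc(real scalar year, real matrix X)
{
    real matrix R
    real scalar rate

    R    = X
    rate = (year == 1973 ? 0.20 : 0.25)    // illustrative rule only
    // tax = rate * (income - 10000) where income > 10000, else 0
    R[., 2] = rate :* (X[., 1] :> 10000) :* (X[., 1] :- 10000)
    return(R)
}

void apply_by_year(string scalar yearvar, string scalar varlist)
{
    real matrix    V, info, X
    real colvector yr
    real scalar    i

    // Step 2: views onto the year variable and the variables to update
    st_view(yr, ., yearvar)
    st_view(V, ., tokens(varlist))

    // Step 3: first and last row of each year (data must be sorted)
    info = panelsetup(yr, 1)

    // Step 4: one pass per year, not per observation
    for (i = 1; i <= rows(info); i++) {
        // 4a: copy the year's rows out of the view into a regular matrix
        X = V[|info[i,1], 1 \ info[i,2], cols(V)|]
        // 4b: year-specific calculation
        X = year_calc(yr[info[i,1]], X)
        // Step 5: write the block back through the same range subscripts
        V[|info[i,1], 1 \ info[i,2], cols(V)|] = X
    }
}
end

mata: apply_by_year("year", "income tax")
list
```

The only explicit loop is over years, so its length is the number of distinct years rather than the number of observations. Because the range subscripts appear on the left of the assignment, the new values are stored in the Stata dataset itself rather than in a copy.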
There are other approaches you could use, but what I outlined would be very fast.

All of that said, you may very well get adequate performance using Mata and looping over the observations. It is not that what I just suggested would take longer to code than the explicit looping solution; it is merely that it assumes more familiarity with Mata and its advanced features. When breaking into Mata for the first time, it is usually best to stay with approaches with which you are familiar. One of the good features of Stata is that those approaches usually work well.

-- Bill
wgould@stata.com

**Follow-Ups**: **Re: RE: st: Converting a SAS datastep to Stata**, *From:* Austin Nichols <austinnichols@gmail.com>
