Statalist



st: R Array [was: Mata for data management]


From   "Scott Merryman" <scott.merryman@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: R Array [was: Mata for data management]
Date   Fri, 1 Feb 2008 09:02:25 -0600

Thanks.

I guess, more concretely, what can be done (or performed more
efficiently) with an R array that cannot be done with Stata (-foreach-
or -forv-)?

Scott


On Jan 31, 2008 9:44 PM, Gabi Huiber <ghuiber@gmail.com> wrote:
> Oops. Thank you, Scott.
>
> An array is a general data object. It's a vector when indexed by one
> subscript, a matrix when indexed by two subscripts, or it can be
> indexed by more than two subscripts. It can take numeric and character
> elements. You can think of a numeric array A(i,j,k) as a list of i
> matrices of (j,k) size. The ability to take non-numeric elements is
> useless in statistics, but it's helpful in general data management.
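
To make the "list of i matrices of (j,k) size" picture concrete, here is a minimal Mata sketch that imitates it with a vector of pointers. This is only an illustration under stated assumptions: the name A, the sizes (3 slices of 2 x 2), and the fill values are hypothetical, and this is not how R stores arrays internally.

```mata
mata
// A[i] points to the i-th (j,k) slice; sizes here are illustrative only
A = J(3, 1, NULL)
for (i = 1; i <= 3; i++) {
    A[i] = &(J(2, 2, i))    // slice i is a 2 x 2 matrix filled with the value i
}
(*A[2])[1, 2]               // element (i=2, j=1, k=2) of the imitated array
end
```

Each slice is dereferenced with *A[i] before the usual [j,k] subscripts are applied, so the three subscripts are split across two steps rather than written as a single A[i,j,k].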
>
> In Stata or SAS we think of data sets as tables with as many columns
> as variables and as many rows as the largest number of non-missing
> observations. This works for statistical analysis. General-purpose
> programming languages (judging by the two I dabble in) seem to want
> you to think of your data in terms of data objects -- scalars,
> vectors, matrices, lists, etc. R is a statistical analysis programming
> environment, but it stayed close to this general-purpose way of
> dealing with data; maybe because its underlying language, S, was
> invented by a computer scientist?
>
> Gabi
>
>
> On Jan 31, 2008 10:14 PM, Scott Merryman <scott.merryman@gmail.com> wrote:
> > On Jan 31, 2008 8:48 PM, Gabi Huiber <ghuiber@gmail.com> wrote:
> > > I'm trying to cheat and speed things up a bit when dealing with a
> > > bunch of files with names such as fileYYYYMMDD.dta. I could collect
> > > the numeric part of the names in a column vector that starts with the
> > > initial values a=J({potential number of files}, 1,0). But there is a
> > > fair chance that my YYYYMMDD succession has gaps, so at the end of the
> > > process this column vector will have some zeroes.
> > >
> > > I would like to do this:
> > >
> > > mata
> > > a=sort(a,1)
> > >
> > > Then drop all the zero elements of a, and end up with a shorter
> > > vector. But I can't find anything like "drop rows" in the Mata book or
> > > Google. Any ideas?
> >
> > -select()- ?
> >
> > mata
> > A = (1,2,3,4,0,5,6,0,0,7)
> > A2 = select(A, A:>0)    // keep only the columns of A where A > 0
> > A2
> > end
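
The same -select()- idea applies to the column vector described in the original question. The sketch below assumes a short vector of hypothetical YYYYMMDD values in which 0 marks an unused slot; it sorts the vector first and then drops the zero rows.

```mata
mata
// hypothetical dates collected from file names; 0 marks an unused slot
a = (20080103 \ 0 \ 20080101 \ 0 \ 20080102)
a = sort(a, 1)            // ascending sort, so the zeros come first
a = select(a, a :!= 0)    // keep only the rows holding a nonzero date
a
end
```

Because the selection vector a :!= 0 is a column vector with as many rows as a, -select()- keeps or drops rows rather than columns.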
> >
> >
> > > Generally, some R-like way to deal with arrays would be nice to have
> > > in Mata or Stata.
> >
> > How do R arrays work?
> >
> > Scott
> >
>
> > *
> > *   For searches and help try:
> > *   http://www.stata.com/support/faqs/res/findit.html
> > *   http://www.stata.com/support/statalist/faq
> > *   http://www.ats.ucla.edu/stat/stata/
> >


