I am currently working on a collection of Mata functions that
form the basis of a rather extensive data analysis procedure
involving a lot of nested looping and simulation with many
replications (1000+). This requires my Mata code to be as
efficient as possible.
I have a recurring problem: how to easily select/copy the rows
(or columns) of a large matrix that match a specific criterion.
It is easy enough to tag the rows you need using a logical
expression. A very simplified example:
: x = (1, ., 0, ., 1, 0)'
: (x , x :!= .)
       1   2
    +---------+
  1 |  1   1  |
  2 |  .   0  |
  3 |  0   1  |
  4 |  .   0  |
  5 |  1   1  |
  6 |  0   1  |
    +---------+
But what is the most efficient way of getting the non-missing
rows, either in their original order (left) or sorted (right):
       1                1
    +-----+          +-----+
  1 |  1  |        1 |  0  |
  2 |  0  |   or   2 |  0  |
  3 |  1  |        3 |  1  |
  4 |  0  |        4 |  1  |
    +-----+          +-----+
I suppose you could:
(a) loop through the rows, selecting the relevant ones;
(b) call Stata, e.g. -keep if !missing(var1)-; or
(c) mark the rows you don't want as missing, sort, and
select the non-missing rows of the matrix.
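To make option (a) concrete, here is a minimal sketch. The function name keeprows() is made up for this illustration and is not an existing Mata function. The idea is to loop only to collect the row numbers, and then copy all the selected rows with a single subscripting operation instead of growing the result row by row.

```mata
mata:
// keeprows() is a hypothetical helper, not part of Mata itself.
// It returns the rows of X whose matching element of tag is nonzero.
real matrix keeprows(real matrix X, real colvector tag)
{
    real scalar    i, k
    real colvector idx

    // count the nonzero tags so the index vector can be allocated once
    idx = J(sum(tag :!= 0), 1, .)
    k = 0
    for (i = 1; i <= rows(tag); i++) {
        if (tag[i]) idx[++k] = i
    }
    // one subscripting operation copies all selected rows
    return(X[idx, .])
}

x = (1, ., 0, ., 1, 0)'
keeprows(x, x :!= .)
end
```

For the toy vector this keeps rows 1, 3, 5 and 6 in their original order. I have not timed it against the alternatives.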
Which would you choose? Do you have a suggestion for a more
efficient solution?
Right now I mostly use (c) when the problem comes up, because
sorting doesn't break my data. For an example,
view http://www.nat.sdu.dk/users/nat-sdu/jkha01/data/test20060404.txt
or
do http://www.nat.sdu.dk/users/nat-sdu/jkha01/data/test20060404.do
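Independently of the linked files, option (c) can be sketched in a few lines for the toy vector above. This assumes that at least one row is kept and that the row order does not matter.

```mata
mata:
x = (1, ., 0, ., 1, 0)'

// In this toy case the unwanted rows are already missing. In general one
// would first overwrite the sort column of the unwanted rows with missing.
n = colnonmissing(x)        // number of rows to keep
s = sort(x, 1)              // ascending sort; missing values sort last
s[|1, 1 \ n, cols(s)|]      // range subscript: rows 1 through n
end
```

The result holds the non-missing values in sorted order, so the original row order is lost.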
But what I would really like to be able to do is use a tag as a
subscript to a matrix, in order to select the rows or columns
to be used, e.g.:
: tag = x :!= .
: x[<tag>]
       1
    +-----+
  1 |  1  |
  2 |  0  |
  3 |  1  |
  4 |  0  |
    +-----+
I know this is supported in some matrix programming languages,
and I also know that it is not supported in Mata.
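A note for readers on newer releases: later versions of Mata (Stata 10 onward, as far as I know) include a built-in select() function that does this with a function call rather than a subscript. The sketch below assumes such a release and will not run on older ones.

```mata
mata:
x   = (1, ., 0, ., 1, 0)'
tag = x :!= .

// select(X, v) with a column vector v keeps the rows of X
// for which the matching element of v is nonzero
select(x, tag)
end
```

Passing a row vector as the second argument selects columns instead of rows.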
/Jesper
Kind regards,
Jesper Kjær Hansen
Student Assistant
Department of Statistics
University of Southern Denmark
mailto:kjaer.hansen@oncable.dk