Re: st: Column vector into variable, accounting for -marksample-

From   "Lacy,Michael" <>
To   "" <>
Subject   Re: st: Column vector into variable, accounting for -marksample-
Date   Tue, 9 Oct 2012 03:10:45 +0000

Nick Cox wrote:
>I would use Mata in most such cases, but for this problem in Stata I would go
>tempvar obsno
>gen `obsno' = _n
>marksample touse
>replace `touse' = -`touse'
>sort `touse' `obsno'
>Then the first so many observations in the dataset are also the first
>so many being used in the program and so those to receive the values
>of the vector.
>That is, `touse' is created 0 and 1 but if negated is 0 and -1; -sort
>`touse'- then puts the observations being used first. Negation makes
>no difference to the consequence of doing things -if `touse'-.

I tried a quick mock-up, which showed Nick's approach to be 10-20 times faster than
what I describe below.  Even if what I described were implemented in Mata, his
approach would still be faster, not to mention simpler and less error-prone.
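
For anyone wanting to try it, here is a minimal sketch of Nick's sort-based approach wrapped in a program. The program name fill_demo, the option generate(), and the placeholder matrix standing in for r(X) are my own illustrative assumptions, not part of the original code; the sketch also adds a final -sort- to put the data back in their original order.

```stata
* Hypothetical demo: fill_demo stands in for the real program, and the
* matrix built below stands in for r(X) returned by the other program.
capture program drop fill_demo
program define fill_demo
    syntax [if] [in], GENerate(name)
    marksample touse

    * Placeholder for r(X): one row per observation being used
    quietly count if `touse'
    local n = r(N)
    tempname X
    matrix `X' = J(`n', 1, 0)
    forvalues i = 1/`n' {
        matrix `X'[`i', 1] = `i'
    }

    tempvar obsno
    generate long `obsno' = _n             // remember the original order
    quietly replace `touse' = -`touse'     // 1/0 becomes -1/0
    sort `touse' `obsno'                   // observations in use now come first
    svmat double `X', names(`generate')    // fills rows 1..n of `generate'1
    sort `obsno'                           // restore the original order
end

sysuse auto, clear
fill_demo if foreign, generate(Xvar)
list make foreign Xvar1 if foreign
```

Note that -svmat- appends the column number to the name, so the new variable here is Xvar1, and observations not in use are left missing.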


Mike Lacy
Dept. of Sociology
Colorado State University
Fort Collins CO 80523-1784


>On Sun, Oct 7, 2012 at 8:45 PM, Lacy,Michael <> wrote:
>> I have a Stata program that obtains a column vector, say r(X) by calling another program. I want to put the
>>  values of r(X) into a variable in the original dataset, which for me is complicated by -marksample-.
>> The values of r(X) arrive in the same order as the original data would be if cases not selected by
>> -marksample- were dropped. So, if there were no use of -marksample-, the row index of r(X) would
>> correspond to _n, and there would be no difficulty; -svmat- would align the contents of r(X) as desired.
>> But this is not true because the data are processed -if `touse'-, so that the first element of r(X) might
>> correspond to the 4th observation of the original data set, the second element to the 7th observation,
>> and so forth.
>> My thinking so far involves reconstructing the r(X) matrix into another matrix, putting in missing values
>> as indicated by `touse', per my fragment below.  Is there a less clunky way to accomplish this?
>> (My program might be called many times in a simulation context, so time is an issue.)
>> This must be an utterly common task.  (And yes, I could do the same in Mata, but it's still more
>> or less the same thing.)
>> program whatever
>> syntax ...[if] ...
>> marksample touse
>> ....Other program is called with if `touse', which returns r(X)
>> // Make r(X) into a variable
>> tempname X temp
>> mat `X' = r(X) // copy the returned vector before r() is overwritten
>> mkmat `touse', matrix(tu)
>> mat `temp' = J(`=_N',1,.) // working matrix for repacking X
>> local pos = 1
>> forval i = 1/`=_N' {
>>    if (tu[`i',1] == 1) {
>>       mat `temp'[`i',1] = `X'[`pos',1]
>>       local ++pos
>>    }
>> }
>> svmat `temp', name(Xvar)
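
For comparison, the repacking loop in the fragment above can, I believe, be replaced by a single Mata call, since -st_store()- accepts a selection variable and writes only to the observations where it is nonzero. This is only a sketch under assumptions: the variable touse and the matrix X below are illustrative stand-ins for the -marksample- variable and r(X), and I have not timed it against the sort-based approach.

```stata
* Illustrative stand-ins: touse marks the observations in use,
* X plays the role of the returned column vector r(X).
sysuse auto, clear
generate byte touse = (foreign == 1)
quietly count if touse
local n = r(N)
matrix X = J(`n', 1, 0)
forvalues i = 1/`n' {
    matrix X[`i', 1] = 10 * `i'
}

generate double Xvar = .
* Write the rows of X, in order, into Xvar for observations with touse != 0
mata: st_store(., "Xvar", "touse", st_matrix("X"))
list make touse Xvar if touse
```

The number of rows in the matrix must equal the number of selected observations, which holds here by construction.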

Mike Lacy
Assoc. Prof./Dir. Grad. Studies
Dept. of Sociology
Colorado State University
Fort Collins CO 80523-1784
970.491.6721 (voice)
