Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: Column vector into variable, accounting for -marksample-

From: "Lacy,Michael"
To: "statalist@hsphsun2.harvard.edu"
Subject: Re: st: Column vector into variable, accounting for -marksample-
Date: Tue, 9 Oct 2012 03:10:45 +0000

```
Nick Cox wrote:
>
>I would use Mata in most such cases, but for this problem in Stata I would go
>
>tempvar obsno
>gen `obsno' = _n
>marksample touse
>replace `touse' = -`touse'
>sort `touse' `obsno'
>
>Then the first so many observations in the dataset are also the first
>so many being used in the program and so those to receive the values
>of the vector.
>
>That is, `touse' is created 0 and 1 but if negated is 0 and -1; -sort
>`touse'- then puts the observations being used first. Negation makes
>no difference to the consequence of doing things -if `touse'-.
>
>Nick
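
A minimal sketch of how the remaining steps might look once the data are
sorted this way. It assumes the vector has been copied from r(X) into a
matrix named by the tempname `D' right after the other program returns;
`D' and the stub Xvar are illustrative names only. With a one-column
matrix, -svmat- names the new variable Xvar1.

    tempvar obsno
    tempname D
    marksample touse
    * ... call the other program -if `touse'-, then immediately:
    * matrix `D' = r(X)
    gen long `obsno' = _n
    replace `touse' = -`touse'      // used observations now sort first
    sort `touse' `obsno'            // ties keep their original order
    svmat double `D', names(Xvar)   // rows of `D' fill the first observations
    sort `obsno'                    // put the data back in the original order

The final -sort- is there only so that the caller gets the dataset back in
the order in which it was supplied.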

I tried a quick mock-up, which showed Nick's approach to be 10-20 times faster
than what I describe below.  Even if what I described were implemented in Mata,
his approach would still be faster, not to mention simpler and less error-prone.

Regards,

Mike Lacy
Dept. of Sociology
Fort Collins CO 80523-1784

================================================================

>
>On Sun, Oct 7, 2012 at 8:45 PM, Lacy,Michael <Michael.Lacy@colostate.edu> wrote:
>> I have a Stata program that obtains a column vector, say r(X) by calling another program. I want to put the
>>  values of r(X) into a variable in the original dataset, which for me is complicated by -marksample-.
>>
>> The values of r(X) arrive in the same order as the original data would be if cases not selected by
>> -marksample- were dropped. So, if there were no use of -marksample-, the row index of r(X) would
>> correspond to _n, and there would be no difficulty; -svmat- will align the contents of r(X) as desired.
>> But this is not true because the data are processed -if `touse'-, so that the first element of r(X) might
>> correspond to the 4th observation of the original data set, the second element to the 7th observation,
>> and so forth.
>>
>> My thinking so far involves reconstructing the r(X) matrix into another matrix, putting in missing values
>> as indicated by -touse-, per my fragment below.  Is there a less clunky way to accomplish this?
>> (My program might be called many times in a simulation context, so time is an issue.)
>>
>> This must be an utterly common task.  (And yes, I could do the same in Mata, but it's still more
>> or less the same thing.)
>>
>> program whatever
>> syntax ...[if] ...
>> marksample touse
>>
>> ....Other program is called with if `touse', which returns r(X)
>> // Make r(X) into a variable
>> tempname D temp
>> mat `D' = r(X) // copy the returned vector before r() is overwritten
>> mkmat `touse', matrix(tu)
>> mat `temp' = J(`=_N',1,.) // working matrix for repacking X
>> local pos = 1
>> forval i = 1/`=_N' {
>>    if (tu[`i',1] == 1) {
>>       mat `temp'[`i',1] = `D'[`pos',1]
>>         local ++pos
>>    }
>> }
>> svmat `temp', name(Xvar)
>>
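
For comparison, a sketch of the Mata route mentioned above, which needs
neither the loop nor a sort. It assumes r(X) has been copied into a matrix
named by the tempname `D', and that this matrix has exactly as many rows as
there are observations with `touse' equal to 1; Xvar is a hypothetical
variable name.

    tempname D
    matrix `D' = r(X)
    * st_addvar() creates Xvar; st_store() fills only rows where `touse' != 0
    mata: st_store(., st_addvar("double", "Xvar"), "`touse'", st_matrix("`D'"))

Observations not selected by `touse' are left missing in Xvar, which is the
same result the repacking loop aims for.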

Mike Lacy
Dept. of Sociology