Re: st: Column vector into variable, accounting for -marksample-

From   Nick Cox <>
Subject   Re: st: Column vector into variable, accounting for -marksample-
Date   Tue, 9 Oct 2012 09:37:39 +0100

Good. In fact, the device of creating an observation number variable
is unnecessary, because -sort, stable- keeps tied observations in
their current order:

marksample touse
replace `touse' = -`touse'
sort `touse', stable

Note that logical negation is not exactly equivalent. Although

marksample touse
replace `touse' = !`touse'
sort `touse', stable

also sorts the pertinent observations to the first so
many in the data, you would then have to remember to negate all your
logical selections (-if !`touse'- rather than -if `touse'-), unless
you reversed the negation again after -sort-ing.
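
Putting the pieces together, a minimal sketch of the whole program
might look as follows. The program name -otherprog- and the variable
stub Xvar are hypothetical stand-ins, and the -syntax- line is only an
assumption about what the real program accepts.

```stata
* Sketch only: -otherprog- is a hypothetical stand-in for the program
* that returns the column vector r(X); Xvar is an illustrative name.
program define whatever
    syntax [varlist] [if] [in]
    marksample touse

    * touse is 0/1; negated it is 0/-1, so used observations sort first.
    * -if `touse'- still selects them, because -1 is nonzero and so true.
    quietly replace `touse' = -`touse'
    sort `touse', stable

    otherprog `varlist' if `touse'
    tempname X
    matrix `X' = r(X)

    * Row i of the vector now lines up with observation i;
    * -svmat- leaves the remaining observations missing.
    svmat `X', names(Xvar)
end
```

Note that this leaves the dataset in the new sort order. If the
original order matters to the caller, an observation number variable
would be needed after all, to -sort- back on afterwards.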

On Tue, Oct 9, 2012 at 4:10 AM, Lacy,Michael <> wrote:
> Nick Cox wrote:
>>I would use Mata in most such cases, but for this problem in Stata I would go
>>tempvar obsno
>>gen `obsno' = _n
>>marksample touse
>>replace `touse' = -`touse'
>>sort `touse' `obsno'
>>Then the first so many observations in the dataset are also the first
>>so many being used in the program and so those to receive the values
>>of the vector.
>>That is, `touse' is created 0 and 1 but if negated is 0 and -1; -sort
>>`touse'- then puts the observations being used first. Negation makes
>>no difference to the consequence of doing things -if `touse'-.
> I tried a quick mock-up that showed Nick's approach as 10-20 times faster than
> what I describe below.  Even if what I described were implemented in Mata, his
> approach would still be faster, not to mention simpler and less error prone.
> Regards,
> Mike Lacy
> Dept. of Sociology
> Colorado State University
> Fort Collins CO 80523-1784
> ================================================================
>>On Sun, Oct 7, 2012 at 8:45 PM, Lacy,Michael <> wrote:
>>> I have a Stata program that obtains a column vector, say r(X) by calling another program. I want to put the
>>>  values of r(X) into a variable in the original dataset, which for me is complicated by -marksample-.
>>> The values of r(X) arrive in the same order as the original data would be if cases not selected by
>>> -marksample- were dropped. So, if there were no use of marksample, the row index of r(X) would
>>> correspond to _n, and there would be no difficulty; -svmat- will align the contents of r(X) as desired.
>>> But this is not true because the data are processed -if `touse'-, so that the first element of r(X) might
>>> correspond to the 4th observation of the original data set, the second element to the 7th observation,
>>> and so forth.
>>> My thinking so far involves reconstructing the r(X) matrix into another matrix, putting in missing values
>>> as indicated by -touse-, per my fragment below.  Is there a less clunky way to accomplish this?
>>> (My program might be called many times in a simulation context, so time is an issue.)
>>> This must be an utterly common task.  (And yes, I could do the same in Mata, but it's still more
>>> or less the same thing.)
>>> program whatever
>>> syntax ...[if] ...
>>> marksample touse
>>> ....Other program is called with if `touse', which returns r(X)
>>> // Make r(X) into a variable
>>> tempname D temp
>>> mat `D' = r(X) // copy the returned vector into a working matrix
>>> mkmat `touse', matrix(tu)
>>> mat `temp' = J(`=_N',1,.) // working matrix for repacking X
>>> local pos = 1
>>> forval i = 1/`=_N' {
>>>    if (tu[`i',1] == 1) {
>>>       mat `temp'[`i',1] = `D'[`pos',1]
>>>         local ++pos
>>>    }
>>> }
>>> svmat `temp', name(Xvar)