Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Nick Cox <njcoxstata@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: Column vector into variable, accounting for -marksample- |
Date | Tue, 9 Oct 2012 09:37:39 +0100 |
Good. In fact, the device of creating an observation number variable
is unnecessary:

marksample touse
replace `touse' = -`touse'
sort `touse', stable

Note that logical negation is not exactly equivalent. Although

marksample touse
replace `touse' = !`touse'
sort `touse', stable

also sorts the pertinent observations to the first so many in the data,
you would then have to remember to negate all your logical selections
(-if !`touse'- rather than -if `touse'-), unless you reversed the
negation after -sort-ing.

On Tue, Oct 9, 2012 at 4:10 AM, Lacy,Michael <Michael.Lacy@colostate.edu> wrote:
> Nick Cox wrote:
>>
>>I would use Mata in most such cases, but for this problem in Stata I would go
>>
>>tempvar obsno
>>gen `obsno' = _n
>>marksample touse
>>replace `touse' = -`touse'
>>sort `touse' `obsno'
>>
>>Then the first so many observations in the dataset are also the first
>>so many being used in the program and so those to receive the values
>>of the vector.
>>
>>That is, `touse' is created 0 and 1 but if negated is 0 and -1; -sort
>>`touse'- then puts the observations being used first. Negation makes
>>no difference to the consequence of doing things -if `touse'-.
>>
>>Nick
>
>
> I tried a quick mock-up that showed Nick's approach as 10-20 times faster than
> what I describe below. Even if what I described were implemented in Mata, his
> approach would still be faster, not to mention simpler and less error prone.
>
> Regards,
>
> Mike Lacy
> Dept. of Sociology
> Colorado State University
> Fort Collins CO 80523-1784
>
> ================================================================
>
>>
>>On Sun, Oct 7, 2012 at 8:45 PM, Lacy,Michael <Michael.Lacy@colostate.edu> wrote:
>>> I have a Stata program that obtains a column vector, say r(X), by calling another program. I want to put the
>>> values of r(X) into a variable in the original dataset, which for me is complicated by -marksample-.
>>>
>>> The values of r(X) arrive in the same order as the original data would be if cases not selected by
>>> -marksample- were dropped. So, if there were no use of -marksample-, the row index of r(X) would
>>> correspond to _n, and there would be no difficulty; -svmat- would align the contents of r(X) as desired.
>>> But this is not true because the data are processed -if `touse'-, so that the first element of r(X) might
>>> correspond to the 4th observation of the original data set, the second element to the 7th observation,
>>> and so forth.
>>>
>>> My thinking so far involves reconstructing the r(X) matrix into another matrix, putting in missing values
>>> as indicated by `touse', per my fragment below. Is there a less clunky way to accomplish this?
>>> (My program might be called many times in a simulation context, so time is an issue.)
>>>
>>> This must be an utterly common task. (And yes, I could do the same in Mata, but it's still more
>>> or less the same thing.)
>>>
>>> program whatever
>>>    syntax ...[if] ...
>>>    marksample touse
>>>
>>>    // ... other program is called with -if `touse'-, which returns r(X)
>>>    // Make r(X) into a variable
>>>    tempname X tu temp
>>>    mat `X' = r(X)
>>>    mkmat `touse', matrix(`tu')
>>>    mat `temp' = J(`=_N',1,.) // working matrix for repacking X
>>>    local pos = 1
>>>    forval i = 1/`=_N' {
>>>       if (`tu'[`i',1] == 1) {
>>>          mat `temp'[`i',1] = `X'[`pos',1]
>>>          local ++pos
>>>       }
>>>    }
>>>    svmat `temp', name(Xvar)
>>> end
>>>
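
For readers who want to try the idea from the top of this thread (negate `touse', then -sort, stable-, so that the observations in use come first), here is a minimal sketch. The program name -fillfromvec-, its -matrix()- option, the matrix V and the toy data are all assumptions made up for illustration; none of them appears in the posts above. The observation-number variable is used only to put the data back in their original order afterwards, a step the thread does not discuss; it is not needed for the alignment itself.

```stata
* Hypothetical sketch, not code from the thread: copy a column vector into a
* new variable for the observations selected by -marksample-.
capture program drop fillfromvec
program define fillfromvec
    version 12
    syntax newvarname [if] [in], Matrix(name)

    * novarlist: the new variable does not exist yet, so do not mark out on it
    marksample touse, novarlist

    quietly count if `touse'
    local n = r(N)
    if rowsof(`matrix') != `n' {
        display as error "rows of `matrix' do not match the `n' selected observations"
        exit 503
    }

    * remember the current order so that it can be restored (my addition)
    tempvar obsno
    gen long `obsno' = _n

    * `touse' becomes 0 and -1, so the observations in use sort first;
    * -stable- keeps them in their original relative order
    quietly replace `touse' = -`touse'
    sort `touse', stable

    * row j of the vector goes to the j-th observation in use
    quietly gen double `varlist' = `matrix'[_n, 1] in 1/`n'

    sort `obsno'
end

* toy data: six observations, three of which are selected
clear
set obs 6
gen y = _n
matrix V = (10 \ 20 \ 30)
fillfromvec xvar if inlist(y, 2, 4, 5), matrix(V)
list
```

In this toy run the three rows of V should land in the observations with y equal to 2, 4 and 5, with xvar missing elsewhere. A program called many times in a simulation could drop the final -sort- if the order of the data does not matter afterwards.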