Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Selecting variables corresponds to observation numbers
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Selecting variables corresponds to observation numbers
Date: Tue, 8 Mar 2011 11:12:07 +0000

Quite so, but with glosses.
1. There is indeed a big difference between evaluating a macro using
= and not evaluating it. And it bites. Although this is documented in
the manual, people often miss it or forget it.
SJ-8-4 pr0045 . . . . . . . . Stata tip 70: Beware the evaluating equal sign
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
Q4/08 SJ 8(4):586--587 (no commands)
tip explaining the pitfall of losing content in a macro
because of limits on the length of string expressions
In essence, a macro has to be big enough to hold the entire
varlist. Otherwise, Stata could not function. See -help limits- for
the limits.
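
As a minimal sketch of the difference (the macro names a and b are
made up purely for illustration): copying text into a macro leaves it
unevaluated, while the equal sign makes Stata evaluate the expression
first and store the result.

```stata
* copy the text into the macro without evaluating it
local a 2 + 2
* evaluate the expression first, then store the result
local b = 2 + 2
display "`a'"
display "`b'"
```

The first -display- shows the text 2 + 2 and the second shows 4. Only
the second form goes through the expression evaluator, so only that
form is subject to any limits on expressions.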
2. On emptiness, discussion can approach existentialist complexity and
obscurity. I'd say rather that a reference to a macro that does not
exist is interpreted as a reference to an empty string.
A reference like
... `gordon' ...
does not create a macro, even an empty one, if it does not previously
exist. Indeed even explicitly creating a macro that is empty will
fail. The test is that you go
local gordon
mac li
No such macro is listed, even as an empty string. However, on occasion
local gordon
is useful as something that empties an _existing_ macro and thus lets
you start again.
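
A small sketch of that test, best run from a do-file. The contents
"foo bar" are an assumption chosen only for illustration.

```stata
* defining a macro as empty creates nothing that can be listed
local gordon
macro list

* once it has contents it is listed (local macros show a leading underscore)
local gordon foo bar
macro list

* the same statement now empties the existing macro, so it is no longer listed
local gordon
macro list
```

Note that -macro list- also shows whatever other macros happen to be
defined, so look for _gordon among them.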
3. I included some material on macros for beginners in
SJ-2-2 pr0005 . . . . . . Speaking Stata: How to face lists with fortitude
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
Q2/02 SJ 2(2):202--222 (no commands)
demonstrates the usefulness of for, foreach, forvalues, and
local macros for interactive (non programming) tasks
4. Here's a more challenging example. Given a prior
local J = ...
what is the difference between
(1)
forval j = 1/`J' {
    tempvar newvar
    gen `newvar' = ...
    local newvars `newvars' `newvar'
}
and
(2)
forval j = 1/`J' {
    tempvar newvar`j'
    gen `newvar`j'' = ...
    local newvars `newvars' `newvar`j''
}
Answer: none in terms of the result. The first may look like a bug,
but the second time, third time, ... around the loop -tempvar- is
guaranteed to come up with a new name not already in use. So long as
you keep track of what that new name is, you're OK.
If (2) looks too tricky, don't use it. It's more for the amusement of
the programmer.
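
For anyone who wants to check this, here is a runnable sketch of (1),
meant to be run from a do-file. The number of observations, the value
of J, and the right-hand side of -gen- are all assumptions chosen for
illustration, not part of the original problem.

```stata
clear
set obs 5
local J = 3

forval j = 1/`J' {
    * each call puts a fresh temporary name into the same local, newvar
    tempvar newvar
    gen `newvar' = `j'
    * keep track of the name before the next iteration overwrites newvar
    local newvars `newvars' `newvar'
}

* show the temporary names collected, then the variables themselves
display "`newvars'"
describe `newvars'
```

The list in newvars should hold J distinct temporary names, one for
each pass through the loop.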
Nick
On Tue, Mar 8, 2011 at 10:50 AM, Gordon Hughes <[email protected]> wrote:
> For me, the crucial lesson that I had forgotten was how easy it is to extend
> a macro variable list by appending an additional variable.
>
> But I was initially surprised that the method worked in my case since the
> contents of varlist extend to ~500 variables, while Section 18.3.4 of
> the User Guide says that the expression handler is limited to 244
> characters. The difference between extending a macro variable (by adding
> something at the end) and evaluating the macro variable is critical.
> Similarly, I did not know that referring to a macro that does not exist -
> i.e. `varlist' in the first execution of -local varlist `varlist' var`j' -
> in effect creates an empty macro of that name.
>
> This is what I meant in referring to the subtle details of Stata's macro
> language which are hard to fathom simply by reading the manuals. Perhaps
> Kit Baum might like to consider a fuller treatment of macros in a future
> edition of An Introduction to Stata Programming.
>> Date: Mon, 7 Mar 2011 17:59:11 +0000
>> From: Nick Cox <[email protected]>
>> Subject: Re: st: Selecting variables corresponds to observation numbers
>>
>> I agree, but let's make it easier. There were really two things to
>> learn in my example, the idea of a local macro and the idea of a loop:
>>
>> local N = _N
>> forval i = 1/`N' {
>> local j = ind[`i']
>> local varlist `varlist' var`j'
>> }
>>
>> keep `varlist'
>>
>> Using `= ' to cut down on the code was at best a stylistic detail and
>> at worst something that might have obscured the code for learners.
>>
>> Nick
>>
>> On Mon, Mar 7, 2011 at 5:50 PM, Gordon Hughes <[email protected]> wrote:
>>
>> > Thank you, Nick. This is a really nice solution that does the task much
>> > more efficiently than the best method I had come up with (using index
>> > vectors in Mata).
>> >
>> > There is [sigh] so much to learn about how to get the best out of
>> > Stata's macro language.
>> >
>> > Gordon Hughes
>> > [email protected]
>> >
>> >> ------------------------------
>> >>
>> >> Date: Mon, 7 Mar 2011 00:05:36 +0000
>> >> From: Nick Cox <[email protected]>
>> >> Subject: Re: st: Selecting variables corresponds to observation numbers
>> >>
>> >> forval i = 1/`=_N' {
>> >> local varlist `varlist' var`=ind[`i']'
>> >> }
>> >>
>> >> keep `varlist'
>> >>
>> >> Nick
>> >>
>> >> On Sun, Mar 6, 2011 at 10:38 PM, Gordon Hughes <[email protected]> wrote:
>> >> > I would be very grateful if someone could suggest an efficient way of
>> >> > implementing the following task.
>> >> >
>> >> > I have a dataset with M observations and N variables where N >> M and
>> >> > the variables are named var1-varN. In addition, I have an index
>> >> > variable ind which takes M unique values in the range 1..N. I want to
>> >> > select the variables that correspond to the M index values. For
>> >> > example, the data might be
>> >> >
>> >> > ind  var1  var2  var3  var4  var5
>> >> >   2     1     2     3     4     5
>> >> >   4     6     7     8     9    10
>> >> >
>> >> > so I want to create a dataset consisting of the following
>> >> > observations & variables
>> >> >
>> >> > ind  var2  var4
>> >> >   2     2     4
>> >> >   4     7     9
>> >> >
>> >> > However rather than 5 variables and 2 observations I have more like
>> >> > 5000 variables and 500 observations. I can do this using reshape or
>> >> > xpose and merging files, but this is very slow when N > 5000 and I
>> >> > want to repeat the exercise many times. Another alternative is to use
>> >> > permutation matrices in Mata, since what I am trying to do is
>> >> > equivalent to shuffling rows and columns in some rather large
>> >> > matrices. Still, I feel that there should be a cleverer way of doing
>> >> > it by manipulating varlists in Stata but I haven't come up with a
>> >> > solution.