
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Selecting variables corresponds to observation numbers


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: Selecting variables corresponds to observation numbers
Date   Tue, 8 Mar 2011 11:12:07 +0000

Quite so, but with glosses.

1. There is indeed a big difference between evaluating a macro using
= and not evaluating it. And it bites. Although this is documented in
the manual, people often miss it or forget it.

SJ-8-4  pr0045  . . . . . . . . Stata tip 70: Beware the evaluating equal sign
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q4/08   SJ 8(4):586--587                                 (no commands)
        tip explaining the pitfall of losing content in a macro
        because of limits on the length of string expressions

In essence, a macro has to be big enough to hold the entire
varlist. Otherwise, Stata could not function. See -help limits- for
the limits.
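
To make the distinction concrete, here is a minimal sketch (the macro
names a and b are hypothetical, chosen only for illustration). Without
an equals sign the text is copied into the macro as it stands; with an
equals sign it is first evaluated as an expression, which is where the
expression limits for your version of Stata apply.

```stata
* copy the text as it stands: a holds the five characters 2 + 2
local a 2 + 2
display "`a'"

* evaluate first: b holds the result of the expression, 4
local b = 2 + 2
display "`b'"

* appending to a list copies text only, so no expression is evaluated
local varlist `varlist' var3
display "`varlist'"
```

The last form is the one used for building up a long varlist, which is
why the length limit on string expressions never comes into play there.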

2. On emptiness, discussion can approach existentialist complexity and
obscurity. I'd say rather that a reference to a macro that does not
exist is interpreted as a reference to an empty string.

A reference like

... `gordon' ...

does not create a macro, even an empty one, if it does not previously
exist. Indeed even explicitly creating a macro that is empty will
fail. The test is that you go

local gordon
mac li

No such macro is listed, even as an empty string. However, on occasion

local gordon

is useful as something that empties an _existing_ macro and thus lets
you start again.
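
A short sketch of both points, assuming that no local macro called
nosuch (a hypothetical name) has been defined in the current session
or do-file:

```stata
* a reference to a macro that does not exist expands to nothing,
* so this displays just the two brackets: []
display "[`nosuch']"

* define a macro, then empty it again
local gordon some text
display "[`gordon']"
local gordon

* gordon is no longer listed among the macros
macro list
```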

3. I included some material on macros for beginners in

SJ-2-2  pr0005  . . . . . .  Speaking Stata:  How to face lists with fortitude
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q2/02   SJ 2(2):202--222                                 (no commands)
        demonstrates the usefulness of for, foreach, forvalues, and
        local macros for interactive (non programming) tasks

4. Here's a more challenging example. Given a prior

local J = ...

what is the difference between

(1)

forval j = 1/`J' {
        tempvar newvar
        gen `newvar' = ...
        local newvars `newvars' `newvar'
}

and

(2)

forval j = 1/`J' {
        tempvar newvar`j'
        gen `newvar`j'' = ...
        local newvars `newvars' `newvar`j''
}

Answer: none in terms of the result. The first may look like a bug,
but the second time, third time, ... around the loop -tempvar- is
guaranteed to come up with a new name not already in use. So long as
you keep track of what that new name is, you're OK.

If (2) looks too tricky, don't use it. It's more for the amusement of
the programmer.
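
For anyone who wants to try (1), here is a minimal runnable sketch. The
auto dataset shipped with Stata, the value of J, and the expression
after -generate- are all assumptions made for illustration and are not
part of the original example. Run it from a do-file, as temporary
variables are dropped when the do-file or program ends.

```stata
sysuse auto, clear

local J = 3
* start from an empty list in case newvars was defined earlier
local newvars

forval j = 1/`J' {
        * each call to tempvar puts a fresh, unused name in newvar
        tempvar newvar
        gen `newvar' = mpg^`j'
        * record the name before the next iteration overwrites newvar
        local newvars `newvars' `newvar'
}

* the list holds J distinct temporary variable names
display "`newvars'"
summarize `newvars'
```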

Nick

On Tue, Mar 8, 2011 at 10:50 AM, Gordon Hughes <[email protected]> wrote:
> For me, the crucial lesson that I had forgotten was how easy it is to extend
> a macro variable list by appending an additional variable.
>
> But I was initially surprised that the method worked in my case since the
> contents of varlist extends to ~500 variables, while the Section 18.3.4 of
> the User Guide says that the expression handler is limited to 244
> characters.  The difference between extending a macro variable (by adding
> something at the end) and evaluating the macro variable is critical.
>  Similarly, I did not know that referring to a macro that does not exist -
> i.e. `varlist' in the first execution of -local varlist `varlist' var`j' -
> in effect creates an empty macro of that name.
>
> This is what I meant in referring to the subtle details of Stata's macro
> language which are hard to fathom simply by reading the manuals.  Perhaps
> Kit Baum might like to consider a fuller treatment of macros in a future
> edition of An Introduction to Stata Programming.

>> Date: Mon, 7 Mar 2011 17:59:11 +0000
>> From: Nick Cox <[email protected]>
>> Subject: Re: st: Selecting variables corresponds to observation numbers
>>
>> I agree, but let's make it easier. There were really two things to
>> learn in my example, the idea of a local macro and the idea of a loop:
>>
>> local N = _N
>> forval i = 1/`N' {
>>         local j = ind[`i']
>>         local varlist `varlist' var`j'
>> }
>>
>> keep `varlist'
>>
>> Using `= ' to cut down on the code was at best a stylistic detail and
>> at worst something that might have obscured the code for learners.
>>
>> Nick
>>
>> On Mon, Mar 7, 2011 at 5:50 PM, Gordon Hughes <[email protected]> wrote:
>>
>> > Thank you, Nick.  This is a really nice solution that does the task much
>> > more efficiently than the best method I had come up with (using index
>> > vectors in Mata).
>> >
>> > There is [sigh] so much to learn about how to get the best out of
>> > Stata's macro language.
>> >
>> > Gordon Hughes
>> > [email protected]
>> >
>> >> ------------------------------
>> >>
>> >> Date: Mon, 7 Mar 2011 00:05:36 +0000
>> >> From: Nick Cox <[email protected]>
>> >> Subject: Re: st: Selecting variables corresponds to observation numbers
>> >>
>> >> forval i = 1/`=_N' {
>> >>         local varlist `varlist' var`=ind[`i']'
>> >> }
>> >>
>> >> keep `varlist'
>> >>
>> >> Nick
>> >>
>> >> On Sun, Mar 6, 2011 at 10:38 PM, Gordon Hughes <[email protected]>
>> >> wrote:
>> >> > I would be very grateful if someone could suggest an efficient way of
>> >> > implementing the following task.
>> >> >
>> >> > I have a dataset with M observations and N variables where N >> M and
>> >> > the variables are named var1-varN.  In addition, I have an index
>> >> > variable ind which takes M unique values in the range 1..N.  I want
>> >> > to select the variables that correspond to the M index values.  For
>> >> > example, the data might be
>> >> >
>> >> > ind  var1  var2  var3  var4  var5
>> >> > 2     1      2      3      4      5
>> >> > 4     6      7      8      9      10
>> >> >
>> >> > so I want to create a dataset consisting of the following
>> >> > observations & variables
>> >> >
>> >> > ind  var2  var4
>> >> > 2     2      4
>> >> > 4     7      9
>> >> >
>> >> > However rather than 5 variables and 2 observations I have more like
>> >> > 5000 variables and 500 observations.  I can do this using reshape or
>> >> > xpose and merging files, but this is very slow when N > 5000 and I
>> >> > want to repeat the exercise many times.  Another alternative is to
>> >> > use permutation matrices in Mata, since what I am trying to do is
>> >> > equivalent to shuffling rows and columns in some rather large
>> >> > matrices.  Still, I feel that there should be a cleverer way of doing
>> >> > it by manipulating varlists in Stata but I haven't come up with a
>> >> > solution.
