Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Use of local macros when generating new variables (-rowranks-)


From   Michael Crain <[email protected]>
To   [email protected]
Subject   Re: st: Use of local macros when generating new variables (-rowranks-)
Date   Thu, 23 May 2013 12:47:42 -0700 (PDT)

Many thanks for your message. This does not exactly take care of my problem.

My dataset has over 200 variables for each year rather than four in my simple example. And the set of variables changes somewhat from year to year as several variables come and go. That is why I wrote the loop (below) with the local macro for the codestring variable. 

I get the idea when relatively few variables are being ranked. But I am trying to figure out an efficient way to write the syntax with over 200 variables for each time period. 

Any suggestions? Thank you.


>>> reply from Nick Cox

-rowranks- is from SJ 9-1.

It sounds as if you want something like

forval j = 1/3 {
      rowranks Cabc_`j' Cdef_`j' C456_`j' C789_`j', gen(R1_`j'-R4_`j')
}

The trick here is: how many repetitions are there going to be, and
what varies between repetitions?

The answer is 3, and the last character of each variable name varies,
so that's your loop.

Nick
[email protected]
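
For the case with 200-plus variables whose names change from period to period, one possibility is to let Stata expand a wildcard instead of typing the names. What follows is only a sketch under stated assumptions: -rowranks- (SJ 9-1) is installed, every variable to be ranked in period j matches C*_`j' and nothing else does, and the periods run 1/3 as in the example above. The toy data at the top are invented purely so that the loop runs on its own.

```stata
* Toy data, invented for illustration: four code strings, three periods
clear
set obs 5
set seed 12345
foreach c in abc def 456 789 {
    forval j = 1/3 {
        generate C`c'_`j' = runiform()
    }
}

* Assumes -rowranks- (SJ 9-1) is installed
forval j = 1/3 {
    * expand the wildcard into whatever source variables exist for period j
    unab src : C*_`j'

    * build one new name per source variable: Cabc_1 -> Rabc_1, and so on
    local new
    foreach v of local src {
        local new `new' R`=substr("`v'", 2, .)'
    }

    * as many names in gen() as there are source variables
    rowranks `src', gen(`new')
}
```

Because the list of new names is built from the list of source names, the two always have the same length, whichever variables come and go in a given period. If other variables in the dataset also start with C and end in _`j', the wildcard would pick them up too, so it is worth checking the contents of `src' (for example with -display "`src'"-) before trusting the ranks.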


On 23 May 2013 19:27, Michael Crain <[email protected]> wrote:
> Using Nick Cox's -rowranks- ado that ranks values across rows, the
> general syntax is:
> rowranks x1-x5, generate(r1-r5)
>
> My very large dataset consists of subsets of variables that I
> want to rank across within each row. I need some help on the syntax (or approach).
>
> My variables have this general form: C[code string]_[time period]
>
> I want to rank across the C[code string] variables by _year_. For
> instance, my variables look like this:
>
> Cabc_1
> Cdef_1
> C456_1
> C789_1
> Cabc_2
> Cdef_2
> C456_2
> C789_2
> Cabc_3
> Cdef_3
> C456_3
> C789_3
>
> I want to rank across Cabc Cdef C456 C789 for each time period so:
> a. rank across Cabc_1 Cdef_1 C456_1 C789_1, then
> b. rank across Cabc_2 Cdef_2 C456_2 C789_2, then
> c. rank across Cabc_3 Cdef_3 C456_3 C789_3
>
> I tried using a loop but my problem is how to write the -rowranks-
> syntax for the subset variable range for the source variables and new
> rank variables.
>
> My loop is this:
>
> levelsof timedummy, local(timeperiod)
> levelsof codestring, local(code)
> foreach t of local timeperiod {
>    foreach c of local code {
>       rowranks C`c'_`t', gen(R`c'_`t')
>       }
>    }
>
> With this code, Stata sees my source variables (over 200 of them in each time period), but sees only one new variable name (an error). I tried variations of this syntax for the variables with no success.
>
> Can anyone help with the syntax? Or is there a better approach?

