
Thanks, Paul.

You wrote:

"You will want to make sure that if your subscripts all look like _1, _2, _3... that you only use one of them. Grabbing *_* will give the same stubname many times."

Alas, that's exactly my problem. I need to grab *_*, because I don't know which subscripts are present or absent.

(Since this may sound a little strange, let me explain. I have a data set with thousands of variables covering up to about 40 years, in wide format. Not all variables are present for all years, which is why I prefer to grab everything and derive the stubs from there.)

Thanks again,

Philipp

E. Paul Wileyto wrote:

I have the following implemented as an ADO. It is what I use to grab all the name stubs for reshaping. You will want to make sure that if your subscripts all look like _1, _2, _3... that you only use one of them. Grabbing *_* will give the same stubname many times.

Paul

program define namelist

***********************************************************************

***********************************************************************

* This short program helps you define a list (global macro vlist)

* of variable name roots to aid in scripting and especially reshaping.

*

* Feel free to use wildcards in varlist. Be sure that each root is

* generated only once.

*

* The oldsub option describes the common subscript in the list.

* The newsub option describes the truncated subscript to be used in

* reshaping.

*

* For example: Variables rx30a_s1..rx30a_s20 all have the common root

* rx30a_s, with a numerical subscript for the repetition. To extract a

* list of roots for all survey entries, use:

*

* -namelist rx*_s1, oldsub(_s1) newsub(_s)

*

***********************************************************************

***********************************************************************

syntax varlist, [oldsub(string) newsub(string)]

global vlist "`varlist'"

if "`oldsub'" > "" & "`newsub'" > "" {

    global vlist : subinstr local varlist "`oldsub'" "`newsub'" , all

}

noisily ma list vlist

end

pr9@duke.edu wrote:


My question is whether there is a clever way to identify the stubnames

for a -reshape-. Consider this example:

sysuse auto, clear

foreach j in 1 2 3 4 5 {

    gen x_`j'=rep78+`j'

    gen y_`j'=rep78-`j'

}

drop y_3

gen id=_n

keep id x_* y_*

/*

This data-set contains the following variables which I want to -reshape-:

x_1 x_2 x_3 x_4 x_5

y_1 y_2 y_4 y_5

I want to -reshape- by stubnames x_ and y_.

In my "real" data-set, I know that all variables I want to end up with

as stubnames contain an underscore (_). They also all contain numbers

after the underscore, but there is no regular pattern there.

Now, I am generating a -macrolist- with unique stubnames, but this seems

like a detour to me, especially the loop:

*/

ds *_*

foreach v in `r(varlist)' {

    local f `=regexr("`v'","[0-9]+","")'

    local n `n' `f'

    local n: list uniq n

}

di "*** `n' ****"

reshape long `n', i(id) j(foo)
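
For comparison, here is a minimal sketch of a slightly tighter version of the loop above. It assumes that every stub ends just before a trailing run of digits, so the regular expression is anchored at the end of the name with `$`, and it builds the list with the macro list union operator `|` so that each stub enters the list only once. The toy data and the macro names `stubs` and `stub` are illustrative only, not part of the original example.

```stata
* Toy data mirroring the example: x_1..x_5 and y_1 y_2 y_4 y_5 (no y_3)
clear
set obs 3
generate id = _n
foreach j of numlist 1/5 {
    generate x_`j' = id + `j'
    if `j' != 3 {
        generate y_`j' = id - `j'
    }
}

* Collect each stub exactly once
local stubs
foreach v of varlist *_* {
    * strip only the trailing digits, e.g. x_12 -> x_
    local stub = regexr("`v'", "[0-9]+$", "")
    * union: append stub only if it is not already in the list
    local stubs : list stubs | stub
}
display "`stubs'"

reshape long `stubs', i(id) j(foo)
```

The loop is still there, but the list never holds duplicates, so there is no separate -list uniq- step, and names with digits before the underscore (such as rx30a_1) keep them.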

Two questions:

1) Is there a better way to identify stubnames for a -reshape-?

2) Is there a more straightforward way to arrive at a unique macrolist

than the one I chose? In particular, something like the -regexr()-

function, but with the ability to remove all instances (not just the

first) of the regular expression. Something like -subinstr()-, but for

regular expressions.
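
On question 2, one possibility is sketched below, assuming Stata 14 or later is available: the Unicode regular expression function -ustrregexra()- replaces every match of the pattern, not just the first. The variable name used here is hypothetical and serves only to show the effect.

```stata
* Requires Stata 14 or later (Unicode regex functions)
* Hypothetical name with several runs of digits
local name "rx30_s12_v3"

* Replace all matches of the pattern with the empty string
local stripped = ustrregexra("`name'", "[0-9]+", "")

display "`stripped'"
```

For stub extraction this is usually too aggressive, since it also removes digits that belong to the stub itself, so it fits best when all digits in a name are known to be subscripts.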

Thank you!

Philipp


