RE: st: Keeping a subset of variables


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Keeping a subset of variables
Date   Wed, 4 Aug 2010 15:52:36 +0100

No; that won't work. 

Although Marshall more than once seems to imply otherwise in his
posting, I think it's clear that he is talking about selecting variable
names, not values of string variables. 

Nick 
n.j.cox@durham.ac.uk 
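
The distinction above matters because -keep if- filters observations, whereas -keep- followed by a varlist selects variables. A minimal sketch of selecting by variable name, assuming the three criteria in Marshall's post (first two characters "ca", positions 4-5 between 05 and 08, second-to-last character 9), might look like the following. The toy variable names and the macro name keepers are made up for illustration and are not from the original posting.

```stata
* Toy data with hypothetical variable names, for illustration only
clear
set obs 1
foreach name in ca003sr09d ca005sr09d ca007sr08d cb006sr09d ca008mt09p {
    generate `name' = 1
}

* Collect the names (not the values) that satisfy all three criteria
local keepers
foreach v of varlist * {
    if substr("`v'", 1, 2) == "ca" ///
        & inlist(substr("`v'", 4, 2), "05", "06", "07", "08") ///
        & substr("`v'", -2, 1) == "9" {
        local keepers `keepers' `v'
    }
}

* Guard against an empty list, since -keep- with no arguments is an error
if "`keepers'" != "" {
    keep `keepers'
}
describe, simple
```

Among the toy names, only ca005sr09d and ca008mt09p satisfy all three rules. The string functions are applied to the macro holding each variable's name, so no observation is ever dropped. Comparing the two-character substring with -inlist()- rather than converting it with -real()- is a design choice here: it avoids any surprises from names whose fourth and fifth characters are not digits.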

Richard Goldstein wrote:

how about the following:

keep if substr(var,1,2)=="ca" & real(substr(var,4,2))>=5 ///
& real(substr(var,4,2))<=8 & real(substr(var,-2,1))==9

On 8/4/10 10:38 AM, Marshall Garland wrote:
 
> I'm attempting to retain a subset of variables from a rather large
> dataset (>10K variables). The variables have a patterned naming
> convention, and I'm trying to exploit this pattern to keep only those
> variables that meet specific criteria. Here's an example of some
> variables:
> 
> ca003sr09d
> cb004sr08d
> 
> Essentially, I only want to retain those variables that meet the
> following criteria:
> 
> 1. The characters in the first two positions must be "ca"
> 2. The numbers in the 4-5 position must be equal to 05-08
> 3. The numbers in the substr(var,-2,1) position must be equal to 9
> 
> I've tried to adapt code from this thread:
> http://www.stata.com/statalist/archive/2008-06/msg00301.html
> And this one:
> http://www.stata.com/statalist/archive/2007-03/msg01034.html
> 
> But the number of conditions I'm requiring exceeds the number
> encountered in these threads, which is where I'm stumbling. The code
> either chokes (variable whatever cannot be found, which is expected,
> hence the -cap-) or it is not eliminating the variables that I'm
> expecting to be dropped, based on the admittedly inelegant syntax I've
> written. I'm trying to wrap this into a single command, which is
> perhaps a source of my difficulty. Here's what I've cobbled together
> thus far, which has a sort of Frankensteinian character since I keep
> grafting additional loops to address these conditions:
> 
> //here, i'm retaining just 5-8 grade results for all students
> foreach var of varlist * {
>     local beg=substr("`var'",6,2)
>     local end=substr("`var'",-1,1)
>     foreach letter in i p b h s e l w m f {
>         foreach num of numlist 3/4 9/11 {
>             cap drop c`letter'00`num'`beg'08`end'
>             cap drop c`letter'0`num'`beg'08`end'
>             cap drop c`letter'00`num'`beg'07`end'
>             cap drop c`letter'0`num'`beg'07`end'
>         }
>     }
> }
> 
> Any help from list members would be greatly appreciated.
> 
> I'm using Stata SE 11.1.
