

From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Keeping a subset of variables
Date: Wed, 4 Aug 2010 15:52:36 +0100

No; that won't work. Although Marshall more than once seems to imply otherwise in his posting, I think it's clear that he is talking about selecting variable names, not values of string variables.

Nick
n.j.cox@durham.ac.uk

Richard Goldstein

how about the following:

keep if substr(var,1,2)=="ca" & real(substr(var,4,2))>=5 ///
    & real(substr(var,4,2))<=8 & real(substr(var,-2,1))==9

On 8/4/10 10:38 AM, Marshall Garland wrote:
> I'm attempting to retain a subset of variables from a rather large
> dataset (>10K variables). The variables have a patterned naming
> convention, and I'm trying to exploit this pattern to keep only those
> variables that meet specific criteria. Here's an example of some
> variables:
>
> ca003sr09d
> cb004sr08d
>
> Essentially, I only want to retain those variables that meet the
> following criteria:
>
> 1. The characters in the first two positions must be "ca"
> 2. The numbers in the 4-5 position must be equal to 05-08
> 3. The numbers in the substr(var,-2,1) position must be equal to 9
>
> I've tried to adapt code from this thread:
> http://www.stata.com/statalist/archive/2008-06/msg00301.html
> And this one:
> http://www.stata.com/statalist/archive/2007-03/msg01034.html
>
> But the number of conditions I'm requiring exceeds the number
> encountered in these threads, which is where I'm stumbling. The code
> either chokes (variable whatever cannot be found, which is expected,
> hence the -cap-) or it is not eliminating the variables that I'm
> expecting to be dropped, based on the admittedly inelegant syntax I've
> written. I'm trying to wrap this into a single command, which is
> perhaps a source of my difficulty. Here's what I've cobbled together
> thus far, which has a sort of Frankensteinian character since I keep
> grafting additional loops to address these conditions:
>
> //here, i'm retaining just 5-8 grade results for all students
> foreach var of varlist * {
>     local beg=substr("`var'",6,2)
>     local end=substr("`var'",-1,1)
>     foreach letter in i p b h s e l w m f {
>         foreach num of numlist 3/4 9/11 {
>             cap drop c`letter'00`num'`beg'08`end'
>             cap drop c`letter'0`num'`beg'08`end'
>             cap drop c`letter'00`num'`beg'07`end'
>             cap drop c`letter'0`num'`beg'07`end'
>         }
>     }
> }
>
> Any help from list members would be greatly appreciated.
>
> I'm using Stata SE 11.1.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
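
Editorial sketch (not part of the original thread): -keep if- selects observations according to values, which is why the suggestion above cannot select variables. One way to apply the same three conditions to the variable names themselves is to loop over the names and collect those that qualify. The following is a minimal sketch, assuming every name follows the pattern of the example ca003sr09d; the local macro names keepers and grade are illustrative only.

* Sketch only: test each variable NAME against the three criteria
* and collect the names that pass. Assumes names like ca003sr09d.
local keepers
foreach v of varlist * {
    * characters 4-5 of the name as a number (missing if not numeric)
    local grade = real(substr("`v'", 4, 2))
    if substr("`v'", 1, 2) == "ca" & inrange(`grade', 5, 8) & substr("`v'", -2, 1) == "9" {
        local keepers `keepers' `v'
    }
}
* -keep- needs at least one name; if nothing matched, this line errors
keep `keepers'

Because the conditions are evaluated on the quoted name "`v'" rather than on a variable called var, no string variable has to exist in the dataset, and no -capture- is needed for names that are absent.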

Follow-Ups:
  Re: st: Keeping a subset of variables
  From: Marshall Garland <marshall.w.garland@gmail.com>

References:
  st: Keeping a subset of variables
  From: Marshall Garland <marshall.w.garland@gmail.com>

  Re: st: Keeping a subset of variables
  From: Richard Goldstein <richgold@ix.netcom.com>
