

From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: Keeping a subset of variables
Date: Wed, 4 Aug 2010 15:45:59 +0100

I think you can do this even without resorting to regular expressions. These wildcards should catch all variables meeting your three rules. I can't see that any loops or even programming is needed.

    ca?05*9? ca?06*9? ca?07*9? ca?08*9?

Nick
n.j.cox@durham.ac.uk

Marshall Garland

I'm attempting to retain a subset of variables from a rather large dataset (>10K variables). The variables have a patterned naming convention, and I'm trying to exploit this pattern to keep only those variables that meet specific criteria. Here's an example of some variables:

    ca003sr09d
    cb004sr08d

Essentially, I only want to retain those variables that meet the following criteria:

1. The characters in the first two positions must be "ca"
2. The numbers in the 4-5 position must be equal to 05-08
3. The numbers in the substr(var,-2,1) position must be equal to 9

I've tried to adapt code from this thread:
http://www.stata.com/statalist/archive/2008-06/msg00301.html
And this one:
http://www.stata.com/statalist/archive/2007-03/msg01034.html

But the number of conditions I'm requiring exceeds the number encountered in these threads, which is where I'm stumbling. The code either chokes (variable whatever cannot be found, which is expected, hence the -cap-) or it is not eliminating the variables that I'm expecting to be dropped, based on the admittedly inelegant syntax I've written. I'm trying to wrap this into a single command, which is perhaps a source of my difficulty.
Here's what I've cobbled together thus far, which has a sort of Frankensteinian character since I keep grafting additional loops to address these conditions:

    //here, i'm retaining just 5-8 grade results for all students
    foreach var of varlist * {
        local beg=substr("`var'",6,2)
        local end=substr("`var'",-1,1)
        foreach letter in i p b h s e l w m f {
            foreach num of numlist 3/4 9/11 {
                cap drop c`letter'00`num'`beg'08`end'
                cap drop c`letter'0`num'`beg'08`end'
                cap drop c`letter'00`num'`beg'07`end'
                cap drop c`letter'0`num'`beg'07`end'
            }
        }
    }

Any help from list members would be greatly appreciated. I'm using Stata SE 11.1.
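As a minimal sketch of how the wildcards in the reply might be applied: in a Stata varlist, `?` matches exactly one character and `*` matches zero or more, so each pattern fixes "ca" in positions 1-2, a two-digit code in positions 4-5, and a 9 in the second-to-last position. The reply lists only the patterns; passing them to -keep- is an assumption about the intended use, and the local macro names `keepers`, `found`, and `n` below are hypothetical.

```stata
* Simplest form: keep every variable matching any of the four patterns.
* Assumes each pattern matches at least one variable; otherwise -keep-
* stops with a "variable not found" error.
keep ca?05*9? ca?06*9? ca?07*9? ca?08*9?

* More defensive alternative (use instead of the line above):
* expand each pattern separately and skip any that match nothing.
local keepers
foreach pat in ca?05*9? ca?06*9? ca?07*9? ca?08*9? {
    capture unab found : `pat'
    if _rc == 0 {
        local keepers `keepers' `found'
    }
}

* Only call -keep- if at least one variable was found.
local n : word count `keepers'
if `n' > 0 {
    keep `keepers'
}
```

Because -keep- is destructive, it may be worth running the patterns through -describe- first (for example, `describe ca?05*9?`) to confirm that they select the expected variables.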

**References**:
- **st: Keeping a subset of variables** *From:* Marshall Garland <marshall.w.garland@gmail.com>
