Re: st: Keeping a subset of variables


From   Marshall Garland <marshall.w.garland@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Keeping a subset of variables
Date   Wed, 4 Aug 2010 10:22:38 -0500

Hello-

Apologies for the confusion. Nick's right: I'm selecting variable names.

Nick's parsimonious solution using wildcards works perfectly.
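
Nick's exact command is not quoted in this message, so what follows is only a minimal sketch of a wildcard approach, not necessarily what he posted. It assumes the names look like the examples in the original question (such as ca003sr09d), that "05-08" sits in positions 4-5, and that the 9 is always the second-to-last character. The toy dataset and its variable names are made up for illustration.

```stata
* Sketch only: build a one-row toy dataset whose names follow the
* pattern described in the thread (these names are hypothetical).
clear
set obs 1
foreach v in ca005sr09d ca006sr09d ca007sr09d ca008sr09d ///
             ca003sr09d cb004sr08d ca005sr08d {
    generate byte `v' = 1
}

* In a varlist, ? matches exactly one character and * matches zero
* or more. Each pattern fixes "ca" in positions 1-2, one grade code
* (05 to 08) in positions 4-5, and 9 as the second-to-last character.
keep ca?05*9? ca?06*9? ca?07*9? ca?08*9?

* List what survived.
describe, simple
```

Note that -keep- stops with an error if one of the patterns matches no variable at all, so on real data it may be safer to check each pattern first (for example with -capture unab-).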

Thanks so much for your responses.

Cheers,

-mwg

On Wed, Aug 4, 2010 at 9:52 AM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> No; that won't work.
>
> Although Marshall more than once seems to imply otherwise in his
> posting, I think it's clear that he is talking about selecting variable
> names, not values of string variables.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Richard Goldstein
>
> how about the following:
>
> keep if substr(var,1,2)=="ca" & real(substr(var,4,2))>=5 ///
> & real(substr(var,4,2))<=8 & real(substr(var,-2,1))==9
>
> On 8/4/10 10:38 AM, Marshall Garland wrote:
>
>> I'm attempting to retain a subset of variables from a rather large
>> dataset (>10K variables). The variables have a patterned naming
>> convention, and I'm trying to exploit this pattern to keep only those
>> variables that meet specific criteria. Here's an example of some
>> variables:
>>
>> ca003sr09d
>> cb004sr08d
>>
>> Essentially, I only want to retain those variables that meet the
>> following criteria:
>>
>> 1. The characters in the first two positions must be "ca"
>> 2. The numbers in the 4-5 position must be equal to 05-08
>> 3. The numbers in the substr(var,-2,1) position must be equal to 9
>>
>> I've tried to adapt code from this thread:
>> http://www.stata.com/statalist/archive/2008-06/msg00301.html
>> And this one:
>> http://www.stata.com/statalist/archive/2007-03/msg01034.html
>>
>> But the number of conditions I'm requiring exceeds the number
>> encountered in these threads, which is where I'm stumbling. The code
>> either chokes (variable whatever cannot be found, which is expected,
>> hence the -cap-) or it is not eliminating the variables that I'm
>> expecting to be dropped, based on the admittedly inelegant syntax I've
>> written. I'm trying to wrap this into a single command, which is
>> perhaps a source of my difficulty. Here's what I've cobbled together
>> thus far, which has a sort of Frankensteinian character since I keep
>> grafting additional loops to address these conditions:
>>
>> //here, i'm retaining just 5-8 grade results for all students
>> foreach var of varlist * {
>>               local beg=substr("`var'",6,2)
>>               local end=substr("`var'",-1,1)
>>       foreach letter in i p b h s e l w m f {
>>               foreach num of numlist 3/4 9/11 {
>>                       cap drop c`letter'00`num'`beg'08`end'
>>                       cap drop c`letter'0`num'`beg'08`end'
>>                       cap drop c`letter'00`num'`beg'07`end'
>>                       cap drop c`letter'0`num'`beg'07`end'
>>               }
>>       }
>> }
>>
>> Any help from list members would be greatly appreciated.
>>
>> I'm using Stata SE 11.1.
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>