



Re: st: RE: programatically dropping variables that don't actually vary


From   Richard Goldstein <richgold@ix.netcom.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: programatically dropping variables that don't actually vary
Date   Thu, 09 Aug 2012 15:16:33 -0400

actually, I think that what is wanted is "if r(min)==r(max)" if one
wants a general test for lack of variation (or, of course, "if r(sd)==0")
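
A minimal sketch of that test, reusing the loop from Sarah's message quoted below. The range dummy1-dummyn is a placeholder for the actual variable list, and the variables are assumed to be numeric:

```stata
* dummy1-dummyn is a placeholder; substitute the real variable list
foreach var of varlist dummy1-dummyn {
    * meanonly leaves r(min) and r(max) behind, but not r(sd)
    quietly summarize `var', meanonly
    * equal minimum and maximum means every nonmissing value is identical
    if r(min) == r(max) {
        drop `var'
    }
}
```

The r(sd)==0 variant needs a plain -summarize- without the meanonly option, since meanonly does not store r(sd). A variable with no nonmissing values would probably be dropped as well, because both results are then missing and compare as equal.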

Rich

On 8/9/12 3:13 PM, Nick Cox wrote:
> for "and" read "&"
> 
> On Thu, Aug 9, 2012 at 8:12 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>> In principle, many variables could have mean 0. A safer test is that
>>
>> if r(min) == 0 and r(max) == 0
>>
>> Nick
>>
>> On Thu, Aug 9, 2012 at 8:03 PM, Sarah Edgington <sedging@ucla.edu> wrote:
>>> Jenn,
>>> There are a variety of ways you might want to do this.  What I would do is
>>> something like the following:
>>>
>>> foreach var of varlist dummy1-dummyn {
>>>         sum `var',  meanonly
>>>         if r(mean)==0 {
>>>                 drop `var'
>>>         }
>>> }
>>>
>>> This cycles through each of your variables (substitute your actual variable
>>> list for "dummy1-dummyn").  For each variable it calculates the mean.  The
>>> drop statement in the if block only gets executed if the value stored in
>>> r(mean) is 0.
>>>
>>> -Sarah
>>>
>>>
>>> -----Original Message-----
>>> From: owner-statalist@hsphsun2.harvard.edu
>>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Earl, Jennifer
>>> Suzanne - (jenniferearl)
>>> Sent: Thursday, August 09, 2012 11:46 AM
>>> To: statalist@hsphsun2.harvard.edu
>>> Subject: st: programatically dropping variables that don't actually vary
>>>
>>> Hi,
>>>
>>> I am working with a large number of dummy variables and using collapse to
>>> create derivative datasets that are the frequencies of 1's for each dummy
>>> variable (a couple of hundred through foreach loops). I want to drop any of
>>> the dummy variables that never had a 1 (so mean(dummy1)==0, or
>>> max(dummy)==0) but it seems that drop only lets you use an if statement to
>>> drop observations, but not an if statement to drop variables.
>>>
>>> My best guess is to use a list means to create a list of the variable names
>>> that can be stored in a local and then fed into a drop command, but can't
>>> seem to make that work either since I only want the list of variable names
>>> that have a mean of 0. Or maybe transpose the dataset, drop them since the
>>> variables are now observations, and transpose back? Another solution would
>>> be to save through Stat/Transfer and use its drop constants feature, and then
>>> bring the data back in, but there must be an easier way.