


Re: st: RE: programatically dropping variables that don't actually vary

From   Nick Cox
To   Richard Goldstein
Subject   Re: st: RE: programatically dropping variables that don't actually vary
Date   Thu, 9 Aug 2012 20:21:01 +0100

The title of the original post says that variables shouldn't vary and
the content says that means in practice all zero.

Evidently Jennifer or anybody else with a similar problem will need to
tweak the recipe according to the exact real problem.

By the way, -findname- can find variables with no variation at all by

findname, all(@ == @[1])

but that does not ignore missings: any missing value that is not equal
to the first value counts as variation.
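
For a loop-based version of the recipe quoted below, here is a minimal
sketch that combines Sarah's -foreach- loop with the r(min) == r(max)
test. The varlist dummy1-dummyn is the placeholder from the quoted
example, and the local macro name todrop is made up for illustration.
Note that -summarize- ignores missing values, so a variable that is
constant apart from some missings is treated as constant here; adjust
if that is not what you want.

```stata
* Sketch only: dummy1-dummyn is a placeholder varlist, todrop a hypothetical local
local todrop
foreach var of varlist dummy1-dummyn {
    * skip string variables, for which -summarize- reports no observations
    capture confirm numeric variable `var'
    if _rc continue

    summarize `var', meanonly

    * r(N) > 0 excludes all-missing variables, where r(min) and r(max)
    * are both missing and would otherwise compare as equal
    if r(N) > 0 & r(min) == r(max) {
        local todrop `todrop' `var'
    }
}

* drop the collected variables in one step, if there are any
if "`todrop'" != "" drop `todrop'
```

To drop only variables that are all zero, as in the original question,
replace the test with r(min) == 0 & r(max) == 0.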


On Thu, Aug 9, 2012 at 8:16 PM, Richard Goldstein wrote:
> actually, I think that what is wanted is "if r(min)==r(max)" if one
> wants a general test for lack of variation (or, of course, "if r(sd)==0")
> Rich
> On 8/9/12 3:13 PM, Nick Cox wrote:
>> for "and" read "&"
>> On Thu, Aug 9, 2012 at 8:12 PM, Nick Cox wrote:
>>> In principle, many variables could have mean 0. A safer test is that
>>> if r(min) == 0 and r(max) == 0
>>> Nick
>>> On Thu, Aug 9, 2012 at 8:03 PM, Sarah Edgington wrote:
>>>> Jenn,
>>>> There are a variety of ways you might want to do this.  What I would do is
>>>> something like the following:
>>>> foreach var of varlist dummy1-dummyn {
>>>>         sum `var',  meanonly
>>>>         if r(mean)==0 {
>>>>                 drop `var'
>>>>         }
>>>> }
>>>> This cycles through each of your variables (substitute your actual variable
>>>> list for "dummy1-dummyn").  For each variable it calculates the mean.  The
>>>> drop statement inside the if block only gets executed if the value stored in
>>>> r(mean) is 0.
>>>> -Sarah
>>>> -----Original Message-----
>>>> From:
>>>> [] On Behalf Of Earl, Jennifer
>>>> Suzanne - (jenniferearl)
>>>> Sent: Thursday, August 09, 2012 11:46 AM
>>>> To:
>>>> Subject: st: programatically dropping variables that don't actually vary
>>>> Hi,
>>>> I am working with a large number of dummy variables and using collapse to
>>>> create derivative datasets that are the frequencies of 1's for each dummy
>>>> variable (a couple of hundred through foreach loops). I want to drop any of
>>>> the dummy variables that never had a 1 (so mean(dummy1)==0, or
>>>> max(dummy)==0) but it seems that drop only lets you use an if statement to
>>>> drop observations, but not an if statement to drop variables.
>>>> My best guess is to use a list means to create a list of the variable names
>>>> that can be stored in a local and then fed into a drop command, but can't
>>>> seem to make that work either since I only want the list of variable names
>>>> that have a mean of 0. Or maybe transpose the dataset, drop them since the
>>>> variables are now observations, and transpose back? Another solution would
>>>> be to save through Stat/Transfer and use its drop constants feature, and then
>>>> bring the data back in, but there must be an easier way.