The Stata listserver

RE: st: dropping variables


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: dropping variables
Date   Thu, 18 Mar 2004 12:00:45 -0000

Correction to the correction. This will
converge, eventually. 

Nick 
n.j.cox@durham.ac.uk 

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of Nick Cox
> Sent: 18 March 2004 11:58
> To: statalist@hsphsun2.harvard.edu
> Subject: RE: st: dropping variables
> 
> Correction: the effect of Dimitriy's loop 
> is to -drop- all string variables, 
> as -summarize- returns r(N) as 0 for them. 
> 
> Nick 
> n.j.cox@durham.ac.uk 
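
The point about r(N) can be checked directly. What follows is a minimal sketch, not part of the original messages: it assumes the auto dataset shipped with Stata and uses its string variable make purely for illustration.

```stata
* Sketch: compare what -summarize- and -count- leave in r(N)
* when given a string variable. The dataset and variable are
* illustrative choices, not taken from the thread.
sysuse auto, clear

* -summarize- does not error on a string variable; it reports no
* observations, so a test like "r(N) < 100" is true for every string.
quietly summarize make
display "summarize: r(N) = " r(N)

* -count- with !missing() counts the nonempty strings instead.
quietly count if !missing(make)
display "count:     r(N) = " r(N)
```

Because missing() treats an empty string as missing, the -count- version handles string and numeric variables uniformly.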
> 
> > -----Original Message-----
> > From: owner-statalist@hsphsun2.harvard.edu
> > [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of Nick Cox
> > Sent: 18 March 2004 11:50
> > To: statalist@hsphsun2.harvard.edu
> > Subject: RE: st: dropping variables
> > 
> > 
> > A problem with Dimitriy's loop is that it will crash 
> > the first time it hits a string variable. I would 
> > tune it to 
> > 
> > foreach v of var * { 
> > 	qui count if !missing(`v') 
> > 	if r(N) < 100 drop `v' 
> > } 
> > 
> > where 100 is of course a place-holder for your own 
> > desired constant. 
> > 
> > -count- remains an under-appreciated command. 
> > 
> > Nick 
> > n.j.cox@durham.ac.uk 
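
As a hedged illustration of the -count- loop above, here is one way it might be run end to end. The auto dataset, the local name minobs, and the cutoff of 70 are all assumptions chosen for the example; substitute your own data and constant.

```stata
* Sketch only: minobs is a hypothetical local holding the cutoff.
sysuse auto, clear
local minobs 70

* The varlist * is expanded once, before the loop starts, so
* dropping variables inside the loop does not disturb it.
foreach v of varlist * {
	quietly count if !missing(`v')
	if r(N) < `minobs' {
		display as text "dropping `v' (" r(N) " nonmissing values)"
		drop `v'
	}
}

* Show how many variables remain.
describe, short
```

No new variables are generated along the way, which matters when the dataset is already close to the variable limit.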
> > 
> > Dimitriy V. Masterov
> >  
> > > There might be a more clever way of doing this, but here's my 
> > > solution:
> > > 
> > > /* This defines a local named variables that contains a list 
> > > with all variables */
> > > unab variables:  _all
> > > 
> > > /* This loop drops all variables that have fewer than 100 obs. */
> > > foreach var in `variables' {
> > > 	qui sum `var'
> > > 	if r(N)<100 {
> > > 		drop `var'
> > > 	}
> > > }
> > 
> > Eric Uslaner
> > 
> > > > I know of Nick Cox's great dropmiss program.  I want to do something
> > > > akin to it (without having to drop each variable individually).  Say
> > > > that a data set has N cases and I want to drop variables that have fewer
> > > > than n nonmissing cases.  Theoretically I could generate new variables
> > > > through count, but my data set is already close to the maximum allowed
> > > > without upgrading to SE (which is why I want to drop some variables).
> > > > Is there a way to do this:
> > > >
> > > > drop if _N < n
> > > >
> > > > or something similar?
> > 
> > *
> > *   For searches and help try:
> > *   http://www.stata.com/support/faqs/res/findit.html
> > *   http://www.stata.com/support/statalist/faq
> > *   http://www.ats.ucla.edu/stat/stata/
> > 




© Copyright 1996–2014 StataCorp LP