
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: St: Dropping variables with mostly missing values


From   Jeph Herrin <[email protected]>
To   [email protected]
Subject   Re: st: St: Dropping variables with mostly missing values
Date   Fri, 07 Feb 2014 15:40:57 -0500

To drop all variables missing more than 80% of the time:

foreach V of varlist _all {
	quietly count if !mi(`V')
	if r(N)/_N < 0.2 {
		drop `V'
	}
}


This works for string and numeric variables. Change 0.2 to whatever level you want.
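
The original question asked for an absolute cutoff (fewer than 20 cases) rather than a proportion. A minimal sketch of that variant follows, assuming "cases" means nonmissing observations. The toy variables a, b, c and the local macro minobs are made up for illustration and are not from the thread. Note that the example starts with clear, which wipes the data in memory, so to use it on real data, skip the toy-data lines and run only the loop.

```stata
* Toy data (illustrative only): 30 observations, three variables
clear
set obs 30
* a has 30 nonmissing values
generate a = _n
* b has 5 nonmissing values
generate b = _n if _n <= 5
* c is a string variable with 25 nonmissing values
generate str1 c = "x" if _n <= 25

* Drop every variable with fewer than 20 nonmissing observations
local minobs 20
foreach V of varlist _all {
	quietly count if !mi(`V')
	if r(N) < `minobs' {
		drop `V'
	}
}
describe, simple
```

With the toy data above, only b falls below the cutoff, so a and c should remain. Comparing r(N) with a fixed count instead of dividing by _N is the only change from the proportional version.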

hth,
Jeph




On 2/7/2014 3:11 PM, Eric M. Uslaner wrote:
I know that this has been discussed before, but a long search doesn't find a solution for me (my own fault in searching, most likely).

I have a data set (not my own) with 161 cases over a long time period.  But  most of the variables are largely made up of missing values (information wasn't available a long time ago).  I have used Nick Cox's dropmiss (from SSC) to drop variables with all missing values.  But a large number of variables remain with few observations.  I would like to delete any variable with fewer than 20 cases.  But I can't figure out how to do this (especially since I have a large number of variables, most of which have very few cases).  Any help would be appreciated.

Ric Uslaner




*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


