

From: Nick Cox <n.j.cox@durham.ac.uk>

To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>

Subject: st: RE: Re: dropping variables with lots of missing

Date: Thu, 7 Jul 2011 15:59:33 +0100

What you are doing wrong is misunderstanding -if-.

-keep- allows -if- conditions, which it evaluates for each observation in turn. With your syntax, -keep- looks at observation 1 and asks "is it true that r(K_uniq) is 4 or more?" and in your situation that is evidently true when you think about observation 1. Actually knowing observation 1 makes no difference to the question "is r(K_uniq) 4 or more?", but Stata is not concerned about that. You could equally ask

    ... if 2 == 2

or

    ... if strpos("Eric Uslaner", "Eric")

for which the answer is true when you think about observation 1. And so on for every observation. Stata is just applying the condition you asked it to apply.

Less obliquely put, r(K_uniq) is a property of the entire dataset and can't discriminate between observations or variables.

I guess what you want is

    foreach var of varlist trustbanks-angergovtintervene {
        misstable sum `var'
        if r(K_uniq) < 4 drop `var'
    }

-- which is quite different.

Now, turning to -dropmiss- and -nmissing-, Statalist protocol is to explain _where_ they come from, and _who_ is not _where_! These are the latest versions; my guess is that yours are some years out-of-date.

    SJ-5-4  dm67_3  . . . . . . . . . .  Software update for nmissing and npresent
            (help nmissing if installed)  . . . . . . . . . . . . . . .  N. J. Cox
            Q4/05   SJ 5(4):607
            now produces saved results

    SJ-8-4  dm89_1  . . . .  Dropping variables or observations with missing values
            (help dropmiss if installed)  . . . . . . . . . . . . . . .  N. J. Cox
            Q4/08   SJ 8(4):594
            update in style and content; added a new force option

-nmissing- is closer to your intent, but you might as well keep going with -misstable-.

Nick
n.j.cox@durham.ac.uk

Eric Uslaner

I've been following the threads of dropping variables with missing values--I know and have used Nick Cox's dropmiss for a long time. But I also want to drop variables with lots of missing values. I could not figure out how to save any results from his nmissing, but I tried this in a do-file with misstable:

    foreach var of varlist trustbanks-angergovtintervene {
        misstable sum `var'
        keep if r(K_uniq) >= 4
    }

and it winds up deleting all cases for all variables (only three variables have less than 4 observations). What am I doing wrong?

    *
    *   For searches and help try:
    *   http://www.stata.com/help.cgi?search
    *   http://www.stata.com/support/statalist/faq
    *   http://www.ats.ucla.edu/stat/stata/
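The difference between the -if- qualifier (checked once per observation) and the -if- command (checked once) can also be seen using only built-in commands. What follows is a minimal sketch, not a method proposed in the thread: the toy variables x1-x3 and the 50% cutoff for "lots of missing" are assumptions made for illustration, and it is meant to be run as a do-file.

```stata
* Toy data: the variable names and the 50% cutoff are illustrative assumptions.
clear
set obs 10
generate x1 = _n                      // no missing values
generate x2 = cond(_n <= 8, ., _n)    // 8 of 10 observations missing
generate x3 = cond(_n <= 2, ., _n)    // 2 of 10 observations missing

local maxshare = 0.5                  // drop a variable if more than half is missing

foreach var of varlist x1-x3 {
    * -count- leaves the number of matching observations in r(N)
    quietly count if missing(`var')
    * r(N) is one number for the whole dataset, so test it with the -if-
    * command (evaluated once), not the -if- qualifier (per observation)
    if r(N) / _N > `maxshare' {
        drop `var'
    }
}

describe
```

Here the decision is made once per variable and the action is -drop varname-, which removes a variable, rather than -keep if-, which removes observations.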

**Follow-Ups**: **st: RE: RE: Re: dropping variables with lots of missing** *From:* Nick Cox <n.j.cox@durham.ac.uk>

**References**: **st: Re: dropping variables with lots of missing** *From:* "Eric Uslaner" <euslaner@gvpt.umd.edu>
