Statalist



st: RE: RE: Drop variables satisfying a condition


From   "Maarten Buis" <M.Buis@fsw.vu.nl>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: Drop variables satisfying a condition
Date   Fri, 6 Jul 2007 18:37:56 +0200

Here is a program that'll do what you want (and more). Save the program in a separate file called conddropvar.ado on your hard disk. In the example, change the -cd- command to point to the folder in which you stored conddropvar.ado.

Hope this helps,
Maarten

*----------- begin program -------------
*! 1.0.0 MLB 6 Jul 2007
program define conddropvar, rclass
	syntax [varlist] [if] [in] [aweight fweight iweight] , /// 
	[                                                      ///
	ALLOBS(string)                                         ///
	MINVAR(numlist>=0 max=1)                               ///
	MINVALUES(numlist>0 integer max=1)                     ///
	STROK                                                  ///
	IGNORESTR                                              ///
	REPLACE                                                ///
	]
	
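	* novarlist: keep observations that are missing on some variable in varlist,
	* otherwise conditions such as missing(var) could never be met
	marksample touse, novarlist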
	qui count if `touse'
	local N = r(N)
	if `N' == 0 {
		di as error "no observations"
		exit 2000
	}	

	if "`weight'" != "" {
		local wgt  "[`weight'`exp']"
	}
	
	if `"`allobs'`minvar'`minvalues'"' == "" {
		di as error "at least one of the options allobs, minvar, and minvalues"
		di as error "must be specified"
		exit 198
	}

	if `"`strok'"' != "" & `"`minvar'"' != "" {
		di as error "options strok and minvar cannot be combined"
		exit 198
	}

	if "`strok'" != "" & "`ignorestr'" != "" {
		di as error "options strok and ignorestr cannot be combined"
		exit 198
	}

	if "`ignorestr'" != "" {
		qui ds `varlist', not(type string)
		local varlist "`r(varlist)'"
	}
	
	if `"`allobs'"' != "" {
		if "`strok'" != "" {
			capture confirm string variable `varlist'
			if _rc {
				di as error "All variables in varlist must be strings"
				di as error "if both the strok and allobs options are specified"
				exit 198
			}
		}
		if "`strok'`ignorestr'" == "" {
			capture confirm numeric variable `varlist'
			if _rc {
				di as error "No variable in varlist may be a string"
				di as error "if the option allobs is specified and"
				di as error "neither the strok nor the ignorestr option is specified"
				exit 198
			}
		}
	}
		

	if `"`allobs'"' != "" {
		* check the condition on the first variable before looping over all of them
		gettoken var : varlist
		local cond : subinstr local allobs "var" "`var'", all
		capture sum `var' if (`cond') & `touse'
		if _rc > 0 {
			di as error "`allobs' is not a valid condition"
			exit 198
		}
	}


	if `"`minvalues'"' != "" {
 		tempvar first sort
 		gen long `sort' = _n
		gen byte `first' = 0
	}

	foreach var of varlist `varlist' {
		local drop = 0
		if `"`allobs'"' != "" {
			local cond : subinstr local allobs "var" "`var'", all
			qui count if (`cond') & `touse'
			if r(N) == `N' local drop = 1
		}
		if `"`minvar'"' != "" {
			qui sum `var' `wgt' if `touse'
			if r(Var) < `minvar' local drop = 1
		}
		if `"`minvalues'"' != "" {
			qui replace `first' = 0
			qui bys `var' `touse' : replace `first' = 1 if _n == 1 & `touse'
			qui count if `first'
			if r(N) < `minvalues' local drop = 1
		}
		if `drop' == 1 local todrop "`todrop' `var'"
	}	
	local todrop : list retokenize todrop
	
	if "`replace'" == "" {
		di as txt "the replace option was not specified so the data remain unchanged"
		di
		if "`todrop'" != "" {
			di as txt "If the replace option had been specified"
			di as txt "the following variables would have been dropped: "
			di as result "`todrop'" as txt "."
			di
			di as txt "To drop those variables repeat this command with the replace option."
		}
		else {
			di as txt "No variable would have been dropped" 
			di as txt "if the replace option had been specified."
		}
		return local todrop "`todrop'"
	}
	if "`replace'" != "" {
		if "`todrop'" != "" {
			di as txt "The following variables have been dropped: "
			di as result "`todrop'" as txt "."
			drop `todrop'
		}
		else {
			di as txt "No variable has been dropped" 
		}
	}
	if "`minvalues'" != "" {
		sort `sort'
	}
end 
*------------------ end program -----------------

*---------------- begin example ---------------
set more off
local home "h"
cd "`home':\mijn documenten\projecten\stata\conddropvar"
sysuse auto, clear
*set trace on
conddropvar , allobs("var==0 | missing(var)") ignorestr
*set trace on
conddropvar , allobs("var==0 | var==1") minvalues(6) ignorestr replace
exit
*-------------------- end example -------------------
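
As a further illustration (not part of the original message), the dry-run mode can be combined with the stored result r(todrop), which the program returns only when replace is omitted. The sketch below assumes conddropvar.ado is saved in the current working directory or elsewhere on the adopath; the threshold of 1 in minvar() is an arbitrary value chosen for illustration.

*---------------- begin sketch ---------------
sysuse auto, clear
* dry run: replace is omitted, so the data remain unchanged
conddropvar , minvar(1) ignorestr
* the program returns the candidate variables in r(todrop)
local candidates "`r(todrop)'"
di as txt "candidates: " as result "`candidates'"
* drop them by hand once the list has been inspected
if "`candidates'" != "" drop `candidates'
*----------------- end sketch ----------------

Copying r(todrop) into a local straight away matters because any later r-class command overwrites it.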

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology 
Vrije Universiteit Amsterdam 
Boelelaan 1081 
1081 HV Amsterdam 
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434 

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of Guillermo Villa
Sent: Friday 6 July 2007 16:22
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: Drop variables satisfying a condition

Hi,

Thanks, but I think the loop does not work with strings, so it stops when it
finds the first string in the dataset. Do you think it is possible to write
a similar loop for any variable format? I have a huge dataset, with about
500 variables, so this will be very helpful.

Thanks again.

Guillermo
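
A loop of the kind asked about here could be sketched as follows. This is an illustration only, not code from the thread: it treats a numeric variable as droppable when every value is 0 or missing, and a string variable as droppable when every value is empty. The variables allzero and empty are made up so that the loop has something to drop.

*---------------- begin sketch ---------------
sysuse auto, clear
generate byte allzero = 0
generate str1 empty = ""
foreach v of varlist _all {
	capture confirm string variable `v'
	if _rc == 0 {
		* string variable: count non-empty values
		quietly count if !missing(`v')
	}
	else {
		* numeric variable: count values that are neither 0 nor missing
		quietly count if !missing(`v') & `v' != 0
	}
	if r(N) == 0 {
		di as txt "dropping " as result "`v'"
		drop `v'
	}
}
*----------------- end sketch ----------------

Testing the type with -confirm- first lets a single loop handle both string and numeric variables, so it does not stop at the first string.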

-----Original Message-----
From: Guillermo Villa [mailto:guillermo@baphealth.com]
Sent: Friday, 6 July 2007 14:33
To: 'statalist@hsphsun2.harvard.edu'
Subject: Drop variables satisfying a condition

> Dear statalisters,
>
> I wonder whether it is possible to drop several variables in a dataset
> specifying an "if" condition. For instance, it would be interesting
> being able to drop all variables in the dataset for which all cases
> are 0 or missing values. I have been reading the help documentation on
> the "drop" command, however I am unable to find the right answer so
> far. Note that I want to drop variables and not cases.
>
> Thanks in advance.
>
> Guillermo
>
>

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



