Statalist



st: multiple use of -encode-


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: multiple use of -encode-
Date   Fri, 16 Jan 2009 19:26:28 -0000

In November Moleps Islon started a thread about -encode-. Martin Weiss,
Sergiy Radyakin and I contributed. The start was 

<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.0811/date/article-859.html>

The thread raised two main questions: 

1. How to -encode- several variables at once using the same set of value
labels. 

2. How to start a set of value labels with a value of 0. 

Although some code was posted, no very satisfactory solution was
offered. The key difficulty is that any set of value labels produced may
be untidy, meaning not ordered alphabetically. 
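
As an illustration of that difficulty, here is a minimal sketch on made-up data (the variable names and the label name fruit are hypothetical, not from the thread). It calls -encode- twice with the same label() name and then lists the resulting labels.

```stata
* Toy data (hypothetical): two string variables sharing one vocabulary
clear
input str6 first str6 second
"banana" "cherry"
"cherry" "apple"
end

* Encode both variables against the same value label, one after the other
encode first,  gen(nfirst)  label(fruit)
encode second, gen(nsecond) label(fruit)

* "apple" is met only on the second pass, so it is added after "cherry"
label list fruit
```

The first call defines banana and cherry; the second call reuses those codes and adds apple at the end, so the label set is shared but not in alphabetical order.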

I remembered the problem again when working on a different one with some
similarities. I think I now have a better solution. 

The previous code posted was called -mencode-, but no help file was ever
written and the code was not posted on SSC. If anyone optimistically
copied that code into their own filespace in the thought that it might
come in useful, they are advised that I consider it superseded. There is
still no help file for the new program, but that will follow. I post the
code because people may have improvements to suggest. 

I am now using the name -multencode-. -mencode- is not sufficiently
self-explanatory and in any case too close to -mvencode-. 

This program is intended to solve problem 1. Problem 2 is soluble by
tweaking the code but I have to say that it doesn't interest me. 
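
For readers who do care about problem 2, the following is one possible sketch, not code from the thread. It assumes only that -encode- follows an existing value label named in label(), so defining the labels by hand from 0 fixes the coding. The data and the names fruitlbl and nfruit are hypothetical. Inside -multencode- the analogous tweak would presumably be to define value `j' - 1 rather than `j' in the -label def- loop.

```stata
* Toy data (hypothetical)
clear
input str6 fruit
"banana"
"apple"
"cherry"
end

* Define the label set by hand, in alphabetical order, starting at 0
label define fruitlbl 0 "apple" 1 "banana" 2 "cherry"

* -encode- reuses the existing label set, so the codes follow that mapping
encode fruit, gen(nfruit) label(fruitlbl)
label list fruitlbl
```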

The key idea is simply to use some Mata code to define a tidy set of
value labels ahead of the call to -encode-. 

Nick 
[email protected] 

*! 1.0.0 NJC 16 Jan 2009 
program multencode
	version 9 
	syntax varlist(string) [if] [in] , Generate(str) [ label(str) FORCE ] 

	marksample touse, novarlist 
	qui count if `touse' 
	if r(N) == 0 error 2000 

	if "`label'" == "" local label : word 1 of `varlist' 

	if "`force'" == "" { 
		capture label list `label' 
		if _rc == 0 { 
			di as err "{p}value labels `label' already exist; " ///
			"specify -force- option to overwrite{p_end}" 
			exit 498 
		}
	}  			 

	local nvars : word count `varlist'
	local mylist "`varlist'" 
	local 0 "`generate'" 
	syntax newvarlist 
	local generate "`varlist'" 
	local ngen : word count `generate'

	if `nvars' != `ngen' { 
		di as err "`nvars' variables, but `ngen' new " ///
		plural(`ngen', "name") 
		exit 198 
	}

	if `nvars' == 1 { 
		encode `mylist' if `touse', gen(`generate') label(`label')  
		exit 0 
	}

	mata : get_distinct_vals("`mylist'", "`touse'") 

	forval j = 1/`J' { 
		label def `label' `j' `"`lbl`j''"', modify 
	} 

	tokenize "`generate'" 
	local j = 1 
	foreach v of local mylist { 
		encode `v' if `touse', gen(``j'') label(`label') 
		qui compress ``j'' 
		local ++j 
	} 
end 

mata : 

void get_distinct_vals(string scalar varnames, string scalar tousename)
{
	string matrix y 
	string colvector vals 
	real scalar j 

	st_sview(y, ., tokens(varnames), tousename) 
	vals = J(0, 1, "") 
	
	for(j = 1; j <= cols(y); j++) { 
		vals = vals \ uniqrows(select(y[,j], y[,j] :!= ""))
	}

	vals = uniqrows(vals)
	_sort(vals, 1)
 
	for(j = 1; j <= rows(vals); j++) { 
		st_local("lbl" + strofreal(j), vals[j,]) 
	}                                                
	st_local("J", strofreal(rows(vals)))
}	

end
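
A short usage sketch may help. It assumes the program and the Mata function above have already been defined in the current session, for example by running them from a do-file or saving them as multencode.ado. The data and the names nfirst, nsecond and fruit are hypothetical.

```stata
* Toy data (hypothetical): two string variables sharing one vocabulary
clear
input str6 first str6 second
"banana" "cherry"
"apple"  "banana"
"cherry" "apple"
"banana" "date"
end

* One call encodes both variables against a single value label
multencode first second, generate(nfirst nsecond) label(fruit)

* The labels are defined before -encode- runs, so the set should be
* alphabetical: 1 apple, 2 banana, 3 cherry, 4 date
label list fruit
tabulate nfirst nsecond
```

Because the distinct values of both variables are collected and sorted before any call to -encode-, a value such as date that occurs only in the second variable still takes its alphabetical place.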


