Statalist


RE: st: Re: -encode- help..


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Re: -encode- help..
Date   Thu, 20 Nov 2008 21:10:13 -0000

This procedure, and sequential application of -encode- as earlier
mentioned, leave open a small problem: the set of value labels produced
need not end up alphabetically ordered, which may offend some ideas of
tidiness.  

Sorting that out is also programmable. The resulting utility can also be
used to clone a set of value labels that is already alphabetically
ordered; or to shift the origin to 0 or any other integer, as separately
requested by Moleps. 

Here is a utility in that spirit. 

*! 1.0.0 NJC 20 Nov 2008
program labvalsort  
	version 8 
	syntax namelist(min=2 max=2)  [, start(int 1) ] 
	tokenize "`namelist'" 
	args exist new 
	
	capture label list `new' 
	if _rc == 0 { 
		di as err "labels `new' already exist" 
		exit 198 
	} 

	label list `exist' 

	preserve 

	// -clear- lets -uselabel- replace changed data; -preserve- keeps the original safe 
	uselabel `exist', clear 
	tempfile file
	tempname out 
	sort label 

	file open `out' using `"`file'"', w
	forval i = 1/`=_N' { 
		local I = `start' + `i' - 1 
		file write `out' "label define `new' `I' " 
		local lbl = label[`i'] 
		// compound quotes protect labels that contain spaces or double quotes 
		file write `out' `"`"`lbl'"', modify"' _n 
	}
	file close `out'

	restore 
	
	qui do `"`file'"'   
	di 
	label li `new' 
end

Here are a few examples:

. label def foo 1 "d" 2 "c" 3 "b" 4 "a"

. labvalsort foo bar
foo:
           1 d
           2 c
           3 b
           4 a

bar:
           1 a
           2 b
           3 c
           4 d

. labvalsort foo bar2 , start(0)
foo:
           1 d
           2 c
           3 b
           4 a

bar2:
           0 a
           1 b
           2 c
           3 d 

Note that _all_ this does is to produce a new set of value labels that
then can be used in a subsequent -encode-. 

To put it in a nutshell, -labvalsort- is a solution to the following
problem: 

You apply -encode- to various string variables that have overlapping
values, and you want the same set of value labels to be used in
producing a consistent set of numeric variables. But somehow or other
that set of value labels ends up out of alphabetical sequence. Thus, you
need to sort the labels. Then you can use the new sorted set in an
-encode-. (They will only by accident apply to any existing numeric
variables.) 
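
As a minimal sketch of that workflow, assuming -labvalsort- has been saved where Stata can find it (or run from a do-file beforehand): the variable names s1, s2, n1, n2 and the label names foo and bar are all hypothetical, chosen only for illustration.

```stata
* Hypothetical string variables with overlapping values
clear
input str1 s1 str1 s2
"d" "a"
"c" "b"
"b" "d"
end

* First pass: the shared label set grows in order of encounter,
* so it need not end up alphabetical
encode s1, gen(n1) label(foo)
encode s2, gen(n2) label(foo)

* Build a sorted copy of the label set
labvalsort foo bar

* Second pass: re-encode against the sorted set
drop n1 n2
encode s1, gen(n1) label(bar)
encode s2, gen(n2) label(bar)
```

The first-pass numeric variables are dropped rather than relabelled because their codes were assigned under the unsorted set; attaching the sorted labels to them would mislabel the values.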

Nick 
n.j.cox@durham.ac.uk 

Nick Cox

I agree with Martin. Here's a quick stab at it. 

*! 1.0.0 NJC 19 Nov 2008 
program mencode
	version 8 
	syntax varlist(string) [if] [in] , stub(str) [ label(str) ] 

	if index("`stub'", "@") == 0 { 
		di as err "stub must contain @" 
		exit 198
	}
	
	// test the variable names 
	foreach v of local varlist { 
		local new : subinstr local stub "@" "`v'" 
		confirm new var `new'
	} 

	marksample touse, strok 
	qui count if `touse' 
	if r(N) == 0 error 2000 

	if "`label'" == "" local label "`: word 1 of `varlist''"
	
	// do it 
	foreach v of local varlist { 
		local new : subinstr local stub "@" "`v'" 
		encode `v' if `touse', gen(`new') label(`label') 
		qui compress `new' 
	} 
end 

Comments: 

-mencode- works on a string varlist. 

You may specify -if- or -in-. 

You must specify a -stub()-. The stub must include the character @,
which is a placeholder for each existing variable name. You should add a
prefix or suffix or both. So if your stub is "n@", the new variable names
will be prefixed by "n". -mencode- checks first that the new names implied
will be OK. 

You may specify a name for the new value labels. If you don't, -mencode-
will use the name of the first variable you specify.  	

Here's an example: 

. l var?

     +---------------------------+
     | var1   var2   var3   var4 |
     |---------------------------|
  1. |    a      b      c      d |
  2. |    a      b      c      d |
  3. |    a      b      c      d |
  4. |    a      b      c      d |
     +---------------------------+

. mencode var?, stub(n@)

. l var? nvar?

     +-----------------------------------------------------------+
     | var1   var2   var3   var4   nvar1   nvar2   nvar3   nvar4 |
     |-----------------------------------------------------------|
  1. |    a      b      c      d       a       b       c       d |
  2. |    a      b      c      d       a       b       c       d |
  3. |    a      b      c      d       a       b       c       d |
  4. |    a      b      c      d       a       b       c       d |
     +-----------------------------------------------------------+

. l var? nvar?, nola

     +-----------------------------------------------------------+
     | var1   var2   var3   var4   nvar1   nvar2   nvar3   nvar4 |
     |-----------------------------------------------------------|
  1. |    a      b      c      d       1       2       3       4 |
  2. |    a      b      c      d       1       2       3       4 |
  3. |    a      b      c      d       1       2       3       4 |
  4. |    a      b      c      d       1       2       3       4 |
     +-----------------------------------------------------------+
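
The other options described in the comments could be exercised along the following lines. This is only a sketch, assuming -mencode- as defined above is installed; the label name letters and the stubs @_n and first_@ are hypothetical.

```stata
* Recreate string variables like those in the example
clear
input str1 var1 str1 var2 str1 var3 str1 var4
"a" "b" "c" "d"
"a" "b" "c" "d"
end

* Suffix rather than prefix, and an explicitly named label set
mencode var?, stub(@_n) label(letters)
label list letters

* Restrict to the first observation; the stub must yield new names.
* The varlist is spelled out because var? would now also match var1_n etc.
mencode var1 var2 var3 var4 in 1, stub(first_@)
```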

Martin Weiss

Nick's contribution makes me think that it is possible to automate this
in the fashion that you describe...

moleps islon

> So apparently no easy solution to this. The perfect solution would be
> a command that accepted a varlist, automatically generated new
> variables concatenating the old variable name with a user-specified
> _name_ and labeled the values according to a predefined label set...
> That would also let the user set the start number for the codes...
> Gotta learn programming:-)
>
>>>> _______________________
>>>> ----- Original Message ----- From: "moleps islon" <moleps2@gmail.com>
>>>> To: <statalist@hsphsun2.harvard.edu>
>>>> Sent: Wednesday, November 19, 2008 8:30 PM
>>>> Subject: st: -encode- help..
>>>>
>>>>
>>>>> I've got 30 different text variables that all have the same possible
>>>>> values. Is there an easy way to encode all 30 variables using the same
>>>>> label, or do I have to do it manually? Also, is it possible somehow to
>>>>> tell Stata to start encoding with the value 0 instead of 1?
