The Stata listserver

st: RE: Copy string variable as value label


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Copy string variable as value label
Date   Mon, 25 Apr 2005 15:46:43 +0100

This problem, or at least a relative of it, can I think be 
attacked using Roger Newson's -sencode-. 

His solution includes a certain amount of file manipulation. 
When I looked at my version of the problem two years ago 
I didn't find any need for that, but I haven't looked closely 
enough to work out which aspects of the problem Roger solves 
that I don't, or indeed vice versa. 

There doesn't seem to be a help file for my resulting program, 
but the code is a bit more general than yours. 

program seqencode, sortpreserve 
*! NJC 1.0.0 1 May 2003 
	version 8 
	syntax varname(string) [if] [in], Generate(str) [ Label(str) Unique ]

	// maximum number of value labels that can be defined in this flavor 
	local limit = cond(c(flavor) == "Small", 1000, 65536) 

	quietly { 
		marksample touse, strok 
		count if `touse' 
		if r(N) == 0 error 2000 
		
		// variable is new? 
		confirm new variable `generate' 
			
		// label is new? 	
		if "`label'" == "" local label "`generate'"
		capture label list `label' 
		if _rc != 111 { 
			di as err "label `label' already defined" 
			exit 110 
		} 	

		if "`unique'" != "" {
			// each selected observation gets its own value and label 
			replace `touse' = -`touse' 
			sort `touse' `_sortindex'
			
			// define labels 
			count if `touse'
			if `r(N)' > `limit' error 134 
			forval i = 1 / `r(N)' { 
				label def `label' `i' ///
					`"`= `varlist'[`i']'"', modify 
			} 

			gen long `generate' = _n  if `touse' 
		} 
		else { 
			// get first occurrences  
			tempvar first 
			bysort `touse' `varlist' (`_sortindex') : ///
				gen byte `first' = -(_n == 1 & `touse')
			sort `first' `_sortindex' 

			// define labels 
			count if `first' 
			if `r(N)' > `limit' error 134 
			forval i = 1 / `r(N)' { 
				label def `label' `i' ///
					`"`= `varlist'[`i']'"', modify 
			} 
			
			// copy values from first occurrences 
			gen long `generate' = _n  if `touse' 
			bysort `touse' `varlist' (`generate'): /// 
				replace `generate' = `generate'[1]
		} 
				
		compress `generate'

		// assign labels 
		label val `generate' `label'
		label var `generate' `"`: variable label `varlist''"'
	} 	
end 
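
As a rough sketch of how the program above might be used on the example 
data in the original question: this assumes the program has already been 
run so that -seqencode- is defined, and the -input- block simply retypes 
the six observations from the question. 

```stata
* Retype the six observations from the question (illustrative only)
clear
input str1 var mean
a 1.5
b 1.2
b 1.2
b 1.2
c 1.8
c 1.8
end

* Put the data in the order that should drive the coding
sort mean

* Codes follow order of first appearance in the current sort order,
* not the alphabetical order that -encode- would use
seqencode var, generate(newvar)

* Inspect the label definitions and the resulting codes
label list newvar
list var mean newvar, nolabel
```

With the data sorted by mean, the codes should follow the order of first 
appearance, so b, a and c would be labelled 1, 2 and 3. Adding the 
-unique- option would instead give every selected observation its own code. 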

Nick 
n.j.cox@durham.ac.uk 

Friedrich Huebler wrote:
 
> When a string variable is converted to a numeric variable with
> -encode-, the numeric values follow the sort order of the string
> variable. I would like to -encode- a string variable based on the
> sort order of another variable. My original data is like this:
> 
> var   mean
> a     1.5
> b     1.2
> b     1.2
> b     1.2
> c     1.8
> c     1.8
> 
> I would like to create the variable "newvar" like this, using the
> sort order of the variable "mean":
> 
> var   mean   newvar   (label for newvar)
> b     1.2    1        b
> b     1.2    1        b
> b     1.2    1        b
> a     1.5    2        a
> c     1.8    3        c
> c     1.8    3        c
> 
> My solution is shown below. Creating "newvar" itself is simple but
> there must be a better way to assign the labels.
> 
> sort mean
> egen newvar = group(mean)
> lab def newvar 1 "temp"
> levels(newvar), local(levels)
> foreach l of local levels {
>   gen temp = ""
>   replace temp = var if newvar==`l'
>   levels(temp), local(templabel)
>   lab def newvar `l' `templabel', modify
>   drop temp
> }
> lab val newvar newvar
> 
> How can this code be improved? Thank you for your suggestions.

