Statalist



st: RE: AW: RE: combination foreach forvalues


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: AW: RE: combination foreach forvalues
Date   Tue, 20 Oct 2009 12:47:52 +0100

-encode- by default returns a mapping that is strictly alphabetical. As
this is what John is asking for, that is not obviously problematic. 

In any case, -encode- allows you to specify a set of labels to be used. 

What you may be referring to is that sometimes users want encoding in
order of first occurrence in the data. That can be tackled in various
ways, but John doesn't mention this. 
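
For the record, one possible way to get codes in order of first
occurrence is sketched below. It is only an illustration on made-up
data: the names obsno, firstobs and numvar are hypothetical, and the
result carries no value labels.

```stata
clear
input str1 stringvar
"c"
"a"
"c"
"b"
end

* remember the original order of the observations
gen long obsno = _n

* for each distinct string, record where it first appears
bysort stringvar (obsno) : gen long firstobs = obsno[1]

* number the distinct strings by order of first appearance
egen numvar = group(firstobs)

* restore the original order and inspect
sort obsno
list stringvar numvar
```

Here "c" should get code 1, "a" code 2 and "b" code 3, because that is
the order in which they first appear in the data.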

However, it is easy to imagine that if some of the letters a ... z do
not occur in practice, then a straight -encode- may not be what John
wants. Consider this: 

tokenize "`c(alpha)'"

forval i = 1/26 { 
	label def alphabetic `i' "``i''", modify 
}

encode stringvar, gen(numvar) label(alphabetic) 

More complicated alphabets e.g. with accents or diacritical marks
clearly require modified code. 

Nick 
n.j.cox@durham.ac.uk 

Martin Weiss

"-encode- does precisely this."

I remember there being an issue with the order of the codes that
-encode- assigns, that is why I was reluctant to recommend it. Is that
not an issue here?

Nick Cox

This doesn't require either -foreach- or -forvalues-. -encode- does
precisely this. 

Alternatively, 

bysort stringvar : gen newvar = _n == 1 
replace newvar = sum(newvar) 

or 

egen newvar = group(stringvar) 
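
A quick check on a toy dataset suggests that these routes give the same
codes when every distinct string is to be numbered in alphabetical
order. The data and the variable names byencode, byhand and bygroup are
invented for the illustration.

```stata
clear
input str1 stringvar
"b"
"a"
"d"
"b"
end

* route 1: -encode- maps the sorted distinct strings to 1, 2, 3, ...
encode stringvar, gen(byencode)

* route 2: tag the first observation of each group, then cumulate the tags
bysort stringvar : gen byhand = _n == 1
replace byhand = sum(byhand)

* route 3: -egen, group()- numbers the groups in sorted order
egen bygroup = group(stringvar)

list stringvar byencode byhand bygroup
```

Note that "d" gets code 3 here, not 4, because "c" does not occur in the
data.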

There is not much to explain about combining -foreach- and -forvalues-,
as you just do it if and when you need it, typically by nesting one
inside the other. But that's not the case here. 
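
For completeness, a minimal sketch of such nesting, unrelated to John's
problem: the variables x and y and the generated names are hypothetical.

```stata
clear
set obs 5
gen x = _n
gen y = 2 * _n

* outer loop over variables, inner loop over powers
foreach v of varlist x y {
    forvalues k = 2/3 {
        gen `v'_pow`k' = `v'^`k'
    }
}

describe
```

This creates x_pow2, x_pow3, y_pow2 and y_pow3.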

John Bunge

I have a string variable x1 with a list of values. I want to create a
numerical variable x2 in which the numbers correspond to the string
values in x1 in an ordered fashion (as a counter).

To illustrate, let's assume x1 contains all letters of the alphabet, and
I want x2 to contain a counter that corresponds to the position of the
letter in the alphabet, i.e. x1=a > x2=1, x1=b > x2=2, x1=c > x2=3,
etc...

This seems to me like a combination of foreach and forvalues, but I
cannot find information on whether and how such a thing is implementable
in Stata.

