Statalist The Stata Listserver



st: RE: Create String Variable out of Variable Names


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Create String Variable out of Variable Names
Date   Thu, 26 Oct 2006 12:47:30 +0100

I can make sense of (part of) this only by correcting 
the terminology. 

By "variable names" I think you mean "text or string 
values". 

Also, your code (with an implicit first line) 

set obs 100 
egen var = seq(), f(1) t(5)
replace var = "a" if var==1
replace var = "b" if var==2

just wouldn't work. -var- is born numeric, so the first 
-replace- would fail with a type mismatch error. It cannot
be assigned string values without intervening major
surgery to change it from numeric to string. 

This would work: 

set obs 100 
egen var = seq(), f(1) t(5)
tostring var, replace 
tokenize "a b c d e" 
forval i = 1/5 { 
	replace var = "``i''" if var == "`i'" 
} 
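
In case the nested quotes look odd: -tokenize- puts the words 
into local macros 1, 2, 3, and so on, so when i is 3, ``i'' is 
read first as `3' and then as c. A quick check (the value 3 is 
chosen only for illustration): 

. tokenize "a b c d e"
. local i = 3
. di "``i''"
c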

The examples 

. di word("`c(alpha)'", 13)
m

. di substr("abcdefghijklmnopqrstuvwxyz", 13,1)
m

show that particular letters of the alphabet 
can be selected by position, so that a 
mapping from numbers to letters of the alphabet
could also be achieved without -tokenize-, for example by 

forval i = 1/26 { 
	replace var = word("`c(alpha)'", `i') if var == "`i'" 
} 

or 

local alpha "abcdefghijklmnopqrstuvwxyz"
forval i = 1/26 { 
	replace var = substr("`alpha'", `i', 1) if var == "`i'" 
} 
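
A loop is not strictly needed for this mapping. As a sketch only, 
assuming -var- is still numeric (that is, -tostring- has not been 
applied) and holds integers between 1 and 26, -word()- can take 
the variable itself as its second argument. The name -letter- is 
made up for this example: 

clear 
set obs 100 
egen var = seq(), f(1) t(5)
* pick the var-th word of "a b c ... z" in each observation 
gen letter = word("`c(alpha)'", var) 
tab letter 

This leaves the numeric -var- in place next to the new string 
variable, which may or may not be what is wanted. 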

I am afraid that I can't follow the rest of your posting, 
especially what the 64 variables (values?) are that you wish to work 
with. In particular, I can't comment on a -foreach- loop
that you don't show us. 

The package -egenmore- on SSC includes an -egen- function -repeat()-
that is an analogue of -egen-'s -seq()- function: it can produce
string variables with values like "a", "b", "c", "d", "e", "a", 
... directly. 
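
A sketch of how that might look, assuming the -values()- option 
as I recall it from the -egenmore- help; please check 
-help egenmore- after installing for the exact syntax: 

ssc install egenmore 
clear 
set obs 100 
* cycle through a, b, c, d, e down the observations 
egen var = repeat(), values(a b c d e) 
tab var 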

Nick
n.j.cox@durham.ac.uk 

Kyle C. Longest
 
> I have a set of variables that I would like to use to generate a 
> singular String variable. The number of cases in each resulting value 
> is inconsequential.
> 
> So for example if I had 5 variables (a b c d e) in a data set with 100 
> observations I would like a String variable that looked something like:
> 
> Var |  Frequency
> a              20
> b              20
> c              20
> d              20
> e              20
> 
> Now I know that in this simple case I could do this line by line with:
> 
> egen var = seq(), f(1) t(5)
> replace var = "a" if var==1
> replace var = "b" if var==2
> ...
> 
> But in the case of 64 variables this is unrealistically tedious, 
> especially as I am hoping to incorporate this into a more encompassing 
> program to be used with different sets of variables.
> 
> The problem I've run into with -foreach- is that it loops so that in 
> the end the resulting variable looks like (using the example from 
> above).
> 
> Var | Freq
> e      100
> 
> [Also note that the variables are not mutually exclusive (à la 
> dummies), so that a case could ==1 on several of the variables]
> 
> Any help with this would be greatly appreciated,
