st: foreach / forvalues loop error

From   Steve Nakoneshny <[email protected]>
To   "[email protected]" <[email protected]>
Subject   st: foreach / forvalues loop error
Date   Fri, 26 Aug 2011 10:56:50 -0600

Dear Statalisters,

I have a series of variables that are entered into a head and neck cancer database as categorical variables. However, the database stores and exports them as strings. We have also pre-specified the numeric value we wish each category to hold in the corresponding categorical variable. Prior to analysing the data using Stata 11.2, I need to convert these data from string to categorical variables. What I have done thus far is to use -encode- and then -recode- to map the values to our pre-specified ones.

The main flaw in this approach is that the data contained in each of these variables will differ depending on the query entered into the database, so while -encode- will always work, I have to change the -recode- values for each dataset in order to make different datasets compatible. This is too labour-intensive and highly error-prone.
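To illustrate the problem, here is a minimal sketch of the -encode- then -recode- workflow for one variable. It assumes that all eleven primary sites happen to be present in the extract; the names site_enc and primsite_rc are hypothetical and used only for this illustration.

```stata
* -encode- numbers the distinct strings in alphabetical order, so with all
* eleven sites present: 1 = Hypopharynx, 2 = Larynx, 3 = Nasopharynx,
* 4 = Oral Cavity, 5 = Oropharynx, 6 = Other Site, 7 = Paranasal/Nasal,
* 8 = Salivary Gland, 9 = Skin, 10 = Thyroid, 11 = Unknown Primary
encode Primary_site, generate(site_enc)

* map the alphabetical codes onto our pre-specified codes
* (8 and 10 already match and are left unchanged)
recode site_enc (1=3) (2=4) (3=5) (4=1) (5=2) (6=11) (7=6) (9=7) (11=9), generate(primsite_rc)
```

If a query returns no cases for one site (say, no Larynx), every later category is shifted down by one in site_enc and the -recode- rules above silently assign the wrong codes, which is why they have to be rewritten for each dataset.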

I have explored -egen newvar = group(varlist), label- as an alternative, except that it too is limited by the context of a given database query and I am unsure how to remap the values and value labels to match our specifications.

To move beyond the query context issue and write our code to be useful in a production environment (i.e. repetitive and consistent), I have written ~15 lines of code that will do what I need. Here is one of the variables:

generate primsitenum = 1 if Primary_site=="Oral Cavity"
replace primsitenum = 2 if Primary_site=="Oropharynx"
replace primsitenum = 3 if Primary_site=="Hypopharynx"
replace primsitenum = 4 if Primary_site=="Larynx"
replace primsitenum = 5 if Primary_site=="Nasopharynx"
replace primsitenum = 6 if Primary_site=="Paranasal/Nasal"
replace primsitenum = 7 if Primary_site=="Skin"
replace primsitenum = 8 if Primary_site=="Salivary Gland"
replace primsitenum = 9 if Primary_site=="Unknown Primary"
replace primsitenum = 10 if Primary_site=="Thyroid"
replace primsitenum = 11 if Primary_site=="Other Site"

label variable primsitenum "Primary Site Numerically Coded"
label define primsitelab 1 "Oral Cavity" 2 "Oropharynx" 3 "Hypopharynx" 4 "Larynx" 5 "Nasopharynx" 6 "Paranasal Sinus" 7 "Skin"  8 "Salivary Gland" 9 "Unknown Primary" 10 "Thyroid" 11 "Other Site" 99 "Not Stated"
label values primsitenum primsitelab

As inelegant as this is, it works. The structure of this code leads me to think that it could be rewritten far more simply using a loop. I am a relative neophyte with Stata and have just begun to explore the use of -foreach- and -forvalues-. I've written the following code as a first attempt:

gen primsitenum = .

local x Oral Cavity Oropharynx Hypopharynx Larynx Nasopharynx Paranasal Sinus Skin Salivary Gland Unknown Primary Thyroid Other Site

forvalues n = 1/11 {
	replace primsitenum = `n' if Primary_site =="`x'"
}

This loop executes without any errors, but does not produce any usable output (I get 11 consecutive remarks of "0 real changes made"). I feel that I'm close to solving this problem, but I clearly don't know enough to cross the finish line. I've searched extensively in -help- and online, but nothing I've read thus far has provided me with a "Eureka!" moment. Any assistance would be greatly appreciated.

Steve Nakoneshny
Research Assistant
Ohlson Research Initiative
University of Calgary - Faculty of Medicine
3280 Hospital Dr. NW
Calgary AB T2N 4Z6
Tel: (403) 220-4347
Fax: (403) 270-3145
