Statalist The Stata Listserver



st: RE: accessing a variable's value WRT its attached value label


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: accessing a variable's value WRT its attached value label
Date   Wed, 29 Mar 2006 23:13:16 +0100

My impression is that you know all the pertinent
Stata commands in this area, so I am reduced
to focusing on "readability" itself. 

The question in return has to be: readable by 
whom, or more precisely by what class of program 
reader? What can be assumed of their Stata 
experience and competence? 

Examples: 

0. You don't say anything about comments. 
If inscrutable code is your concern, tuning
a comment to give an explanation at the 
right level is the simplest thing. 
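
For instance, the expression from the original question could be
commented along these lines. This is only a sketch: the toy data
and the label name yesno are made up for illustration and are not
from the original post.

```stata
* Toy data (hypothetical): y carries the value label yesno
clear
input y
1
0
1
end
label define yesno 0 "no" 1 "yes"
label values y yesno

* Count observations for which y is labelled "yes".
* The extended macro function looks up the name of the value label
* attached to y, so the line still works if the label is renamed
* or the coding of "yes" changes.
count if y == "yes":`: value label y'
```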

1. As a Stata user who also writes programs, 
I find the use of a local macro and of an extended 
macro function unthreatening, but I still have
to work a smidgen too much to realise that 

`: val l y' 

is a minimal abbreviation of 

`: value label y'

Having also worked with highly concise 
languages, such as J, in which almost 
all primitives are one or two characters 
long, I have to say that readability 
does often hinge on avoiding minimal 
abbreviations. 

(That's why, I think, most of us do write -gen- 
even though we would save many, many 
keystrokes by writing -g-. Roger Newson
always writes -gene-, but then he was once
a real biologist, so that perhaps is why.) 
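
One way to act on this, as a sketch only, is to spell the extended
macro function out in full and hold the result in a named local
macro. The macro name ylabel, the label name yesno and the toy data
are assumptions for illustration, not part of the original post.

```stata
* Toy data (hypothetical): y carries the value label yesno
clear
input y
1
0
1
end
label define yesno 0 "no" 1 "yes"
label values y yesno

* Name of the value label attached to y, written without abbreviation
local ylabel : value label y

* Count observations whose value of y is labelled "yes"
count if y == "yes":`ylabel'
```

The extra line costs little, and the name of the local macro tells
the reader what the second line is looking up.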

2. The construct 

"label":valuelabelname 

is prominently documented, but my impression
is that it is fairly rarely used. (Shoot that
down, readers, if you use it all the time.) 
So, what can be done here? Well, doing 
a -decode- on the fly and testing a string 
value might be much more readable
to some readers. Naturally, that is an 
extra variable at least briefly, and so forth. 
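
A sketch of that -decode- approach, assuming it runs inside a
do-file or program so that -tempvar- is available; the toy data,
the label name yesno and the macro name ytext are made up for
illustration.

```stata
* Toy data (hypothetical): y carries the value label yesno
clear
input y
1
0
1
end
label define yesno 0 "no" 1 "yes"
label values y yesno

* Decode y into a temporary string variable holding the label text
tempvar ytext
decode y, generate(`ytext')

* Test the string directly; no label name appears in the code
count if `ytext' == "yes"
```

The temporary variable is dropped automatically when the do-file
or program finishes, so the extra variable is indeed only brief.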

Nick 
n.j.cox@durham.ac.uk 

Phil Schumm wrote:
 
> Suppose I have a numeric variable y, and suppose it has an attached
> value label which maps 1 -> "yes".  Now, suppose I want to count
> those cases for which y "is" yes (I'm clearly using "is" in the loose
> sense here, but I think the meaning is clear from the context).  I
> could of course type:
> 
> count if y == 1
> 
> however if the encoding of the variable changes (i.e., if the value 1
> no longer refers to "yes"), this will (silently) no longer give me
> what I want.  Alternatively (and in many cases better), I could write:
> 
> count if y == "yes":y_label
> 
> where y_label is the name of the corresponding value label.  In this  
> case, if the encoding changes I'm still ok (assuming that "yes" is  
> still used in the label).  But what if the *name* of the value label  
> changes?  I could do this:
> 
> count if y == "yes":`:val l y'
> 
> but that's a bit ugly.  So, my question is, is there a more readable  
> alternative to the expression above?
> 
> For those interested in the use case, I am working on a large project
> involving a complicated data set and a lot of code to manage the
> data.  Right now, most variables have a dedicated value label with
> the same name as the variable itself.  However, in the future, we may
> wish to economize by replacing multiple labels that have equivalent
> definitions with a single label.  This new label will clearly have a
> different name than the previous label(s) for at least one of the
> variables involved.  What I'm trying to do is to make sure that the
> code downstream will not break if such changes are made.



