The Stata listserver

Re: st: Re: value labels for strings: last comments


From   "Erik Ø. Sørensen" <sameos@mac.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Re: value labels for strings: last comments
Date   Mon, 28 Oct 2002 09:51:56 -0500

On Monday, Oct 28, 2002, at 09:10 America/Montreal, baum wrote:
--On Monday, October 28, 2002 2:33 -0500 Richard wrote:
It requires several extra steps, and encoding one database results in different encodings for another. But most generally, suppose I had several variables with the same code set, as is the case?

The fact that '111' may refer to the 111th disease in integer variable 'diseasecodes' and the 111th zip code in 'zipcodes' is immaterial. They will still display the appropriate contents.
I think Richard's point concerns data of the "wide type":

variable list pid[long] diagn_1[str] diagn_2[str] ... diagn_N[str],
with all the diagn_X variables coded with the same codelist.
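
For illustration, here is a minimal sketch of the encode-based workaround under discussion, applied to wide data. It assumes the variables are named diagn_1 ... diagn_N as above; the label names, short codes, and descriptions are hypothetical placeholders. Because every variable is encoded against one predefined label, the integer codes agree across variables and across datasets.

```stata
* Hypothetical short codes and long descriptions, for illustration only.
label define diagn_short 1 "A01" 2 "B20" 3 "C34"
label define diagn_long  1 "Description for A01" 2 "Description for B20" 3 "Description for C34"

* foreach expands diagn_* once, before the loop body runs, so the
* newly generated *_c variables are not picked up by the loop.
foreach v of varlist diagn_* {
    * Map each string value to the integer whose text in diagn_short matches it.
    encode `v', generate(`v'_c) label(diagn_short)
    * Display the long descriptions instead of the short codes.
    label values `v'_c diagn_long
}
```

By default encode adds any string value it has not seen to the label named in label(); adding the noextend option makes it stop with an error instead.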

If you have a dataset
diagn[str] diagn_code[int]
with a label on the diagn_code variable, merging in the integer-coded variable with its labels has a cost: you end up keeping an extra integer "copy" of every diagn_X variable. There is also the renaming and sorting to take care of, and for interactive use this is a hassle. You can throw out the original strings, but in many cases it is nice to have the short versions as well as the long.
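
To make the renaming cost concrete, here is a rough sketch of the merge route for three diagn_X variables. It assumes a lookup file, here called diagn_lookup.dta (a hypothetical name), holding diagn[str] and a labelled diagn_code[int] with one row per code. It uses the modern merge syntax (Stata 11 or later), which no longer requires sorting first; the syntax available when this thread was written was different.

```stata
* Assumes diagn_lookup.dta (hypothetical) has one row per value of diagn.
forvalues i = 1/3 {
    * The key must carry the same name in both datasets.
    rename diagn_`i' diagn
    * Keep every row of the data in memory, matched or not.
    merge m:1 diagn using diagn_lookup, keep(master match) nogenerate
    * Restore the original name and give the merged code its own name.
    rename diagn diagn_`i'
    rename diagn_code diagn_code_`i'
}
```

Each pass adds one more integer variable alongside the string it was derived from, which is the duplication described above.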


Erik S. said he didn't want to modify a number of huge data sets, which are used by others. If Stata did have a command to create long labels as aliases for short labels, how would one attach the long labels without modifying the data sets on disk?
By having the label defined in a do-file or ado-file and applying it when necessary.
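
As a sketch of what that could look like with the commands Stata already has: label values applies only to numeric variables, so this shows the pattern for an integer-coded variable, not the string labels being asked for. The file name diagnlabels.do, the dataset name bigdata, and the label text are hypothetical.

```stata
* ---- contents of diagnlabels.do (hypothetical file name) ----
label define diagn_long 1 "Description for code 1" 2 "Description for code 2", replace

* ---- in the working session ----
* Load the shared dataset (hypothetical name); it is never saved back.
use bigdata, clear
* Define the label in memory only.
do diagnlabels.do
* Attach it to the integer-coded variable for this session.
label values diagn_code diagn_long
tabulate diagn_code
```

Since nothing is saved, the file on disk is unchanged and other users are unaffected.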


The point is, Stata does have commands that will enable this feature, and they are just about as succinct as they can imaginably be.
I cannot see why you insist on this. Do you disagree that having value labels for strings would be more succinct?

Also, if you have new data coming in with short codes that are unknown, it is better to see the short original string variables than to have to deal with problems in the merging process. The "label" mechanism deals with this in a better way than "merge".


Erik

--
Erik Ø. Sørensen, <http://www.geocities.com/erik_oiolf/>.
phd student (economics), Norwegian School of Economics.
currently visiting Queen's University, Kingston Ontario.




