Stata The Stata listserver

st: RE: RE: RE: RE: RE: suggestion for Stata 8: value labels for string variables


From   "Nick Winter" <nwinter@policystudies.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: RE: RE: RE: suggestion for Stata 8: value labels for string variables
Date   Fri, 25 Oct 2002 13:18:39 -0400

Perhaps a way to get back the memory would be to -encode- the long
"label" variable.  If you do that in the data set of codes and labels,
then there will be a consistent encoding that will come along each time
you merge the labels into a dataset.
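
A minimal sketch of that idea, assuming the labelfile.dta and mydata.dta
files from the example quoted below. The name labelnum is made up for
illustration, and none of this has been run against the real data:

```stata
* Sketch only: file names follow the quoted example; labelnum is hypothetical.
use labelfile, clear

* encode creates an integer variable and keeps the strings as a value label.
* By default the value label gets the same name as the new variable.
encode label, generate(labelnum)

* The long string variable is no longer needed once it is encoded.
drop label

sort cause
save labelfile, replace

* Merge as in the quoted example (merge syntax of the Stata version used
* in this thread). The value label comes along with the using dataset.
use mydata, clear
sort cause
merge cause using labelfile, nokeep
table labelnum, c(mean age) stubwidth(40)
```

Because -encode- numbers the strings in alphabetical order, and the
encoding is done once in the label file, every dataset that merges from
that file should end up with the same codes.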

--Nick


-----------------------------------------------------------
 Nicholas Winter, Ph.D.                     P 202.939.5343
 Policy Studies Associates                  F 202.939.5732
 1718 Connecticut Avenue, NW     nwinter@policystudies.com
 Washington, DC 20009-1148           www.policystudies.com
----------------------------------------------------------- 

> -----Original Message-----
> From: HealthMaps [mailto:healthmaps@attbi.com] 
> Sent: Friday, October 25, 2002 1:11 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: RE: RE: RE: RE: suggestion for Stata 8: value labels for string variables
> 
> 
> Thanks for taking the time to work this out; this is a big help!
> The problem with this is that it does add a lot to the size of my file,
> but .... I got 2 gigs of memory and lots of disk space. Again, thank you ...
> 
> Richard Hoskins
> 
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of Nick Winter
> Sent: Friday, October 25, 2002 10:04 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: RE: RE: RE: suggestion for Stata 8: value labels for string variables
> 
> 
> It seems to me that the answer here is to store the "labels" in a
> separate string variable.  Then you can use the short name variable or
> the long name variable (i.e., the "value" or the "label") as you like.
> 
> This is, I think, the direction that Nick Cox is going here.
> 
> To do this, and apply it across multiple files, first create a Stata
> dataset with two string variables:  the short "value" variable, and a
> longer "label" variable. This should have one record per label.  (You
> could probably read this using -infix- directly from your SAS PROC
> FORMAT text file.)
> 
> 	. clear
> 
> 	. infix str4 cause 2-5 str80 label 9-89 using labels.txt
> 	(13 observations read)
> 
> (You will need to play with the column positions given your SAS PROC
> FORMAT file, but this gives you the idea...)
> 
> 	. list
> 
> 
>          cause                                                            label
>   1.      A391                                Waterhouse Friderichsen syndrome'
>   2.      A483                                            Toxic shock syndrome'
>   3.      A985                              Hemorrhagic fever w renal syndrome'
>   4.      B222                                        HIV dis wasting syndrome'
>   5.      B230                                         Act HIV infect syndrome'
>   6.      D469                                Myelodysplastic syndrome, unspec'
>   7.      D593                                       Hemolytic uremic syndrome'
>   8.      D65'  Diseminated intravascular coagulation [defibrination syndrome]'
>   9.      D762                           Hemophagocytic syndrome, infect assoc'
>  10.      D814                                               Nezelofs syndrome'
>  11.      D820                                        Wiskott Aldrich syndrome'
>  12.      D821                                             Di Georges syndrome'
>  13.      D824                            Hyperimmunoglobulin E [IgE] syndrome'
> 
> 	. replace label=substr(label,1,index(label,"'")-1) if index(label,"'")
> 	(13 real changes made)
> 
> 	. replace cause=substr(cause,1,index(cause,"'")-1) if index(cause,"'")
> 	(1 real change made)
> 
> This gets rid of the trailing single quote characters in each
> variable...
> 
> 	. list
> 
>          cause                                                           label
>   1.      A391                                Waterhouse Friderichsen syndrome
>   2.      A483                                            Toxic shock syndrome
>   3.      A985                              Hemorrhagic fever w renal syndrome
>   4.      B222                                        HIV dis wasting syndrome
>   5.      B230                                         Act HIV infect syndrome
>   6.      D469                                Myelodysplastic syndrome, unspec
>   7.      D593                                       Hemolytic uremic syndrome
>   8.       D65  Diseminated intravascular coagulation [defibrination syndrome]
>   9.      D762                           Hemophagocytic syndrome, infect assoc
>  10.      D814                                               Nezelofs syndrome
>  11.      D820                                        Wiskott Aldrich syndrome
>  12.      D821                                             Di Georges syndrome
>  13.      D824                            Hyperimmunoglobulin E [IgE] syndrome
> 
> 	. sort cause
> 
> 	. save labelfile, replace
> 	(note: file labelfile.dta not found)
> 	file labelfile.dta saved
> 
> Sort on the cause variable, and save the master labelling file.
> 
> 	. use mydata
> 	. sort cause
> 	. merge cause using labelfile , nokeep
> 	. list cause age label
> 
>          cause        age                                                           label
>   1.      A391         71                                Waterhouse Friderichsen syndrome
>   2.      A483         32                                            Toxic shock syndrome
>   3.      A985         45                              Hemorrhagic fever w renal syndrome
>   4.      B222         85                                        HIV dis wasting syndrome
>   5.      B230         91                                         Act HIV infect syndrome
>   6.      D469         56                                Myelodysplastic syndrome, unspec
>   7.      D593         74                                       Hemolytic uremic syndrome
>   8.       D65         44  Diseminated intravascular coagulation [defibrination syndrome]
>   9.      D762         58                           Hemophagocytic syndrome, infect assoc
>  10.      D814         65                                               Nezelofs syndrome
>  11.      D820         69                                        Wiskott Aldrich syndrome
>  12.      D821         72                                             Di Georges syndrome
>  13.      D824         85                            Hyperimmunoglobulin E [IgE] syndrome
> 
> 
> . table cause, c(mean age) stubwidth(40)
> 
> -----------------------------------------------------
>                                    cause |  mean(age)
> -----------------------------------------+-----------
>                                     A391 |         71
>                                     A483 |         32
>                                     A985 |         45
>                                     B222 |         85
>                                     B230 |         91
>                                     D469 |         56
>                                     D593 |         74
>                                      D65 |         44
>                                     D762 |         58
>                                     D814 |         65
>                                     D820 |         69
>                                     D821 |         72
>                                     D824 |         85
> -----------------------------------------------------
> 
> . table label, c(mean age) stubwidth(40)
> 
> -----------------------------------------------------
>                                    label |  mean(age)
> -----------------------------------------+-----------
>                  Act HIV infect syndrome |         91
>                      Di Georges syndrome |         72
> Diseminated intravascular coagulation [d |         44
>                 HIV dis wasting syndrome |         85
>                Hemolytic uremic syndrome |         74
>    Hemophagocytic syndrome, infect assoc |         58
>       Hemorrhagic fever w renal syndrome |         45
>     Hyperimmunoglobulin E [IgE] syndrome |         85
>         Myelodysplastic syndrome, unspec |         56
>                        Nezelofs syndrome |         65
>                     Toxic shock syndrome |         32
>         Waterhouse Friderichsen syndrome |         71
>                 Wiskott Aldrich syndrome |         69
> -----------------------------------------------------
> 
> 
> --Nick Winter
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 



© Copyright 1996–2014 StataCorp LP