The Stata listserver

st: RE: RE: RE: RE: suggestion for Stata 8: value labels for string variables


From   "HealthMaps" <[email protected]>
To   <[email protected]>
Subject   st: RE: RE: RE: RE: suggestion for Stata 8: value labels for string variables
Date   Fri, 25 Oct 2002 10:10:31 -0700

Thanks for taking the time to work this out; this is a big help!
The problem with this is that it does add a lot to the size of my file, but
.... I got 2 gigs of memory and lots of disk space. Again, thank you ...

Richard Hoskins

-----Original Message-----
From: [email protected]
[mailto:[email protected]]On Behalf Of Nick Winter
Sent: Friday, October 25, 2002 10:04 AM
To: [email protected]
Subject: st: RE: RE: RE: suggestion for Stata 8: value labels for string
variables


It seems to me that the answer here is to store the "labels" in a
separate string variable.  Then you can use the short name variable or
the long name variable (i.e., the "value" or the "label") as you like.

This is, I think, the direction that Nick Cox is going here.

To do this, and apply it across multiple files, first create a Stata
dataset with two string variables:  the short "value" variable, and a
longer "label" variable. This should have one record per label.  (You
could probably read this using -infix- directly from your SAS PROC
FORMAT text file.)

	. clear

	. infix str4 cause 2-5 str80 label 9-89 using labels.txt
	(13 observations read)

(You will need to play with the column positions given your SAS PROC
FORMAT file, but this gives you the idea...)

	. list


         cause  label
  1.      A391  Waterhouse Friderichsen syndrome'
  2.      A483  Toxic shock syndrome'
  3.      A985  Hemorrhagic fever w renal syndrome'
  4.      B222  HIV dis wasting syndrome'
  5.      B230  Act HIV infect syndrome'
  6.      D469  Myelodysplastic syndrome, unspec'
  7.      D593  Hemolytic uremic syndrome'
  8.      D65'  Diseminated intravascular coagulation [defibrination syndrome]'
  9.      D762  Hemophagocytic syndrome, infect assoc'
 10.      D814  Nezelofs syndrome'
 11.      D820  Wiskott Aldrich syndrome'
 12.      D821  Di Georges syndrome'
 13.      D824  Hyperimmunoglobulin E [IgE] syndrome'

	. replace label=substr(label,1,index(label,"'")-1) if index(label,"'")
	(13 real changes made)

	. replace cause=substr(cause,1,index(cause,"'")-1) if index(cause,"'")
	(1 real change made)

This gets rid of the trailing single quote characters in each
variable...

	. list

         cause  label
  1.      A391  Waterhouse Friderichsen syndrome
  2.      A483  Toxic shock syndrome
  3.      A985  Hemorrhagic fever w renal syndrome
  4.      B222  HIV dis wasting syndrome
  5.      B230  Act HIV infect syndrome
  6.      D469  Myelodysplastic syndrome, unspec
  7.      D593  Hemolytic uremic syndrome
  8.       D65  Diseminated intravascular coagulation [defibrination syndrome]
  9.      D762  Hemophagocytic syndrome, infect assoc
 10.      D814  Nezelofs syndrome
 11.      D820  Wiskott Aldrich syndrome
 12.      D821  Di Georges syndrome
 13.      D824  Hyperimmunoglobulin E [IgE] syndrome

	. sort cause

	. save labelfile, replace
	(note: file labelfile.dta not found)
	file labelfile.dta saved

Sort on the cause variable, and save the master labelling file.

	. use mydata
	. sort cause
	. merge cause using labelfile , nokeep
	. list cause age label

         cause   age  label
  1.      A391    71  Waterhouse Friderichsen syndrome
  2.      A483    32  Toxic shock syndrome
  3.      A985    45  Hemorrhagic fever w renal syndrome
  4.      B222    85  HIV dis wasting syndrome
  5.      B230    91  Act HIV infect syndrome
  6.      D469    56  Myelodysplastic syndrome, unspec
  7.      D593    74  Hemolytic uremic syndrome
  8.       D65    44  Diseminated intravascular coagulation [defibrination syndrome]
  9.      D762    58  Hemophagocytic syndrome, infect assoc
 10.      D814    65  Nezelofs syndrome
 11.      D820    69  Wiskott Aldrich syndrome
 12.      D821    72  Di Georges syndrome
 13.      D824    85  Hyperimmunoglobulin E [IgE] syndrome


	. table cause, c(mean age) stubwidth(40)

-----------------------------------------------------
                                   cause |  mean(age)
-----------------------------------------+-----------
                                    A391 |         71
                                    A483 |         32
                                    A985 |         45
                                    B222 |         85
                                    B230 |         91
                                    D469 |         56
                                    D593 |         74
                                     D65 |         44
                                    D762 |         58
                                    D814 |         65
                                    D820 |         69
                                    D821 |         72
                                    D824 |         85
-----------------------------------------------------

	. table label, c(mean age) stubwidth(40)

-----------------------------------------------------
                                   label |  mean(age)
-----------------------------------------+-----------
                 Act HIV infect syndrome |         91
                     Di Georges syndrome |         72
Diseminated intravascular coagulation [d |         44
                HIV dis wasting syndrome |         85
               Hemolytic uremic syndrome |         74
   Hemophagocytic syndrome, infect assoc |         58
      Hemorrhagic fever w renal syndrome |         45
    Hyperimmunoglobulin E [IgE] syndrome |         85
        Myelodysplastic syndrome, unspec |         56
                       Nezelofs syndrome |         65
                    Toxic shock syndrome |         32
        Waterhouse Friderichsen syndrome |         71
                Wiskott Aldrich syndrome |         69
-----------------------------------------------------
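
A possible follow-up, sketched on the assumption that some cause codes
in a data file might have no entry in labelfile:  -merge- leaves a
variable called _merge behind, and with -nokeep- a value of 1 marks
observations whose code was not found in the labelling file (their
label will be empty).  Dropping _merge afterwards should let the same
steps be repeated on the next file.  The name mydata2 is hypothetical,
standing in for any other file that carries the same cause variable.

	* check how many observations found a label
	. tabulate _merge

	* list any codes that have no entry in labelfile
	. list cause if _merge==1

	* -merge- complains if _merge already exists, so drop it
	. drop _merge

	* repeat for another file (mydata2 is a made-up name)
	. use mydata2
	. sort cause
	. merge cause using labelfile , nokeep
	. drop _merge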


--Nick Winter

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



