
From: "Newson, Roger B" <r.newson@imperial.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: Problem with -reshape- and value labels
Date: Wed, 11 Jun 2008 19:48:34 +0100

A possible solution might involve using the descsave package (downloadable from SSC using the ssc command) to save the specifications of variable attributes (including value labels) in a do-file before the first of your reshape commands, and to execute this do-file after the last of your reshape commands.

Before the first of your reshape commands, you might type

    tempfile df0
    descsave resp*, do(`"`df0'"', replace)

to create a do-file in the temporary file specified by `"`df0'"'. Then, after the last of your reshape commands, you might type

    run `"`df0'"'

and, once this do-file has executed, the variables resp1-resp6 will have the variable labels, formats, value labels and storage types that they had in the original dataset.

I hope this helps.

Roger

Roger B Newson
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: r.newson@imperial.ac.uk
Web page: www.imperial.ac.uk/nhli/r.newson/
Departmental Web page: http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/

Opinions expressed are those of the author, not of the institution.

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Clyde Schechter
Sent: 11 June 2008 19:13
To: statalist@hsphsun2.harvard.edu
Subject: st: Problem with -reshape- and value labels

I am having a problem whereby I start out with a data set that has a number of variables with different value labels. The variables' names share a common prefix, and when I reshape the data to long format, it seems that the value label assigned to the _last_ of the variables is carried to the new variable whose name is the common prefix. For example:

    . des

    Contains data
      obs:            10
     vars:             7
     size:           160 (99.9% of memory free)
    ---------------------------------------------------------------
                  storage  display     value
    variable name   type   format      label      variable label
    ---------------------------------------------------------------
    seq             int    %8.0g
    resp1           byte   %8.0g       boolean    1 resp
    resp2           byte   %8.0g       boolean    2 resp
    resp3           byte   %8.0g       boolean    3 resp
    resp4           byte   %8.0g       boolean    4 resp
    resp5           byte   %8.0g       boolean    5 resp
    resp6           byte   %8.0g       other      6 resp
    ---------------------------------------------------------------
    Sorted by:  seq

    . reshape long resp, i(seq) j(item)
    (note: j = 1 2 3 4 5 6)

    Data                               wide   ->   long
    ---------------------------------------------------------------
    Number of obs.                       10   ->      60
    Number of variables                   7   ->       3
    j variable (6 values)                     ->   item
    xij variables:
                      resp1 resp2 ... resp6   ->   resp
    ---------------------------------------------------------------

    . des

    Contains data
      obs:            60
     vars:             3
     size:           720 (99.9% of memory free)
    ---------------------------------------------------------------
                  storage  display     value
    variable name   type   format      label      variable label
    ---------------------------------------------------------------
    seq             int    %8.0g
    item            byte   %9.0g
    resp            byte   %8.0g       other
    ---------------------------------------------------------------
    Sorted by:  seq  item
         Note:  dataset has changed since last saved

But the real problem arises further on:

    <snip> do stuff to resp variable <end snip>

    . reshape wide
    (note: j = 1 2 3 4 5 6)

    Data                               long   ->   wide
    ---------------------------------------------------------------
    Number of obs.                       60   ->      10
    Number of variables                   3   ->       7
    j variable (6 values)              item   ->   (dropped)
    xij variables:
                                       resp   ->   resp1 resp2 ... resp6
    ---------------------------------------------------------------

    . des

    Contains data
      obs:            10
     vars:             7
     size:           160 (99.9% of memory free)
    ---------------------------------------------------------------
                  storage  display     value
    variable name   type   format      label      variable label
    ---------------------------------------------------------------
    seq             int    %8.0g
    resp1           byte   %8.0g       other      1 resp
    resp2           byte   %8.0g       other      2 resp
    resp3           byte   %8.0g       other      3 resp
    resp4           byte   %8.0g       other      4 resp
    resp5           byte   %8.0g       other      5 resp
    resp6           byte   %8.0g       other      6 resp
    ---------------------------------------------------------------
    Sorted by:  seq

Notice now that the value label "other" has been spread onto all of the variables resp1-resp5 that originally had value label "boolean." This then raises problems because I later attempt to select a group of variables for some further analyses with:

    ds, has(vallabel boolean)

which now comes up empty.

I can't get around this by just moving the resp6 variable earlier in the data set: its unique value label gets singled out for the long-format prefix-named variable regardless of where it physically is in the data set. In fact, the workaround seems to be to rename one of the "boolean"-labeled variables to have a name that is alphabetically last. That would keep the "boolean" label from getting wiped out, but then it results in all the variables being so labeled when I reshape back to wide, so the -ds- command then traps variables that should be excluded from further analysis.

Is there any way to have -reshape- restore the original labels? (Evidently I can just relabel them by hand in this example, but the real data set I'm working with has several dozen such variables, so this starts to get impractical.)

I checked the -reshape- section of the manual and I find no mention of anything about how value labels are handled.

Any help would be appreciated. Thanks in advance.

Clyde Schechter
Albert Einstein College of Medicine
Bronx, New York, USA

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
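
Putting the suggestion from the reply at the top of this message together with the commands from the original question, the whole sequence might look roughly like the sketch below. It assumes that descsave has already been installed from SSC, that the variables are named seq and resp1-resp6 as in the example, and that everything runs within a single do-file (the tempfile macro df0 is local, so it is not visible in a later session). The comment line standing in for the long-format work is a placeholder, and the final two commands are only an illustrative check, not part of the original advice.

```stata
* Assumes the user-written descsave package is installed, e.g. via:
*     ssc install descsave

* Save the attributes of resp1-resp6 (variable labels, formats, value
* labels, storage types) to a do-file held in a temporary file
tempfile df0
descsave resp*, do(`"`df0'"', replace)

* Reshape to long format, as in the original question
reshape long resp, i(seq) j(item)

* ... placeholder: work on the long-format variable resp here ...

* Reshape back to wide format
reshape wide

* Run the saved do-file to reapply the original attributes
run `"`df0'"'

* Illustrative check: resp1-resp5 should again carry "boolean"
describe resp*
ds, has(vallabel boolean)
```

If the intermediate work deliberately changes the storage type or format of resp, running the saved do-file would be expected to set those back to the original values too, so it is worth checking that this is what is wanted.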

**References**: **st: Problem with -reshape- and value labels** *From:* Clyde Schechter <cschecht@aecom.yu.edu>
