From: "Newson, Roger B" <r.newson@imperial.ac.uk>

To: <statalist@hsphsun2.harvard.edu>

Subject: st: RE: Problem with -reshape- and value labels

Date: Wed, 11 Jun 2008 19:48:34 +0100

A possible solution might involve using the descsave package (downloadable from SSC using the ssc command) to save the specifications of variable attributes (including value labels) in a do-file before the first of your reshape commands, and to execute this do-file after the last of your reshape commands.

Before the first of your reshape commands, you might type

    tempfile df0
    descsave resp*, do(`"`df0'"', replace)

to create a do-file in the temporary file specified by `"`df0'"'. Then, after the last of your reshape commands, you might type

    run `"`df0'"'

and the variables resp1-resp6 will have the variable labels, formats, value labels and storage types that they had in the original dataset, following the execution of this do-file.

I hope this helps.
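Placed around the reshape commands from the original question, the steps above might be sequenced roughly as in the sketch below. It assumes that descsave has already been installed from SSC and that everything runs within a single do-file, so that the local macro df0 created by tempfile is still defined when run is called. The "do stuff" line is only a placeholder for whatever is done to resp in long format.

```stata
* One-time setup, if descsave is not yet installed:
*   ssc install descsave

* Save the attributes of resp1-resp6 (variable labels, formats,
* value labels, storage types) to a temporary do-file
tempfile df0
descsave resp*, do(`"`df0'"', replace)

* Reshape to long format and work on the combined variable
reshape long resp, i(seq) j(item)
* ... do stuff to resp variable (placeholder) ...

* Reshape back to wide format
reshape wide

* Re-apply the saved attributes to resp1-resp6
run `"`df0'"'

* The original selection should now find the "boolean" variables again
ds, has(vallabel boolean)
```

Because tempfile names live in local macros, the saved do-file is no longer reachable once the calling do-file ends; a permanent filename could be used instead if the attributes need to be restored in a later session.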

Roger

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Clyde Schechter
Sent: 11 June 2008 19:13
To: statalist@hsphsun2.harvard.edu
Subject: st: Problem with -reshape- and value labels

I am having a problem whereby I start out with a data set that has a number of variables with some different value labels. The variables' names share a common prefix, and when I reshape the data to long format, it seems that the value label assigned to the _last_ of the variables is carried to the new variable that equals the common prefix. For example:

. des

Contains data
  obs:            10
 vars:             7
 size:           160 (99.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
seq             int    %8.0g
resp1           byte   %8.0g       boolean    1 resp
resp2           byte   %8.0g       boolean    2 resp
resp3           byte   %8.0g       boolean    3 resp
resp4           byte   %8.0g       boolean    4 resp
resp5           byte   %8.0g       boolean    5 resp
resp6           byte   %8.0g       other      6 resp
-------------------------------------------------------------------------------
Sorted by:  seq

. reshape long resp, i(seq) j(item)
(note: j = 1 2 3 4 5 6)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                       10   ->      60
Number of variables                   7   ->       3
j variable (6 values)                     ->   item
xij variables:
                  resp1 resp2 ... resp6   ->   resp
-----------------------------------------------------------------------------

. des

Contains data
  obs:            60
 vars:             3
 size:           720 (99.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
seq             int    %8.0g
item            byte   %9.0g
resp            byte   %8.0g       other
-------------------------------------------------------------------------------
Sorted by:  seq  item
     Note:  dataset has changed since last saved

But the real problem arises further on:

<snip>
do stuff to resp variable
<end snip>

. reshape wide
(note: j = 1 2 3 4 5 6)

Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                       60   ->      10
Number of variables                   3   ->       7
j variable (6 values)              item   ->   (dropped)
xij variables:
                                   resp   ->   resp1 resp2 ... resp6
-----------------------------------------------------------------------------

. des

Contains data
  obs:            10
 vars:             7
 size:           160 (99.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
seq             int    %8.0g
resp1           byte   %8.0g       other      1 resp
resp2           byte   %8.0g       other      2 resp
resp3           byte   %8.0g       other      3 resp
resp4           byte   %8.0g       other      4 resp
resp5           byte   %8.0g       other      5 resp
resp6           byte   %8.0g       other      6 resp
-------------------------------------------------------------------------------
Sorted by:  seq

Notice now that the value label "other" has been spread on to all of the variables resp1-resp5 that originally had value label "boolean." This then raises problems because I later attempt to select a group of variables for some further analyses with:

ds, has(vallabel boolean)

which now comes up empty.

I can't get around this by just moving the resp6 variable earlier in the data set: its unique value label gets singled out for the long-format prefix-named variable regardless of where it physically is in the data set. In fact, the workaround seems to be to rename one of the "boolean" labeled variables to have a name that is alphabetically last. That would keep the "boolean" label from getting wiped out, but then it results in all the variables being so labeled when I reshape back to wide, so the -ds- command then traps variables that should be excluded from further analysis.

Is there any way to have -reshape- restore the original labels? (Evidently I can just relabel them by hand in this example, but the real data set I'm working with has several dozen such variables, so this starts to get impractical.)

I checked the -reshape- section of the manual and I find no mention of anything about how value labels are handled.

Any help would be appreciated. Thanks in advance.

Clyde Schechter
Albert Einstein College of Medicine
Bronx, New York, USA

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

