st: Retaining variable labels when converting from wide to long form
From: "Tim Stapleton" <[email protected]>
To: <[email protected]>
Date: Tue, 8 Jan 2013 15:19:50 +0100
Hi there
I've converted a .dta file from the wide format shown below to long form,
so that I can analyse the responses of each individual surveyed:
serial	m1_01	m1_02	m1_03	m1_04	m1_05	m1_06	m1_07	m1_08	m1_09	m1_10
1140	male	male	male
1077	male	male	male	male
1096	female	female	female	female	female	female	female	female	female	female
- serial = identifier of the interviewer
- m1 = the question number (there are 51 question items in all, counting
sub-questions such as m3a1 and m3b)
- m1_01 = the response to question 1 from the 1st respondent that the
interviewer interviewed (each interviewer interviewed up to 10 respondents,
so there are plenty of missing values)
- each variable is labelled, e.g. m1_01 = "1. gender"
Running the command:
reshape long m1_@ m2_@ m3a1_@ m3a2_@ m3b_@ m4a_@ m4b_@ m5a_@ m5b_@ m6_@ m7_@ ///
    m8_@ m9_@ m10_@ m11a1_@ m11a2_@ m11b1_@ m11b2_@ m12_@ m13a1_@ m13a2_@ m14_@ ///
    m15a1_@ m15a2_@ m15b1_@ m15b2_@ m16_@ m17_@ m18_@ m19_@ m20_@ m21_@ m22_@ ///
    m23_@ m24_@ m25_@ m26_@ m27_@ m28_@ m29_@ m30_@ m31_@ m32_@ m33_@ m34_@ ///
    m35_@ m36_@ m37_@ m38_@ m39_@ m40_@, i(serial) j(id 01 02 03 04 05 06 07 08 09 10)
puts the data in the following format:
serial	id	m1_	m2_	m3a1_	...
1140	1	male
1140	2	male
1140	3	male
....
1096	1	female
1096	2	female
1096	3	female
...
 
The problem is that in the process I lose the variable labels (i.e. m1_ is
no longer labelled "1. gender" and so on).
Does anyone know how to preserve the variable labels (without relabelling or
renaming each variable individually)?
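One workaround that might apply here, sketched on toy data and not tested on the real survey file: store each label in a local macro before the reshape and reattach it afterwards. The toy values, the "2. age" label and the macro names (stubs, lab_...) are made up for illustration, and the sketch assumes every stub has a labelled _01 variable. It also uses j(id) with the string option instead of the explicit value list above, so that reshape picks up the zero-padded suffixes itself.

```stata
* Toy data in the same wide layout (values and the "2. age" label are illustrative)
clear
input serial m1_01 m1_02 m2_01 m2_02
1140 1 1 30 41
1077 1 . 25 .
1096 2 2 52 47
end
label variable m1_01 "1. gender"
label variable m1_02 "1. gender"
label variable m2_01 "2. age"
label variable m2_02 "2. age"

* Save each stub's label (taken from its _01 variable) before reshaping
local stubs
foreach v of varlist m*_01 {
    * drop the last two characters: m1_01 -> m1_
    local stub = substr("`v'", 1, strlen("`v'") - 2)
    local lab_`stub' : variable label `v'
    local stubs `stubs' `stub'
}

* The string option lets reshape match the suffixes 01, 02, ... as text
reshape long `stubs', i(serial) j(id) string
destring id, replace

* Reattach the saved labels to the long-form variables
foreach stub of local stubs {
    label variable `stub' `"`lab_`stub''"'
}
describe
```

On the real file the same loop would run over all 51 stubs, provided the wildcard m*_01 matches only the survey variables.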
I notice that if I omit the specification "j(id 01...)" from the command
above, the labels are preserved, but because of the naming convention used
for the variables (m1_01 to m1_10 and so on), the command then does not
transform the data correctly.
Also, for future reference, does anyone know how to write this command more
succinctly?  If I abbreviate the new variable names using the range notation
described in the varlist help file (e.g. "m16_@-m40_@"), I get an "implied
name too long" error.
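A possible shortcut, again only a sketch on toy data: since the stubs are not existing variable names, the wildcard can be applied to the _01 variables and the suffix stripped afterwards. The macro names first and stubs are my own, and the approach assumes that "_01" appears in each name only as the suffix.

```stata
* Toy data in the same wide layout (values are illustrative only)
clear
input serial m1_01 m1_02 m2_01 m2_02
1140 1 1 30 41
1077 1 . 25 .
1096 2 2 52 47
end

* Expand the wildcard over the _01 variables, one per stub
unab first : m*_01

* Turn each name into its stub: m1_01 -> m1_, m2_01 -> m2_
local stubs : subinstr local first "_01" "_", all

* The string option matches the zero-padded suffixes; destring makes id numeric
reshape long `stubs', i(serial) j(id) string
destring id, replace
list, sepby(serial)
```

If other variables in the file also end in _01, the wildcard would need to be narrowed so that only the survey items are picked up.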
Any help would be much appreciated.
Kind regards
Tim Stapleton