



st: Retaining variable labels when converting from wide to long form


From   "Tim Stapleton" <tgs1983@gmail.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Retaining variable labels when converting from wide to long form
Date   Tue, 8 Jan 2013 15:19:50 +0100

Hi there

I've converted a .dta file from the following wide format to long form in
order to analyse the responses of each individual surveyed:

serial	m1_01	m1_02	m1_03	m1_04	m1_05	m1_06	m1_07	m1_08	m1_09	m1_10
1140	male	male	male
1077	male	male	male	male
1096	female	female	female	female	female	female	female	female	female	female

- serial = identifier of the interviewer
- m1 = the question number (there are 51 response categories)
- m1_01 = the response to question 1 from the 1st respondent that the
interviewer interviewed (each interviewer interviewed up to 10 respondents,
so there are plenty of missing values)
- each variable is labelled, e.g. m1_01 = "1. gender"

Using the command:

reshape long m1_@ m2_@ m3a1_@ m3a2_@ m3b_@ m4a_@ m4b_@ m5a_@ m5b_@ m6_@ ///
    m7_@ m8_@ m9_@ m10_@ m11a1_@ m11a2_@ m11b1_@ m11b2_@ m12_@ m13a1_@ ///
    m13a2_@ m14_@ m15a1_@ m15a2_@ m15b1_@ m15b2_@ m16_@ m17_@ m18_@ m19_@ ///
    m20_@ m21_@ m22_@ m23_@ m24_@ m25_@ m26_@ m27_@ m28_@ m29_@ m30_@ ///
    m31_@ m32_@ m33_@ m34_@ m35_@ m36_@ m37_@ m38_@ m39_@ m40_@, ///
    i(serial) j(id 01 02 03 04 05 06 07 08 09 10)

puts the data in the following format:

serial	id	m1_	m2_	m3a1_	...
1140	1	male
1140	2	male
1140	3	male
....
1096	1	female
1096	2	female
1096	3	female
...
 
The problem is that in the process I lose the variable labels (i.e. m1_ is
no longer labelled "1. gender" and so on).
Does anyone know how to preserve the variable labels (without relabelling or
renaming each variable)?

I notice that if I omit the specification "j(id 01...)" from the command
above, the labels are preserved but because of the naming convention used
for the variables (m1_01 to _10 and so on), running this command does not
transform the data correctly.

Also for future reference does anyone know how to write this command more
succinctly?  If I summarise the new variable names in the format recommended
in the varlist help file (e.g. "m16_@-m40_@"), I get an "implied name too
long" error.
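
One possible approach to both points, offered only as an untested sketch
under stated assumptions: build the stub list from the variable names, copy
each label into a local macro before reshaping, and re-attach it afterwards.
It assumes that every variable ending in _01 is a question variable, that the
label of the _01 variable is the one wanted for the whole stub, and that the
string option of reshape is acceptable for picking up the 01 to 10 suffixes.
The macro names first, stubs and lbl_* are hypothetical.

```stata
* Assumption: only the question variables end in _01
unab first : *_01

* Turn m1_01 m2_01 ... into the stubs m1_ m2_ ...
local stubs : subinstr local first "_01" "_", all

* Save the label of each stub's first variable before reshaping
foreach s of local stubs {
    local lbl_`s' : variable label `s'01
}

* string keeps the suffixes 01, 02, ... as they appear in the names
reshape long `stubs', i(serial) j(id) string

* Re-attach the saved labels to the long-form variables
foreach s of local stubs {
    label variable `s' `"`lbl_`s''"'
}

* Optional: make id numeric (1 to 10) rather than "01" to "10"
destring id, replace
```

Building the stub list from the names avoids typing all the stubs by hand,
and the two loops carry the labels across regardless of what reshape itself
does with them.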

Any help would be much appreciated.

Kind regards

Tim Stapleton
