Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: RE: Encode string variables without following the default alphanumeric ordering


From   Joe Canner <[email protected]>
To   "[email protected]" <[email protected]>
Subject   st: RE: Encode string variables without following the default alphanumeric ordering
Date   Mon, 9 Sep 2013 16:07:21 +0000

Adama,

If your string variables have information that would suggest a numeric code (as in the example you provided), then I would suggest that you not use -encode- but instead something like the following:

. gen numeric_activity=real(word(string_activity,1))

This should work if there is always a space after the number at the beginning of the string.  Other methods will be needed if the format is different or varies.
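
As one possible "other method", here is a minimal sketch that pulls the leading digits out with a regular expression, so it does not depend on a space following the number. It assumes the numeric code always sits at the very start of the string. The toy data are made up for illustration, and "10-Very interesting" is a hypothetical value, not one taken from the actual questionnaire.

```stata
* Toy data for illustration only; the second value is hypothetical
clear
input str40 string_activity
"0 - Activity not interesting at all"
"10-Very interesting"
""
end

* Keep the leading run of digits, whether or not a space follows it.
* Rows that do not start with a digit (including empty strings) stay missing.
gen numeric_activity = real(regexs(1)) if regexm(string_activity, "^([0-9]+)")

list string_activity numeric_activity
```

Strings that do not begin with a digit are left as numeric missing, so it is worth tabulating the result against the original variable before dropping the strings.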

Once you have the numeric variable as desired you can create your own value labels which you can use for all 25 variables.
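
A rough sketch of how the two steps might be combined for all 25 variables, assuming they match the pattern activitym* as in the original post. The toy data, the label name activity_scale, the num_ prefix, and every label text other than the one for 0 are hypothetical and would need to be replaced with the real questionnaire wording.

```stata
* Toy data for illustration only; texts other than the one for 0 are hypothetical
clear
input str40 activitym1 str40 activitym2
"0 - Activity not interesting at all" "10 - Hypothetical top rating"
"10 - Hypothetical top rating" "0 - Activity not interesting at all"
"" "5 - Hypothetical midpoint"
end

* One value label shared by every variable (fill in the remaining levels)
label define activity_scale 0 "0 - Activity not interesting at all" 10 "10 - Hypothetical top rating"

* The num_ prefix keeps the new variables from matching activitym* on a rerun
foreach var of varlist activitym* {
    gen num_`var' = real(word(`var', 1))
    label values num_`var' activity_scale
}

list num_activitym1 num_activitym2
```

Because the number is read from the string itself, a given code means the same thing in every variable, and no per-variable -recode- should be needed.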

Regards,
Joe Canner
Johns Hopkins University School of Medicine

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Adama Konseiga
Sent: Monday, September 09, 2013 11:51 AM
To: [email protected]
Subject: st: Encode string variables without following the default alphanumeric ordering

Dear Statalisters

I am dealing with a questionnaire where one question leads to rating
an activity on a scale from 0 to 10.

There are about 25 variables from that question which are currently in
a string format, including missing and text such as "0 - Activity not
interesting at all".

I have been trying to automate the process of encoding the 25 variables
as numeric, but neither -encode- nor super encode does the job
properly.
They are based on the alphanumeric order of appearance of the string
values, which leads to numeric values having completely different
meanings across the variables.

Does anyone have suggestions on how to improve my code below?

-------

Stata version: Stata 10.1

u mydataset, clear

/*
tab activitym1, miss

br activitym1
egen tag=tag(activitym1)

list activitym1 if tag
**** Confirm that the order of appearance of string values differs
across the 25 variables
*/

foreach var of varlist activitym* {
    sencode `var', replace label(scores, replace)
}
recode activitym1 (1=8) (2=5) (3=0) (4=6) (5=9) (6=2) (7=10) (8=7) ///
    (9=4) (10=3) (11=1)
.....
-----

recode activitym4 (1=0) (2=2) (3=6) (4=7) (5=8) (6=5) (7=9) (8=10) ///
    (9=1) (10=3) (11=4)


-- 

Adama Konseiga
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


