Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: converting multiple choice (string) response options to numeric values

From	Nick Cox <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: converting multiple choice (string) response options to numeric values
Date	Fri, 7 Feb 2014 13:30:27 +0000

See also, on a different variant of the same problem:

SJ-11-2 dm0057  . . . . . . . . .  Stata tip 99: Taking extra care with encode
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C. Schechter
        Q2/11   SJ 11(2):321--322                                (no commands)
        tip on safely using encode across datasets

Nick
[email protected]


On 7 February 2014 11:30, Nick Cox <[email protected]> wrote:
> This is quite a common problem, and it's easy to get bitten.
>
> label def mylabels 1 "A" 2 "B" 3 "C" 4 "D" 5 "E"
>
> foreach v of var <varlist> {
>      encode `v', gen(n_`v') label(mylabels)
> }
>
> is a sketch of how to do it. You must replace <varlist> by an actual varlist.
>
> Alternatively, as said, look at -multencode- (SSC).
>
> Nick
> [email protected]
>
>
> On 7 February 2014 09:22, Nick Cox <[email protected]> wrote:
>> Applying -encode- to several variables is a little dangerous. If the
>> values "A" to "D" occur for every variable and "E" occurs only for
>> those variables for which it is possible, and for all of them, you
>> should be fine. But suppose the only answers that occur for one
>> variable are "A", "C", "D". Then those will be, by default, mapped to
>> 1,2,3. -encode- has by default no intelligence that spots that "B" is
>> missing and decides that the appropriate coding is 1, 3, 4. You would
>> need to define value labels in advance and specify those as the labels
>> to be used.
>>
>> Note also -multencode- (SSC).
>>
>> Nick
>> [email protected]
>>
>>
>> On 7 February 2014 08:04, Ronnie Babigumira <[email protected]> wrote:
>>> encode worked just fine. What you see as the "exact same variable" is
>>> just the label
>>>
>>> *****
>>> clear *
>>> input id str1 qn1 str1 strqn3
>>> 1 A D
>>> 2 A A
>>> 3 E B
>>> 4 B C
>>> end
>>>
>>> encode qn1, g(nqn1)
>>> list
>>> list, nolabel
>>> *****
>>>
>>> PS: note the label() option of encode, which lets you supply your own value label
>>>
>>> On Fri, Feb 7, 2014 at 1:59 AM, Katherine Picho <[email protected]> wrote:
>>>> I have a huge dataset of test data with multiple-choice
>>>> questions. Two questions have choices A-E, and the rest have four
>>>> options, A-D.
>>>>
>>>> I was looking to convert these options to numeric values with A
>>>> corresponding to 1, B=2, etc.
>>>>
>>>> I'm using Stata 12.
>>>>
>>>> I tried using the egen newvar = group(oldvar) command. It seems to
>>>> work for some questions but not others. For instance, the first five
>>>> students' answers for question 18 are AAAAA, which should translate
>>>> to five consecutive 1s, but I get consecutive 2s instead.
>>>>
>>>> For another question, question 10, a value of 6 is reported for one
>>>> observation whose letter value is C, which should correspond to a
>>>> value of 3.
>>>>
>>>> I also tried encode oldvar, gen(newvar)
>>>> but I get the exact same variable data as in the original (i.e.
>>>> letters, not numbers) even though the data storage type now shows
>>>> 'long'.
>>>>
>>>> I've checked to make sure there is consistency in data entry and there
>>>> appears to be; i.e. all responses are entered in capital letters, and
>>>> there is no mix of numeric and letters in the same variable/ column.
>>>>
>>>> What am I doing wrong? Any thoughts on this problem would be highly
>>>> welcome as I dread the idea of having to manually convert these
>>>> letters to numbers!
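
The pitfall Nick Cox describes earlier in the thread (a variable in which only "A", "C", "D" occur being coded 1, 2, 3 by default) can be reproduced with a small toy dataset. The following is only a sketch: the data and the variable names q1, q2, d_q1, n_q1 and n_q2 are invented for illustration, and mylabels is the label name used in his post. The noextend option is an addition not mentioned in the thread; it makes -encode- stop with an error, rather than quietly add a new code, if a variable contains a value that is not in mylabels.

```stata
clear
* Toy data (hypothetical): q1 never takes the value "B"
input str1 q1 str1 q2
A A
C B
D E
A C
end

* Define the full letter-to-number mapping once, before encoding anything
label define mylabels 1 "A" 2 "B" 3 "C" 4 "D" 5 "E"

* Default encode: codes follow the sorted distinct values present in q1,
* so "A", "C", "D" become 1, 2, 3
encode q1, generate(d_q1)

* encode with the predefined label: codes follow mylabels,
* so "A", "C", "D" become 1, 3, 4
foreach v of varlist q1 q2 {
    encode `v', generate(n_`v') label(mylabels) noextend
}

* Show the underlying numeric codes rather than the value labels
list q1 d_q1 n_q1 q2 n_q2, nolabel
```

Listing with nolabel is what reveals the difference: without it, d_q1 and n_q1 both display as letters, which is the behaviour the original poster took for "the exact same variable".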