st: Organizing non-ranked multiple responses by generating new variable

From	JD Wright <[email protected]>
To	[email protected]
Subject	st: Organizing non-ranked multiple responses by generating new variable
Date	Sat, 18 Feb 2012 13:27:29 -0800 (PST)

Hello,

I am currently organizing a data set. One question in particular (see A22b
below) has an "other: specify" option that generated many different
responses. The original data collectors coded these responses by assigning
each one a number from 1 to 98; each time a respondent mentioned some new
“other,” it was assigned the next available number.

The lead-in to the question at hand asks whether the respondent is religious
or not:

Q. A21: Do you identify yourself with any religion? [bq1_a21_rel_ind]
1. Yes
0. No
8. Don't Know
9. Refused 

If the respondent answered "Yes," then a second question is asked:

Q. A22a. What religion? [BQ1_A22_REL_TYPE]
1. Buddhist
2. Christian
3. Hindu
4. Jewish
5. Muslim
6. Other (Specify): [this becomes a new variable in the data set:
bq1_a22_othrel and is followed by a new variable for the coded response
bq1_a22b_rel_cd1] 
[There is no option "7"]
8. Don't Know
9. Refused

If option “6. Other (Specify)” was chosen, then respondents were asked to
fill in the blank. This generated 88 responses. Some repeated “other”
responses were grouped under “Catholic” and “Mormon,” but many responses
were unique.

I went through these responses and further categorized them according to the
original question Q. A22a, since many of the new answers (e.g., Adventist,
Baptist, etc.) could easily be placed under the “Christian” category. This
makes more sense than leaving such responses categorized under “other.”

So my question is “What is the most efficient way to organize these?”

Here is the current approach I am using, but it seems too repetitive and
cumbersome: 

_______________________________________________________________
* I am generating a new variable in order to organize all of these
* multiple responses.
generate bq1_a22b_rel_cd1_type = .

* Buddhist: Buddhism was mentioned in “Other” and coded as “03”,
* so I proceeded to begin defining my new variable.
replace bq1_a22b_rel_cd1_type = 1 if bq1_a22_othrel==1 & bq1_a22b_rel_cd1==3

* Christian: here is where it becomes more difficult, because there are
* 30 different codes that can be categorized as Christian.
replace bq1_a22b_rel_cd1_type = 2 if bq1_a22_othrel==1 & bq1_a22b_rel_cd1!=. ///
    & (bq1_a22b_rel_cd1==67 | bq1_a22b_rel_cd1==43 | bq1_a22b_rel_cd1==13 ///
    /* etc., i.e., using “| bq1_a22b_rel_cd1==” with each of the codes below */ )

____________________________________________

[These are the codes for Christians
67,43,13,04,82,59,05,37,80,06,64,46,41,10,14,12,16,71,76,44,29,09,57,63,34,08,07,31,12,69]
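
A more compact form may be possible. As a sketch only: Stata's inlist()
function is true when its first argument equals any of the later arguments,
so the chain of “|” conditions might collapse into a single call. The toy
data below (including code 55) are invented purely so the example runs. Only
the three Christian codes already spelled out in the command above are
listed, and the sketch assumes, as in that command, that bq1_a22_othrel==1
marks an “other” response.

```stata
* Toy data, invented only to make the sketch runnable
clear
input bq1_a22_othrel bq1_a22b_rel_cd1
1 3
1 67
1 43
1 13
1 55
0 .
end

generate bq1_a22b_rel_cd1_type = .

* Buddhist: "other" code 03
replace bq1_a22b_rel_cd1_type = 1 if bq1_a22_othrel==1 & bq1_a22b_rel_cd1==3

* Christian: one inlist() call instead of a chain of "|" conditions.
* Only three codes are shown; the remaining codes would be appended.
replace bq1_a22b_rel_cd1_type = 2 if bq1_a22_othrel==1 & ///
    inlist(bq1_a22b_rel_cd1, 67, 43, 13)

list
```

Because a missing value never equals any of the listed codes, the separate
“!=.” check is not needed in this form.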

I also planned on continuing this for the other categories of Q. A22a (Hindu,
Jewish, Muslim, Other) using the remaining coded “other” responses.

Once that was done I planned on generating another new variable so that I
could combine answers from Q. A22a and Q. A22b in one variable that
reflected the values of the original question Q. A22a. 
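
The combining step described here could look roughly like the following
sketch. It assumes the A22a answers are stored in a variable named
BQ1_A22_REL_TYPE (written as in the questionnaire; the actual name and case
in the data set may differ) and that the recategorized write-ins are in
bq1_a22b_rel_cd1_type. The name rel_combined is hypothetical, and the toy
data are invented for illustration.

```stata
* Toy data, invented for illustration only
clear
input BQ1_A22_REL_TYPE bq1_a22b_rel_cd1_type
2 .
6 1
6 2
6 .
8 .
end

* rel_combined is a hypothetical name for the merged variable;
* start from the original A22a answer
generate rel_combined = BQ1_A22_REL_TYPE

* Where "6. Other" was chosen and the write-in was recategorized,
* use the recategorized value; otherwise keep the original answer
replace rel_combined = bq1_a22b_rel_cd1_type ///
    if BQ1_A22_REL_TYPE==6 & !missing(bq1_a22b_rel_cd1_type)

list
```

Write-ins that were not recategorized stay at 6 (“Other”) under this rule.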

Note: I have also come across posts about egen, forvalues, etc. for
organizing multiple responses or organizing data, but none have addressed
an example quite like this one, where there is no order or logic to the
numbers assigned and they are not necessarily sequential either.

My knowledge of Stata is obviously limited … so I am not even sure if my
initial inclination to deal with such data by generating a new variable,
then replacing values, is even a typical approach. 

I would appreciate any guidance. Thank you, Jaime 

