st: RE: multiple response question


From   Lee Sieswerda <Lee.Sieswerda@tbdhu.com>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   st: RE: multiple response question
Date   Tue, 22 Oct 2002 10:50:39 -0400

I can answer part 1) of Phung's question. With regard to chi-square testing,
it isn't clear, to me anyway, exactly what you would want to test. But with
regard to part 1), the problem with multiple response data is that the
respondent can choose more than one response :)   To complicate things, some
survey houses follow what they tell me is a marketing practice of entering
the data in the order that the respondent gave his or her responses. One way
to analyze these data is to create dummies - one dummy per possible
response. SPSS analyzes multiple response data, but it seems to store the
dummies as internal variables that you cannot get access to. Thus, one is
restricted to using commands that support "multiple response sets". This
seems typical of SPSS's closed approach. 
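
As a minimal sketch of the dummy approach, here is the idea done by hand on
toy data. This is only an illustration, not the -mrdum- code: the variable
names resp1-resp3, the d_ prefix, and the data values are all made up.

```stata
* Toy data: three "check all that apply" slots, with codes entered in the
* order the respondent gave them. All names and values are hypothetical.
clear
input resp1 resp2 resp3
1 3 .
2 . .
3 1 2
. . .
end

* One dummy per possible response code (1 to 3 here).
forvalues code = 1/3 {
	generate byte d_`code' = 0
	foreach v of varlist resp1-resp3 {
		* Flag the case if this code appears in any slot.
		replace d_`code' = 1 if `v' == `code'
	}
	* Leave the dummy missing when the respondent gave no response at all.
	replace d_`code' = . if missing(resp1) & missing(resp2) & missing(resp3)
}
list
```

Each dummy ignores the order of the responses, so it can be tabulated like
any other 0/1 variable.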

I started a thread on analyzing this sort of data quite a few months ago. As
Nick Cox (who seems to have a memory like a steel trap) reminded me not so
long ago, I had promised to post a command I wrote to generate dummy
variables from multiple response data. I've been negligent in doing that up
to now because I've been having trouble making the summary table look pretty
(and because I haven't run into any more of this kind of data to motivate
further action). Aside from this aesthetic touch, the command seems to work.
I've called it -mrdum- and there is a help file. If you are less interested
in creating dummy variables, Eric Zbinden's -zb_qrm- analyzes aspects of
response order in multiple response data. Eric's command also treats missing
data slightly differently.

If anyone has any suggestions for making the code better, feel free to write
to me privately. I would appreciate it. I'll integrate the code suggestions
and then send the more aesthetically pleasing version of this command to Kit
Baum to place on SSC.

The -mrdum- command and its help file are below (make sure to correct for any
lines wrapped by your email program).

Best regards,

Lee

Lee Sieswerda, Epidemiologist
Thunder Bay District Health Unit
999 Balmoral Street
Thunder Bay, Ontario
Canada  P7B 6E7
Tel: +1 (807) 625-5957
Fax: +1 (807) 623-2369
Lee.Sieswerda@tbdhu.com
www.tbdhu.com

********begin mrdum.ado***************
*! version 1.0 Lee Sieswerda 14 July 2002

prog define mrdum, rclass
	version 7.0
	syntax varlist [if] [in], Stub(string) [RESponses(numlist sort) LABels]
	tempvar touse
	mark `touse' `if' `in'
	quietly count if `touse'
	local total = r(N)

	return local varlist `varlist'

	if "`responses'" ~= "" {

		* Check to see that only integer responses are specified
		numlist "`responses'", integer

		* How many variables and possible responses?
		local nvar : word count `varlist'
		local nresp : word count `responses'
		return local responses `nresp'

		* Get labels
		if "`labels'" ~= "" {
		tokenize `varlist'
		local variable `"`1'"'
			foreach num of numlist `responses' {
				local lab`num': label (`variable') `num' 26
			}
		}

		di " "
		di as text "{hline 30}{c TT}{hline 25}"
		di as text "{col 31}{c |}{col 33}# cases with"
		di as text "{col 31}{c |}{col 33}code present"
		di as text "Response code{col 31}{c |}{col 33}at least once{col 50}Percent"
		di as text "{hline 30}{c +}{hline 25}"

		* Mark cases that are completely missing
		quietly egen `stub'_miss = rmiss(`varlist') if `touse'
		quietly replace `stub'_miss = int(`stub'_miss/`nvar') if `touse'
		* Use a temporary variable so no fixed name can clash with the user's data
		tempvar _tempmis
		quietly gen `_tempmis' = 1
		quietly replace `_tempmis' = . if `stub'_miss==1
		markout `touse' `_tempmis'

		* Generate the new dummy variables, one for each possible response
		foreach num of numlist `responses' {
			quietly egen `stub'_r`num' = eqany(`varlist') if `touse', values(`num')
			quietly replace `stub'_r`num' = . if ~`touse' /* `stub'_miss==1 */

			* Store cell numbers as locals
			quietly count if `stub'_r`num' == 0
			local _r`num'0 = r(N)
			quietly count if `stub'_r`num' == 1
			local _r`num'1 = r(N)
			local _r`num'tot = `_r`num'0' + `_r`num'1'
			global non_____miss `_r`num'tot'   /* <=== This could be better; couldn't get display to work otherwise. Global dropped at end of program. */
			local _r`num'per = `_r`num'1'/`_r`num'tot'*100

			di in text `num' `": `lab`num'' {col 31}{c |}{col 33}"' as result  `"`_r`num'1' {col 50}"'  round(real(`"`_r`num'per'"'),.01)
		}
	}
	else {
		* How many variables and possible responses (in this case, both the same)?
		local nvar: word count `varlist'
		local nresp : word count `varlist'
		return local responses `nresp'

		* Get labels
		if "`labels'" ~= "" {
		tokenize `varlist'
		local variable `"`1'"'
			forval i=1/`nresp' {
				local lab`i': label (`variable') `i' 26
			}
		}


		di " "
		di as text `"Assuming that responses are coded from 1 to `nresp'"'
		di " "
		di as text "{hline 30}{c TT}{hline 25}"
		di as text "{col 31}{c |}{col 33}# cases with"
		di as text "{col 31}{c |}{col 33}code present"
		di as text "Response code{col 31}{c |}{col 33}at least once{col 50}Percent"
		di as text "{hline 30}{c +}{hline 25}"

		* Mark cases that are completely missing
		quietly egen `stub'_miss = rmiss(`varlist') if `touse'
		quietly replace `stub'_miss = int(`stub'_miss/`nvar') if `touse'
		tempvar _tempmis
		quietly gen `_tempmis' = 1
		quietly replace `_tempmis' = . if `stub'_miss==1
		markout `touse' `_tempmis'
			

		* Generate the new dummy variables, one for each possible response
		forval i=1/`nresp' {
			quietly egen `stub'_r`i' = eqany(`varlist') if `touse', values(`i')
			quietly replace `stub'_r`i' = . if ~`touse'
			
			* Store cell numbers as locals
			quietly count if `stub'_r`i' == 0
			local _r`i'0 = r(N)
			quietly count if `stub'_r`i' == 1
			local _r`i'1 = r(N)
			local _r`i'tot = `_r`i'0' + `_r`i'1'
			global non_____miss `_r`i'tot'   /* <=== This could be better; couldn't get display to work otherwise. Global dropped at end of program. */
			local _r`i'per = `_r`i'1'/`_r`i'tot'*100
			
			di in text `i' `": `lab`i'' {col 31}{c |}{col 33}"' as result  `"`_r`i'1' {col 50}"'  round(real(`"`_r`i'per'"'),.01)
		}
	}
	
	global non_____miss_____per = round($non_____miss/`total'*100,.01)
	local miss = `total'-$non_____miss
	di as text "{hline 30}{c BT}{hline 25}"
	di as text `"Cases with at least one response `if': "' as result `"$non_____miss ($non_____miss_____per %)"'
	di as text `"              Completely missing `if': "' as result `miss'
	di as text " "
	di as text `"                     Total cases `if': "' as result `total'
	di " "
	di as text `"Variables created for `nresp' possible responses + 1 for missing"'
	macro drop non_____miss non_____miss_____per
	
end
********end mrdum.ado************

********begin mrdum.hlp**********
{smcl}
{* 14jul2002}{...}
{hline}
help for {hi:mrdum}       {right:version 1.0 14 July 2002}
{hline}

{title:Creation of dummy variables for multiple response data}

{p 8 14}{cmd:mrdum} {it:varlist} [{cmd:if} {it:exp}]
	[{cmd:in} {it:range}] {cmd:,}
	{cmdab:s:tub(}{it:string}{cmd:)}
	[{cmdab:res:ponses(}{it:numlist}{cmd:)} {cmdab:lab:els}]


{title:Description}

For general use, {cmd:mrdum} searches across {it:varlist} for integer codes
and creates a corresponding binary dummy variable. The dummy is equal to one
if the integer code was found anywhere in {it:varlist}, zero if not, and 
missing if all of {it:varlist} is missing. It also displays a table 
summarizing the results.

This program was created specifically to deal with survey questions wherein 
the respondent can give multiple responses to a single question (e.g., "Check 
all that apply"). Sometimes these data are coded as a series of variables 
with the responses entered in the order that the respondent indicated them. 
Often, however, it is useful to instead have a set of binary dummy variables
that indicate whether the respondent indicated a particular response
regardless of the order in which it was indicated.

If you are interested in the order that responses were indicated see Eric
Zbinden's {cmd:zb_qrm}.


{title:Options}

{p 0 4}{cmd:stub(}{it:string}{cmd:)} specifies a stub for the resulting
dummy variables. The stub {break}
should be short enough for the complete names all to be legal. This is
{break}
a non-optional option. {p_end}

{p 0 4}{cmd:responses(}{it:numlist}{cmd:)} allows the user to specify
exactly which responses {break}
(integer codes) he or she is interested in. If this option is not {break}
specified, the command will assume that the responses are coded from {break}
1 to n, where n is the number of variables specified. {p_end}

{p 0 4}{cmd:labels} prints out the value labels beside the response codes.
The {break}
labels are derived from the first variable of {it:varlist}. {p_end}

{title:Examples}

{p 8 12}{inp:. mrdum f4m1-f4m7, stub(q4)}{break}
This produces seven dummy variables for responses coded 1-7 {break}
plus a dummy to indicate which cases are completely missing. {p_end}

{p 8 12}{inp:. mrdum f4m1-f4m7, stub(q4) res(1/4,7)}{break}
This produces four dummy variables for responses coded 1-4, {break}
one dummy for responses coded 7, and one to indicate which {break}
cases are completely missing. {p_end}


{title:Author}
{p 8 8 8}Lee E. Sieswerda {break}
Thunder Bay District Health Unit {break}
Lee.Sieswerda@tbdhu.com

  Manual:  {hi:[R] numlist, [U] 14.1.8 numlist, [R] egen, eqany}
{p 1 19}On-line:  help for {help numlist}; {help egen}{p_end}
See Also: help for {help zb_qrm} if installed
********end mrdum.hlp************
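
To make the arithmetic behind the summary table concrete, here is a rough
standalone sketch for a single response code. It is not part of -mrdum-. It
assumes a Stata release in which the egen functions are named anymatch() and
rowmiss() (the ado-file above uses the older names eqany() and rmiss()), and
the toy data and the names has1 and nmiss are hypothetical.

```stata
* Hypothetical toy data: codes entered in the order the respondent gave them.
clear
input f4m1 f4m2 f4m3
1 3 .
2 . .
3 1 2
. . .
end

* 1 if code 1 appears in any of the three variables, 0 otherwise.
egen byte has1 = anymatch(f4m1 f4m2 f4m3), values(1)

* Count missing values per case; set the dummy to missing if all three are missing.
egen nmiss = rowmiss(f4m1 f4m2 f4m3)
replace has1 = . if nmiss == 3

* Numerator: cases with the code present at least once.
quietly count if has1 == 1
local n1 = r(N)

* Denominator: cases with at least one response (dummy not missing).
quietly count if has1 < .
local ntot = r(N)

display "Code 1 present in `n1' of `ntot' cases: " round(`n1'/`ntot'*100, .01) " percent"
```

The percentage is taken over cases with at least one response, not over all
cases, which matches the 0-plus-1 denominator used in the ado-file.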





> -----Original Message-----
> From:	Phung Lang [SMTP:plang@ifspm.unizh.ch]
> Sent:	Tuesday, October 22, 2002 9:35 AM
> To:	statalist@hsphsun2.harvard.edu
> Subject:	st: multiple response question
> 
> I have a question which consists of two parts (A and B) in which multiple
> responses are allowed for each part. (The 9 possible choices in part A are
> identical to those in B). 1) How can I get STATA to compute the frequencies
> so that the total percentages of the 9 choices (variables) equal 100%, for
> each part.  2) Using the new data generated from question 1, how can I get
> STATA to calculate the chi square to compare the same variable in Parts A
> and B? Because this is a multiple response question, I am assuming that I
> cannot simply state that "1" equals selected and "0", not selected.  Or can
> I?
> 
> Any advice would be greatly appreciated. Thanks!
> 
> Phung
> 
> Institute for Social and Preventive Medicine (ISPM)
> at the University of Zurich
> Sumatrastrasse 30
> CH-8006 Zurich, Switzerland
> 
> Tel: +41-1-634 46 72 /13
> Fax: +41-1-634 49 84
> email: plang@ifspm.unizh.ch
> 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/