
From: "Schaeper, Dr. Hildegard" <Schaeper@his.de>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: Analyzing multiple response variables with multiple categories
Date: Thu, 20 Feb 2003 13:33:07 +0100

Dear all,

I am a Stata beginner (we just banished SPSS). Nonetheless, I have to solve some problems in order to satisfy the needs of my institute.

The problem: We often have to analyze a set of multiple response variables that all have several codes, and exactly the same codes, e.g. the courses students are enrolled in. Let's assume that our interviewees can give up to four answers. The variables, named answer1, answer2, answer3, answer4, are coded in the same way, e.g.:

    1 = mathematics
    2 = philosophy
    3 = chemistry
    etc.

How can I get a frequency distribution and a percentage distribution (based on cases, not on answers) that takes analytical weights into account and tells me how many respondents (absolute numbers and percentages) are enrolled in mathematics, philosophy etc.? In short, I was looking for a Stata command that resembles the SPSS "mult response" command (sorry, but this feature of SPSS really is useful).

Because I didn't succeed in finding a Stata ado that satisfies all my needs, I began to do some programming. My idea was to generate a set of dummy variables, one for each category of the original variables, and then simply to compute the means using the tabstat command (which allows the by prefix, the by option and weights, so that I can even produce multidimensional percentage distributions).

Eureka, the program works, but it has three disadvantages:

First, the program is very slow because, depending on the number of categories, a lot of dummy variables have to be generated (99 in my application).

Second, instead of the value labels of the original variables, only the names of the newly created dummy variables are displayed. I succeeded in assigning the value labels of the original variables to the dummy variables as variable labels, but I don't know how to tell my program that I want the variable labels to be displayed and not the variable names.

Third, only means (i.e. percentages) are displayed, not frequencies.
Here's my program. Can anybody give me some advice? Thanks a lot.

Hilde

    /* beginning of the program */
    program define mrtab, byable(recall)
        version 8
        syntax varlist [if] [fweight aweight iweight] [, by(varname)]
        preserve
        marksample touse
        if "`exp'" ~= "" {
            tempvar wt
            gen `wt' `exp'
            local w "[`weight' = `wt']"
        }

        /* computing the maximum value of the variables */
        tempvar max1
        egen `max1'=rmax(`varlist')
        tempvar max2
        egen `max2'=max(`max1')
        local maxval=`max2'

        /* generating the set of dummy variables */
        forvalues i = 1/`maxval' {
            egen resp`i' = eqany(`varlist'), v(`i')
        }

        /* multiplication by 100 in order to get percentages */
        local i 1
        while `i' <= `maxval' {
            quietly replace resp`i' = resp`i' * 100
            local i = `i' + 1
        }

        /* assigning the value labels of the original variables */
        /* to the dummy variables */
        tokenize `varlist'
        local j = 1
        forvalues i = 1/`maxval' {
            local labval`j' : label `1' `i'
            local j = `j' + 1
        }
        local i 1
        local j 1
        while `i' == `j' & `i' <= `maxval' {
            label variable resp`i' "`labval`j''"
            local i = `i' + 1
            local j = `j' + 1
        }

        /* elimination of dummy variables which only have zeros */
        forvalues i = 1/`maxval' {
            egen m`i'= max(resp`i')
        }
        forvalues i = 1/`maxval' {
            if m`i' == 0 {
                drop resp`i'
            }
        }

        if `"`by'"' == "" {
            tabstat resp1-resp`maxval' if `touse' `w', stat(mean count) format(%3.1f) column(statistics) longstub
        }
        else if `"`by'"' ~= "" {
            tabstat resp1-resp`maxval' if `touse' `w', stat(mean count) format(%3.1f) col(stat) by(`by') long
        }
    end

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
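To make the "based on cases, not on answers" idea concrete, here is a minimal sketch of the dummy-variable approach on a toy data set. The three respondents, their answers and the weight variable `wt` are invented for illustration and are not from the post. The sketch also assumes a more recent Stata release, where the `egen` function `eqany()` used in the program is called `anymatch()`.

```stata
* Hypothetical toy data: three respondents, up to two answers each
clear
input id answer1 answer2 wt
1 1 2 1
2 1 . 2
3 3 1 1
end

* One 0/100 indicator per category: 100 if the respondent gave that
* answer in any of the answer variables, 0 otherwise
forvalues i = 1/3 {
    egen byte resp`i' = anymatch(answer1 answer2), values(`i')
    quietly replace resp`i' = 100 * resp`i'
}

* The weighted mean of each indicator is the weighted percentage of
* respondents (cases) who chose that category
tabstat resp1-resp3 [aweight = wt], statistics(mean) columns(statistics)
```

Each respondent counts once per category no matter which answer slot the code appears in, so the percentages are case-based and can sum to more than 100 across categories.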

**Follow-Ups**:

- **st: RE: Analyzing multiple response variables with multiple categories** *From:* "Nick Cox" <n.j.cox@durham.ac.uk>

