
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: RE: dealing with multiple alphanumeric responses
Date: Tue, 2 Jul 2002 08:51:26 +0100

Pooja Gupta wrote

> > one of my
> > variables has multiple alphanumeric characters that are not
> > seperated by commas.
> > for eg, the first five observations of the variable are
> >
> > 1. ABC
> > 2. ABCEG
> > 3. BDEGHI
> > 4. ACDFGI
> > 5. AHI
> >
> > can a write a code which allows me to do a tabulation of each
> > of these alphabets
> > (i.e., how many As, how many B, how many C and so on) ?

and Tom Steichen suggested

> Something of the form
>
> . for any A B C D E F G H I: gen v_X=index(var, "X") \ replace
> v_X=1 if v_X>1
>
> where A B C D E F G H I is the list of possible alpha characters
> and var is the variable of interest
>
> will generate individual numeric (0,1) variables for each alpha code
> that can then be tabulated with the usual tabulation commands.
>
> Tom

Tom's code works as written: -index()- returns 1 for a match in the
first column, which is already the value wanted, so only positions
beyond the first need recoding. The condition could equally be written

. for any A B C D E F G H I: gen v_X=index(var, "X") \ replace v_X=1 if v_X>0

Either way, his code can be telescoped:

. for any A B C D E F G H I: gen v_X=index(var, "X") > 0

That still leaves several variables, which
as said can be tabulated one by one, but
you might want something more compact.

Here's another way to approach it. I assume
string variable -v-.

1. -save- the data set if not already saved.

2. -trim()- any leading or trailing spaces:

replace v = trim(v)

3. calculate the length of each string:

gen l = length(v)

4. record the observation number:

gen long obs = _n

5. -expand- using -l-, so that each observation
   is repeated once per character:

expand l

6. -sort- and take each character in turn:

bysort obs: gen str1 char = substr(v,_n,1)

7. -tabulate- results:

tab char

8. -save- this data set if needed in future.

9. return to original data set.

Nick
n.j.cox@durham.ac.uk

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
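
For anyone who wants to try both routes from the message above on the five example strings in the question, here is a minimal sketch rather than a definitive implementation. It makes several assumptions that are not in the original posts: a recent Stata release, in which -foreach- does the job of the older -for- and -strpos()- is the newer name for -index()-; the variable name -v- that Nick assumed; and -preserve- / -restore- standing in for the -save- and return steps (1, 8 and 9), which only suits the case where the expanded data set is not needed afterwards.

```stata
* Sketch only: rebuild the five example observations from the question.
clear
input str6 v
"ABC"
"ABCEG"
"BDEGHI"
"ACDFGI"
"AHI"
end

* Route 1: one 0/1 indicator variable per letter.
* -foreach- and -strpos()- are assumed here in place of -for- and -index()-.
foreach c in A B C D E F G H I {
    generate byte v_`c' = strpos(v, "`c'") > 0
}
* The mean of each indicator is the share of observations containing that letter.
summarize v_*

* Route 2: one observation per character, then a single tabulation.
* -preserve- keeps a copy of the data in place of step 1 (-save-).
preserve
replace v = trim(v)
generate l = length(v)
generate long obs = _n
* Each observation is repeated once per character in its string.
expand l
* Within each original observation, copy _n picks out character _n.
bysort obs: generate str1 char = substr(v, _n, 1)
tabulate char
* -restore- returns to the original data set (step 9).
restore
```

Route 1 leaves the data set in its original shape and counts observations that contain each letter; Route 2 counts characters, so the two would differ if a letter could occur more than once within a response.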

Follow-Ups:
    st: RE: RE: RE: dealing with multiple alphanumeric responses [correction]
        From: "Nick Cox" <n.j.cox@durham.ac.uk>
    st: RE: RE: RE: dealing with multiple alphanumeric responses
        From: "Nick Cox" <n.j.cox@durham.ac.uk>

References:
    st: RE: dealing with multiple alphanumeric responses
        From: "Steichen, Thomas" <STEICHT@RJRT.com>

