> Pooja Gupta wrote
> >
> > > one of my
> > > variables has multiple alphanumeric characters that are not
> > > separated by commas.
> > > e.g., the first five observations of the variable are
> > >
> > > 1. ABC
> > > 2. ABCEG
> > > 3. BDEGHI
> > > 4. ACDFGI
> > > 5. AHI
> > >
> > > can I write code which allows me to do a tabulation of each
> > > of these letters
> > > (i.e., how many As, how many Bs, how many Cs and so on)?
>
> and Tom Steichen suggested
> >
> > Something of the form
> >
> > . for any A B C D E F G H I: gen v_X=index(var, "X") \ replace v_X=1 if v_X>1
> >
> > where A B C D E F G H I is the list of possible alpha characters
> > and var is the variable of interest
> >
> > will generate individual numeric (0,1) variables for each alpha code
> > that can then be tabulated with the usual tabulation commands.
> >
> > Tom
> >
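For readers on more recent Stata, Tom's suggestion could be sketched with -foreach- instead of -for-. This is only a sketch: -var- stands for the variable of interest and the v_ prefix is taken from Tom's post, and strpos() is assumed to be available (in older releases index() plays the same role).

```stata
* Sketch only: -var- is the string variable of interest, as in Tom's post
foreach X in A B C D E F G H I {
    * strpos() returns the position of the first match, or 0 if there is none,
    * so the comparison yields a 0/1 indicator directly
    gen byte v_`X' = strpos(var, "`X'") > 0
}
* the sum of each indicator is the number of observations containing that letter
tabstat v_*, statistics(sum)
```

Note that this counts observations containing each letter, which equals the number of occurrences only if no letter is repeated within a value.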
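Another way to do it:

suppose -v- is str6, as the longest values shown have six characters.

preserve
keep v
gen id = _n
* one str1 variable per character position
forval i = 1/6 {
	gen str1 c`i' = substr(v,`i',1)
}
* a stub other than v avoids a name clash with -v- itself
reshape long c, i(id) j(pos)
tab c
restore

-tab- ignores the empty strings produced by the shorter values.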
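If the width of -v- is not known in advance, the same idea might be written without hard-coding the number of positions. This is a sketch under assumptions: the string variable is called v, and the names id, len, c and pos are made up for the illustration.

```stata
* Sketch only: assumes the string variable is called v;
* id, len, c and pos are hypothetical names
preserve
keep v
gen long id = _n
* find the longest value, so that no character position is missed
gen int len = length(v)
summarize len, meanonly
local maxlen = r(max)
* one str1 variable per character position
forvalues i = 1/`maxlen' {
    gen str1 c`i' = substr(v, `i', 1)
}
* stack the pieces: one observation per (id, position)
reshape long c, i(id) j(pos)
* empty strings from shorter values count as missing and are left out
tabulate c
restore
```

For the five values quoted above, this should report A 4, B 3, C 3, D 2, E 2, F 1, G 3, H 2 and I 3.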
Nick
n.j.cox@durham.ac.uk