# Re: st: RE: Use 2 variables to gen 10 new variables

From: daniel klein
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Use 2 variables to gen 10 new variables
Date: Thu, 28 Jul 2011 11:22:26 +0200

```Jonathan,

this still seems to be the same problem as in
http://www.stata.com/statalist/archive/2011-07/msg00868.html and
earlier in http://www.stata.com/statalist/archive/2011-07/msg00718.html.

Nick has already pointed out that this whole thing seems very ad hoc,
and I guess it is also very error-prone, as I mentioned before. I think
you really need to think about (i) the concepts of a dataset, variables,
observations, and frequencies, and, of course, (ii) the underlying
problem. If you tell us exactly _why_ you want to do what you are
asking for, someone might come up with a more convenient way to do it.

I would like to demonstrate what I mean by "think about concepts of
variables and frequencies" and "error-prone". Consider your own example:

. tab  q5a

        q5a |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         12           *           *
          2 |         72           *           *
          3 |         29           *           *
          4 |         22           *           *
          5 |         67           *           *
------------+-----------------------------------
      Total |        202      100.00

. tab  q5b

        q5b |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         22           *           *
          2 |        109           *           *
          4 |         37           *           *
          5 |         18           *           *
------------+-----------------------------------
      Total |        186      100.00

Putting these frequencies into one new variable per category, each
holding only two values, as you want to do, you will get

. tab  new_q5_1

   new_q5_1 |      Freq.     Percent        Cum.
------------+-----------------------------------
         12 |          1           *           *
         22 |          1           *           *
------------+-----------------------------------
      Total |          2      100.00

. tab  new_q5_3

   new_q5_3 |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |          1           *           *
         29 |          1           *           *
------------+-----------------------------------
      Total |          2      100.00

As you see, since Stata sorts the values when tabulating, in new_q5_1
the first row corresponds to the frequency of group a, while in
new_q5_3 the first row (i.e. value 0) is the frequency of group b.

This is highly confusing, and you will probably not be able to tell
which value corresponds to which group.
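
A minimal sketch of one way to keep the group attached to each
frequency, assuming q5a and q5b are the two original variables, that
the name id is not yet in use, and that a cross-tabulation would serve
the underlying purpose (which has not been stated): stack the two
variables and tabulate the values by group.

. preserve
. keep q5a q5b
. * id is a hypothetical helper variable identifying observations
. generate long id = _n
. * q5a and q5b become one variable q5; group holds "a" or "b"
. reshape long q5, i(id) j(group) string
. tabulate q5 group
. restore

Here each frequency sits in a column labelled with its group, so, given
the tables above, the row for value 3 should show 29 under a and 0
under b.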

Best
Daniel
