# Re: st: Proportion tests for non-binary variables

From: jpitblado@stata.com (Jeff Pitblado, StataCorp LP)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Proportion tests for non-binary variables
Date: Tue, 11 Apr 2006 12:00:42 -0500

```
Herve STOLOWY <stolowy@hec.fr> has two categorical variables and wants to
compare the proportions of each category between them:

> I would like to test the equality of proportions of two variables which are
> not binary. Each variable can have four values (0, 1, 2 and 3). (To be more
> precise, the original variable is the same but are applied to two different
> populations, the total sample and a restricted sample. I created two
> different variables. With -tabulate-, I get easily the frequencies of both
> variables).
>
> I can't use -prtest- and -ztest- because these two commands require, to my
> knowledge, binary variables.
>
> My comparison should work on unpaired data.
>
> I searched in Stata on "proportions" but did not find any command for that
> purpose. I missed maybe something. Would you have an idea?

I'll assume you have two variables, say -x1- and -x2-.  You could reshape your
data from wide to long and then use -tabulate- to get an association test
between the categories of your original variables.  Here is a simulated data
example.

First I'll generate some data:

. drop _all
. set seed 1234
. set obs 25
. gen x1 = int(3*uniform()) + 1
. gen x2 = int(3*uniform()) + 1

I'll use the -mean- command to get a quick summary of these variables that I
can check against after I reshape the data; the means and standard errors
should be sufficient to tell me whether I did the -reshape- correctly.

. mean x1 x2

Now I'll reshape the data:

. gen i = _n
. reshape long x, i(i) j(id)

Here the -i- variable identifies the original observations, and the -j()-
option tells -reshape- to store the numeric suffix of each original variable
(1 for -x1-, 2 for -x2-) in the new -id- variable.

I can use -mean- with the -over()- option to verify that the -id- variable
correctly identifies the reshaped categorical variables.  The means and
standard errors should exactly match those in the -mean- results above.

. mean x, over(id)

Now that the data has been reshaped, I can use -tabulate- to get a chi-square
test of association:

. tabulate x id, chi2

See -[R] tabulate twoway- for other measures/tests of association.

Cheers,

--Jeff
```
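
The steps in the message above can be collected into a single do-file. This is only a sketch under a few assumptions: it uses -runiform()-, the name newer Stata releases use for the -uniform()- function in the original post, so the simulated values need not match what the original commands would produce. The final -display- line is an addition that reads the results -tabulate- leaves in -r()-.

```stata
* Sketch of the reshape-then-tabulate approach described above.
* Assumption: runiform() stands in for the older uniform() function.
clear
set seed 1234
set obs 25
generate x1 = int(3*runiform()) + 1
generate x2 = int(3*runiform()) + 1

* Summary before reshaping, kept for comparison afterwards
mean x1 x2

* Wide to long: x1 and x2 are stacked into x, with id = 1 or 2
generate i = _n
reshape long x, i(i) j(id)

* Should reproduce the means and standard errors from the first -mean-
mean x, over(id)

* Pearson chi-square test of association between category and source variable
tabulate x id, chi2

* Addition for illustration: the test statistic and p-value are left in r()
display "chi2 = " r(chi2) ", p = " r(p)
```

When some cells have small expected counts, replacing -chi2- with the -exact- option of -tabulate- requests Fisher's exact test instead.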