# Re: st:compare distribution of a variable across groups

 From: Maarten Buis; To: statalist@hsphsun2.harvard.edu; Subject: Re: st:compare distribution of a variable across groups; Date: Mon, 2 Feb 2009 22:30:13 +0000 (GMT)

```
--- Mandy fu <mandy.fu1@gmail.com> wrote:
> I'm going to compare the effect of education on wage rates by racial
> groups. Here's what I'm thinking:
>
> step 1.
>  I will check the means, standard errors, and ranges of wages by racial
> groups.
> step 2.
>  I will check percentiles of the wage rates, such as deciles, for each
> group.
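
*---------------- begin example ---------------
* A sketch of steps 1 and 2, assuming the nlsw88 example data.
* tabstat offers only a fixed set of percentiles, so this shows a
* selection rather than every decile.
sysuse nlsw88, clear

* step 1: number of observations, mean, standard error of the mean,
* minimum and maximum of wage for each group
tabstat wage, by(race) statistics(n mean semean min max)

* step 2: selected percentiles of wage for each group
tabstat wage, by(race) statistics(p10 p25 p50 p75 p90)
*---------------- end example ------------------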

Aren't you just looking for an interaction effect of education and
race, i.e. the effect of education is different for different races?
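
*---------------- begin example ---------------
* A minimal sketch of the interaction model described above, not a
* definitive specification. Assumptions: the nlsw88 example data, with
* grade (current grade completed) standing in for education, and
* factor-variable notation, which requires Stata 11 or later.
sysuse nlsw88, clear

* main effects of education and race plus their interaction
regress wage c.grade##i.race

* joint test that the effect of grade is the same in all race groups
testparm c.grade#i.race
*---------------- end example ------------------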

Alternatively, this looks suspiciously like a setup for a
Blinder-Oaxaca decomposition, i.e. you are trying to explain away part of
the race differences in income with differences in education, and
if you are an economist then you are also willing to put some sweeping
label on the residual effect (instead of correctly stating that this is
the part of the effect that we do not understand, the economist would
boldly state that you now have an empirical estimate of
discrimination...)
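
*---------------- begin example ---------------
* A rough sketch of such a decomposition, not a recommended analysis.
* Assumptions: the user-written oaxaca command (Ben Jann, available
* from SSC), the nlsw88 example data, grade as the education measure,
* and a comparison restricted to white and black respondents because
* oaxaca compares two groups.
* To install the command first, type: ssc install oaxaca
sysuse nlsw88, clear

* indicator for the two groups being compared (missing for "other")
gen byte black = race == 2 if inlist(race, 1, 2)

* decompose the gap in mean wage into a part attributed to group
* differences in grade and parts that grade does not account for
oaxaca wage grade, by(black)
*---------------- end example ------------------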

> What I'd like to ask is: what is the usual way to compare the
> distribution of a variable across several groups? My concern is
> that maybe I would miss something if only comparing the means or
> standard errors of wages. So, I'm curious how most researchers deal
> with this.

If you want to report numbers in order to summarize the differences
between distributions, then I would stick with means (maybe medians). In
the end we are mostly interested in how the central tendencies of the
groups differ (exceptions obviously exist). If you want to look at more,
and this is a good idea even if you end up not reporting on
it, then graphs are the solution. For example, you can overlay smoothed
distributions of wage for the different races as in the example below,
and I am sure Nick can suggest many more useful graphs.

*---------------- begin example ---------------
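sysuse nlsw88, clear

* look up which values of race correspond to which groups
desc race
label list racelbl

* overlay kernel density estimates of wage, one curve per race group
twoway kdensity wage if race == 1 ||   ///
       kdensity wage if race == 2 ||   ///
       kdensity wage if race == 3,     ///
       legend(order( 1 "white"         ///
                     2 "black"         ///
                     3 "other"))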
*---------------- end example ------------------
(For more on how to use examples I sent to the Statalist, see
http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html )
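
*---------------- begin example ---------------
* Two further graphs as a sketch, assuming the same nlsw88 data;
* these are only suggestions, other displays may suit your data better.
sysuse nlsw88, clear

* box plots of wage side by side, one per race group
graph box wage, over(race)

* one histogram of wage per race group, in separate panels
histogram wage, by(race)
*---------------- end example ------------------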

Hope this helps,
Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

Buitenveldertselaan 3 (Metropolitan), room N515

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

```