# st: RE: RE: RE: RE: Re: Compositional data

From: "Verkuilen, Jay"
To:
Subject: st: RE: RE: RE: RE: Re: Compositional data
Date: Wed, 5 Mar 2008 16:35:42 -0500

```
Nick Cox wrote:

>>If three variables, say x, y, z, add to 1, then x + y + z = 1 defines a
plane in 3-space, and you can lay such a plane flat, i.e. project it onto
2-space without distortion. That, as everyone knows, is the reason you can
draw a triangular plot (or whatever else it's called). What is it about
that which is not Euclidean? I think Euclid would have felt very much at
home with that triangle.<<

The issue is that the Euclidean distance between points in a triangle
plot doesn't mean what most people, reasoning by analogy with an
ordinary scatterplot, think it means. That's all I meant.

>>Anyway, all the alternatives I know to that stretch and shrink
different parts of the space, and none is more intuitive than the
original. But some can be more convenient.<<

The sum constraint makes the components of compositional data dependent,
and that is what makes them strange. Unordered choice data have exactly
the same problem. Aitchison provides some ways of dealing with the issue,
but only at the expense of having to look at nasty things like log-ratios.

As an aside, why did you guys kick out simplex corner observations in
the Dirichlet model? These are perfectly valid observations, indeed
quite possibly very informative ones, since they say "I spent all my
budget on X". A boundary point is a pain because the likelihood is
undefined there, but the procedure described in one of Tim Fry's
articles (Fry et al., Modelling Zeroes in Microdata, Applied Economics,
2000, 33, 383-392) avoids the problem and preserves subcompositional
invariance.

Jay

```
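To make the point about distances concrete, here is a minimal Python sketch (not from the original post) that compares the ordinary Euclidean distance with a log-ratio distance of the kind Aitchison's approach leads to. The function names `clr`, `euclidean` and `aitchison` and the example compositions are illustrative assumptions. The two pairs below are equally far apart on a triangle plot, but the pair near a corner involves a doubling of one share, which the log-ratio distance treats as a much larger change. The sketch also shows why boundary points are awkward: the log-ratio is undefined as soon as a component is zero. It does not attempt to implement the zero-handling procedure of Fry et al.

```python
import math


def clr(x):
    """Centred log-ratio transform of a strictly positive composition."""
    if any(v <= 0 for v in x):
        # Boundary (corner or edge) points have no finite log-ratio.
        raise ValueError("log-ratios are undefined when a component is zero")
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def euclidean(a, b):
    """Ordinary Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def aitchison(a, b):
    """Euclidean distance between the clr-transformed compositions."""
    return euclidean(clr(a), clr(b))


# Hypothetical compositions: each pair differs by 0.01 in two components.
near_corner = ((0.01, 0.49, 0.50), (0.02, 0.49, 0.49))
centre = ((0.33, 0.33, 0.34), (0.34, 0.33, 0.33))

for label, (a, b) in (("near corner", near_corner), ("centre", centre)):
    print(label, "Euclidean:", euclidean(a, b), "log-ratio:", aitchison(a, b))
```

Calling `clr((1.0, 0.0, 0.0))` raises a `ValueError`, which is the "I spent all my budget on X" case: the observation is valid, but it needs some special treatment before any log-ratio method can use it.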