Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <n.j.cox@durham.ac.uk>
To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject: RE: st: RE: analysis of mixture experiments
Date: Thu, 23 Sep 2010 12:27:15 +0100

You are correct. I am so used to seeing similar questions about response variables that I missed your very clear statement that the problem is on the other side.

There is a literature on _compositional data analysis_ that may help. Google that term for some references, including much material on the internet.

John Aitchison suggested various transformations for bundles of compositional variables. A while back I wrote Mata code for some, which I don't seem to have made public. Examples follow my signature and may serve at a minimum to show that they are straightforward to compute.

John A. Cornell has books on mixtures. Go to the Wiley website and search for "Cornell mixtures".

The main problem with most of the multivariate transformation methods I have seen is what to do with observed zeros for any of the components. Much of the compositional data analysis literature deals with geological examples in which it is plausible that an observed zero falls just below some detection limit and that it should be fudged upwards. Most of the examples I have looked at in my own fields of interest are not quite so simple and zeros often appear to be real (exact, essential, structural, fixed).

Nick
n.j.cox@durham.ac.uk

// compositional data analysis
mata :

mata drop cda_*()

// NJC 1 Sept 2008
// rows scaled to sum to 1
real matrix function cda_closure(real matrix X) {
	return(X :/ rowsum(X))
}

// NJC 1 Sept 2008
// ln(all but last column / last column)
real matrix function cda_alr(real matrix X) {
	real scalar c, cm1
	c = cols(X); cm1 = c - 1
	return(ln(X[, (1 .. cm1)]) :- ln(X[, c]))
}

// NJC 1 Sept 2008
// ln(all / row geometric means)
real matrix function cda_clr(real matrix X) {
	return(ln(X) :- mean(ln(X'))')
}

// NJC 1 Sept 2008
// centring
real matrix cda_centre(real matrix X) {
	real rowvector centre, invcentre
	centre = cda_closure(exp(mean(ln(X))))
	invcentre = cda_closure((1 :/ centre))
	return(cda_closure(X :* invcentre))
}

// NJC 3 Sept 2008
// column geometric means
real matrix cda_colgmean(real matrix X) {
	return(exp(mean(ln(X))))
}

// NJC 3 Sept 2008
// row geometric means
real matrix cda_rowgmean(real matrix X) {
	return(exp(mean(ln(X'))'))
}

// NJC 2 Sept 2008
// multiplicative replacement for rounded zeros
real matrix cda_mrzero(real matrix X, real rowvector delta, | real scalar total) {
	real matrix iszero
	if (total == .) total = 1
	iszero = X :== 0
	return((iszero :* delta) + ((!iszero) :* X :* (1 :- rowsum(iszero :* delta) :/ total)))
}

// NJC 10 Oct 2008
// isometric log-ratio transformation
real matrix function cda_ilr(real matrix X) {
	real scalar c, j
	real matrix Y, lnX
	c = cols(X)
	Y = X[, (1 .. c - 1)]; lnX = ln(X)
	for (j = 1; j < c; j++) {
		Y[, j] = rowsum(lnX[, (1 .. j)]) - j * lnX[, j + 1]
		Y[, j] = (1 / sqrt(j * (j + 1))) * Y[, j]
	}
	return(Y)
}

end

Dan Kahan

thanks. I know dirifit; I am very fond of it. But here the proportions are my IVs, not the DV, which is a continuous variable (one to which I would ordinarily fit an OLS linear regression, except that that seems intuitively wrong to me where my IVs are proportions).

On Wed, Sep 22, 2010 at 3:20 PM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
>
> Look at -dirifit- from SSC.
>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
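As a rough illustration of how the cda_*() functions in the message above fit together, here is a minimal sketch. It assumes those functions have already been defined in the current Stata session. The matrix X, the replacement value 0.005 in delta, and the names Z, Yalr, Yclr and Yilr are all made up for this example and are not from the original post. Whether a zero should be replaced at all depends on whether it is a rounded zero or a real one, as discussed in the message.

```mata
mata:
// Hypothetical data: 3 observations on 3 components (raw amounts)
X = (10, 30, 60 \ 0, 25, 75 \ 5, 5, 90)

// Scale each row to sum to 1
P = cda_closure(X)

// Row 2 has a zero in component 1. Replace it by an assumed small value
// (0.005 here, purely illustrative); the nonzero entries in that row are
// shrunk so that the row still sums to total = 1
delta = (0.005, 0.005, 0.005)
Z = cda_mrzero(P, delta, 1)
rowsum(Z)

// Log-ratio transformations need strictly positive input, hence Z, not P
Yalr = cda_alr(Z)     // 3 x 2: ln(component / last component)
Yclr = cda_clr(Z)     // 3 x 3: rows sum to 0 up to rounding error
Yilr = cda_ilr(Z)     // 3 x 2: isometric log-ratio coordinates
rowsum(Yclr)
end
```

One would then typically copy the transformed columns back to Stata variables (for example with st_store()) and use them as predictors in place of the raw proportions.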

**Follow-Ups**:
- **Re: st: RE: analysis of mixture experiments** *From:* Austin Nichols <austinnichols@gmail.com>

**References**:
- **st: analysis of mixture experiments** *From:* Dan Kahan <dmkahan@gmail.com>
- **st: RE: analysis of mixture experiments** *From:* Nick Cox <n.j.cox@durham.ac.uk>
- **Re: st: RE: analysis of mixture experiments** *From:* Dan Kahan <dmkahan@gmail.com>
