

From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Proportional Independent Variables
Date: Thu, 28 Feb 2013 09:19:07 +0000

Joerg's suggestions are naturally all good. I might add some general points.

1. With such predictors I'd see no obligation to keep them on the original measured scale. In many problems components with small mean proportions can be diagnostic of something important.

2. For different reasons log and logit transformations might be considered. There is a very inward-looking literature on compositional data analysis centred on more exotic transformations tailored to the problem. The reference I gave earlier is one entry into that.

3. The two previous points are often complicated by measured zeros. There is then a long slow agony about whether they are structural or sampling zeros and what to do about them. The more components are measured, the worse this usually gets, whether it is fractions of a budget spent on different things, or proportions of a material by elements or compounds or particle size classes, or whatever.

Nick

On Thu, Feb 28, 2013 at 8:22 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> I should have said use 19 at most.
>
> Nick
>
> On Wed, Feb 27, 2013 at 10:12 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>> In principle, yes. In practice, the effect might be slight. You could
>> look at e.g.
>>
>> http://www.amazon.co.uk/Compositional-Data-Analysis-Theory-Applications/dp/0470711353/
>>
>> for ideas on transformations that tackle this issue. My guess is that
>> you will lose more on interpretability than you will gain. But use 19
>> not 20.
>>
>> Nick
>>
>> On Wed, Feb 27, 2013 at 8:40 PM, nick bungy
>> <nickbungystata@hotmail.co.uk> wrote:
>>> Dear Statalist,
>>>
>>> I have a dependent variable that is continuous
>>> and a set of 20 independent variables that are percentage based, with
>>> the condition that the sum of these variables must be 100% across each
>>> observation. The data are cross-sectional only.
>>>
>>> I am aware that
>>> interpreting the coefficients from a general OLS fit will be
>>> inaccurate. The increase of one of the 20 variables will have to be
>>> facilitated by a decrease in one or more of the other 19 variables.
>>>
>>> Is
>>> there an approach to get consistent coefficient estimates of these
>>> parameters that consider the influence of a proportionate decrease in
>>> one or more of the other 20 variables?
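
As a rough illustration of points 2 and 3 and of the "use 19 at most" advice, here is a minimal Python sketch. It is not code from the thread. The function names `logit` and `alr` are made up for this example, and the additive log-ratio is only one transformation from the compositional data literature, chosen here as an assumption since the thread does not name a specific one. It turns K shares into K - 1 predictors and fails loudly on measured zeros, which is exactly where the difficulties in point 3 begin.

```python
import math


def logit(p):
    """Logit of a single proportion; undefined at exactly 0 or 1."""
    if not 0 < p < 1:
        raise ValueError("logit needs a proportion strictly between 0 and 1")
    return math.log(p / (1 - p))


def alr(shares, ref=-1):
    """Additive log-ratio (hypothetical helper for this sketch).

    Takes K shares that sum to 1 and returns K - 1 values, each the log
    of a share divided by the reference share. Dropping the reference
    removes the exact linear dependence among the original shares.
    """
    if any(s <= 0 for s in shares):
        # Measured zeros have no log-ratio; how to treat them is a
        # substantive decision this sketch deliberately does not make.
        raise ValueError("alr is undefined when any share is zero")
    ref_index = ref % len(shares)
    base = shares[ref_index]
    return [math.log(s / base) for i, s in enumerate(shares) if i != ref_index]


# Illustrative data only: one observation with three shares summing to 1.
row = [0.5, 0.3, 0.2]
z = alr(row)          # two predictors, each relative to the last share
l = logit(row[0])     # logit of the first share, which is 0 at p = 0.5
```

With 20 shares the same call would return 19 values. Whether the gain over simply omitting one share is worth the loss in interpretability is the judgment call raised earlier in the thread.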

