Re: st: Proportional Independent Variables

From	Nick Cox <[email protected]>
To	[email protected]
Subject	Re: st: Proportional Independent Variables
Date	Thu, 28 Feb 2013 09:19:07 +0000

Joerg's suggestions are naturally all good.

I might add some general points.

1. With such predictors I'd see no obligation to keep them on the
original measured scale. In many problems components with small mean
proportions can be diagnostic of something important.

2. For different reasons log and logit transformations might be
considered. There is a very inward-looking literature on compositional
data analysis centred on more exotic transformations tailored to the
problem. The reference I gave earlier is one entry into that.

3. The two previous points are often complicated by measured zeros.
There is then a long slow agony about whether they are structural or
sampling zeros and what to do about them. The more components are
measured, the worse this usually gets, whether it is fractions of a
budget spent on different things, or proportions of a material by
elements or compounds or particle size classes, or whatever.
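
As a rough illustration of points 2 and 3, here is a minimal Python sketch of a logit transformation and of the additive log-ratio transformation, one of the transformations discussed in the compositional data literature. It is not necessarily the one the reference above would recommend for this problem. The function names and the example shares are made up for illustration, and the sketch assumes shares are expressed as fractions summing to 1.

```python
import math


def logit(p):
    """Logit of a single proportion strictly between 0 and 1."""
    if not 0 < p < 1:
        raise ValueError("logit is undefined for proportions of 0 or 1")
    return math.log(p / (1 - p))


def additive_log_ratio(shares, ref_index=-1):
    """Map k positive shares to k - 1 log-ratios against a reference share.

    The reference share (by default the last one) is dropped from the
    output, so the result no longer carries the sum-to-one constraint.
    """
    if any(s <= 0 for s in shares):
        raise ValueError("log-ratios are undefined for zero or negative shares")
    ref_pos = ref_index % len(shares)
    ref = shares[ref_pos]
    return [math.log(s / ref) for i, s in enumerate(shares) if i != ref_pos]


# Hypothetical composition with three components.
shares = [0.5, 0.3, 0.2]
alr = additive_log_ratio(shares)  # two log-ratios, each relative to the 0.2 share

# A measured zero breaks the transformation, which is the difficulty in point 3.
try:
    additive_log_ratio([0.7, 0.3, 0.0])
except ValueError as err:
    zero_message = str(err)
```

The sketch deliberately raises an error on zeros rather than replacing them with a small constant, because whether and how to replace them depends on whether the zeros are structural or sampling zeros.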

Nick

On Thu, Feb 28, 2013 at 8:22 AM, Nick Cox <[email protected]> wrote:
> I should have said use 19 at most.
>
> Nick
>
> On Wed, Feb 27, 2013 at 10:12 PM, Nick Cox <[email protected]> wrote:
>> In principle, yes. In practice, the effect might be slight. You could
>> look at e.g.
>>
>> http://www.amazon.co.uk/Compositional-Data-Analysis-Theory-Applications/dp/0470711353/
>>
>> for ideas on transformations that tackle this issue. My guess is that
>> you will lose more on interpretability than you will gain. But use 19
>> not 20.
>>
>> Nick
>>
>> On Wed, Feb 27, 2013 at 8:40 PM, nick bungy
>> <[email protected]> wrote:
>>> Dear Statalist,
>>>
>>> I have a continuous dependent variable and a set of 20 independent
>>> variables that are percentages, with the condition that these
>>> variables must sum to 100% for each observation. The data are
>>> cross-sectional only.
>>>
>>> I am aware that interpreting the coefficients from a general OLS fit
>>> will be inaccurate. An increase in one of the 20 variables will have to
>>> be accompanied by a decrease in one or more of the other 19 variables.
>>>
>>> Is there an approach to get consistent coefficient estimates of these
>>> parameters that takes into account the proportionate decrease in
>>> one or more of the other 19 variables?
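
The advice earlier in the thread to use at most 19 of the 20 percentages follows from the sum constraint: if the percentages add to 100 in every observation, the constant term is an exact linear combination of them, so a model with a constant and all 20 is perfectly collinear. The Python sketch below checks this on a small made-up example with 3 components instead of 20. The data and the helper `matrix_rank` are hypothetical and written only for this illustration.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix by Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Made-up percentages for 5 observations; each row sums to 100.
shares = [
    [50, 30, 20],
    [10, 60, 30],
    [25, 25, 50],
    [70, 20, 10],
    [40, 45, 15],
]

# Design matrix with a constant and all 3 shares: 4 columns.
full = [[1] + row for row in shares]
# Design matrix with a constant and the last share dropped: 3 columns.
reduced = [[1] + row[:-1] for row in shares]

rank_full = matrix_rank(full)        # less than the number of columns
rank_reduced = matrix_rank(reduced)  # equal to the number of columns
```

With one share omitted, the coefficient on each remaining share describes a one-point increase in that share offset by a one-point decrease in the omitted share, which is one way to read the trade-off raised in the original question.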