Statalist



RE: Re: st: Dependent continuous variable with bounded range


From   "Verkuilen, Jay" <[email protected]>
To   <[email protected]>
Subject   RE: Re: st: Dependent continuous variable with bounded range
Date   Thu, 17 Apr 2008 17:09:51 -0400

Nick Cox wrote:

>>
>>I wrote: In point of fact, the variance function of the beta
distribution is the same as the binomial, up to an additional free scale
constant. Both are proportional to E(X)(1-E(X)). You would definitely
want to free up the scale parameter for continuous data, though. <<

Good point. In fact a little thought shows that if a variable is bounded
on [0,1] then as the mean goes to either 0 or 1 the variance must go to
0, because the mean can only approach 0 or 1 if all values approach 0 or
1. That is true regardless of whether the variable is discrete or
continuous. (Same is true for any finite bounds.) <<
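
Both quoted points can be checked numerically. For any X on [0,1] we
have X^2 <= X, so Var(X) <= E(X)(1-E(X)); for the beta in particular,
Var(X) = E(X)(1-E(X))/(a+b+1). What follows is only a minimal Python
sketch, assuming the usual shape parametrization Beta(a, b); the
function names and the fixed precision a+b = 5 are illustrative choices,
not anything from the thread.

```python
def beta_mean_var(a, b):
    """Exact mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


def variance_function(mu):
    """Binomial-type variance function mu * (1 - mu)."""
    return mu * (1 - mu)


# Hold the precision a + b fixed (an assumed value) and push the mean
# toward 0. The variance shrinks to 0, while its ratio to mu * (1 - mu)
# stays at the free scale constant 1 / (a + b + 1).
precision = 5.0
results = []
for mu in (0.5, 0.1, 0.01, 0.001):
    a, b = mu * precision, (1 - mu) * precision
    mean, var = beta_mean_var(a, b)
    ratio = var / variance_function(mean)
    results.append((mean, var, ratio))
    print(f"mean={mean:.3f}  var={var:.6f}  ratio={ratio:.4f}")
```

The ratio column is constant across rows, which is the "additional free
scale constant"; the variance column falls toward 0 with the mean.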

The classic leaf blotch data analyzed by Wedderburn used a variance
function E(X)^2 (1 - E(X))^2. There are other distributions in the unit
interval, e.g., Barndorff-Nielsen and Jørgensen's simplex distribution
or the Johnson SB distribution, which have different variance functions.
(The simplex distribution is generated from the inverse Gaussian in the
same basic way the beta is from the gamma; the Johnson SB is similarly
generated from the lognormal.) I am not aware of a distribution with
Wedderburn's variance function, but I suppose one could always
manufacture it.

As for finitely bounded data, the two distinguishable cases are:

(1) Known bounds. In this case, we are free to use these bounds to
rescale to any convenient interval, provided our statistical model does
not depend on the particular values of the boundaries (and nearly all
models do not).

(2) Unknown bounds. In this case, we have a markedly more complex
estimation problem. Nguyen's chapter in Handbook of Beta Distribution
and Its Applications, edited by A.K. Gupta and S. Nadarajah (2004, CRC
Press), discusses this situation for the univariate case. It is not
pretty. I'm betting it would get markedly worse as a regression model.

Because everything I deal with is in category (1), I've not thought much
about category (2). 
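
For category (1), the rescaling is just a linear map from the known
interval [lower, upper] onto [0,1]. A minimal Python sketch follows,
assuming the bounds really are known in advance; the function names and
the 1-7 rating scale are hypothetical, chosen only for illustration.

```python
def rescale_to_unit(y, lower, upper):
    """Map values on the known interval [lower, upper] onto [0, 1]."""
    if not upper > lower:
        raise ValueError("upper must exceed lower")
    width = upper - lower
    scaled = [(v - lower) / width for v in y]
    if any(s < 0 or s > 1 for s in scaled):
        raise ValueError("data fall outside the stated bounds")
    return scaled


def rescale_back(p, lower, upper):
    """Invert rescale_to_unit, returning values on [lower, upper]."""
    return [lower + s * (upper - lower) for s in p]


scores = [1, 2.5, 4, 7]  # hypothetical ratings on a 1-7 scale
unit = rescale_to_unit(scores, 1, 7)
print(unit)                      # [0.0, 0.25, 0.5, 1.0]
print(rescale_back(unit, 1, 7))  # [1.0, 2.5, 4.0, 7.0]
```

Because the map is linear and invertible, fitted means can be carried
back to the original scale with rescale_back after modelling on [0,1].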

Jay
