Statalist



RE: st: Likelihood function of uniform distribution


From   "Verkuilen, Jay" <JVerkuilen@gc.cuny.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Likelihood function of uniform distribution
Date   Thu, 3 Apr 2008 14:03:25 -0400

>>Thank you for the citations. As Jay mentioned, my problem is an ordinary
binary regression model. In case of probit, for example, you need to use the
Normal distribution function to define your likelihood evaluator. Since the
normal distribution function is already defined in Stata you can simply use
it in your likelihood evaluator.<<

No, the Normal CDF is not the likelihood in probit regression. It's the (inverse) link function: it maps the linear predictor to the probabilities (expected values) that enter the Bernoulli likelihood.
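To make the distinction concrete, here is a minimal sketch of a method-lf evaluator for probit. The variable names y, x1, and x2 and the program name myprobit_lf are placeholders, not anything from your data. Note that the log likelihood is Bernoulli; normal() only converts the linear predictor into a probability.

```stata
* Sketch only: y, x1, x2 and myprobit_lf are hypothetical names.
program define myprobit_lf
    version 10
    args lnf xb
    // Bernoulli log likelihood: ln(p) when y == 1, ln(1 - p) when y == 0,
    // with p = normal(xb); note that 1 - normal(xb) = normal(-xb)
    quietly replace `lnf' = ln(normal(`xb'))  if $ML_y1 == 1
    quietly replace `lnf' = ln(normal(-`xb')) if $ML_y1 == 0
end

ml model lf myprobit_lf (y = x1 x2)
ml maximize
```

Swapping normal() for another CDF changes the link and leaves the Bernoulli structure untouched.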

I highly recommend looking up the development of binary regression in a reference such as J. S. Long (1997), Regression Models for Categorical and Limited Dependent Variables, or a standard econometrics textbook such as Cameron and Trivedi (2005), Microeconometrics.

http://en.wikipedia.org/wiki/Generalized_linear_model
http://en.wikipedia.org/wiki/Logistic_regression
http://en.wikipedia.org/wiki/Probit_model


>>In my case, however, the difficulty is that I don't know how to define the
necessary distribution function, i.e., the uniform distribution function.
More specifically, I need to first define the following function:

f(p)=1 if 0<p<1
    =0 otherwise.

Simply, my question is how one can define the above function (or other
functions such as a triangular pdf) in Stata. <<

You just wrote the PDF of the uniform. What you most likely want for a link function is the CDF. For the uniform on (0,1), the CDF is F(p) = p for 0 < p < 1, which means that you have an identity link. There's nothing for you to do at all: just use GLM with the identity link and the binomial distribution, or BINREG with risk differences. You could also simply subject your binary responses to ordinary regression using REGRESS.
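As a rough sketch of those three options, assuming a 0/1 response y and covariates x1 and x2 (placeholder names), plus the literal indicator function from your question applied to a hypothetical variable p:

```stata
* Sketch only: y, x1, x2, p, and f are hypothetical variable names.

* Binomial family with identity link (may fail to converge if fitted
* probabilities wander outside the unit interval)
glm y x1 x2, family(binomial) link(identity)

* Same model via binreg, reporting risk differences
binreg y x1 x2, rd

* Linear probability model by ordinary regression; robust standard
* errors because a binary response is heteroskedastic by construction
regress y x1 x2, vce(robust)

* The uniform(0,1) density itself, should you want it as a variable:
* 1 if 0 < p < 1, and 0 otherwise
generate byte f = (p > 0 & p < 1) if !missing(p)
```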

This is probably NOT what you really want, and I'm betting you'll come back to logit or probit eventually. The problem with the identity link is that it does not respect the boundary of the sample space: the linear function inside the link is unbounded, but the expected value of a binary random variable must lie within the unit interval. See the previously mentioned references.
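One quick way to see the boundary problem in your own data is to count fitted values that fall outside the unit interval. This is only a sketch, again assuming placeholder names y, x1, and x2:

```stata
* Sketch only: y, x1, x2, and phat are hypothetical variable names.
regress y x1 x2
predict double phat, xb

* Fitted "probabilities" below 0 or above 1; the !missing() guard is
* needed because Stata treats missing values as larger than any number
count if (phat < 0 | phat > 1) & !missing(phat)
```

A nonzero count means the linear model is predicting impossible probabilities for some observations.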

I can't imagine why one would want to use a triangular distribution as the basis for a link function. It has all the problems of the identity link and none of the theoretical rationale (interpretation in terms of risk differences) that makes people hold their noses and use the identity link. In that case, though, the CDF is obtained by integrating the PDF of the triangular distribution. See: http://en.wikipedia.org/wiki/Triangular_distribution
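For reference, here is that integration worked out for a triangular distribution with lower limit a, upper limit b, and mode c, where a < c < b. This parameterization is my assumption; adapt it to whatever you have in mind.

```latex
f(x) =
\begin{cases}
\dfrac{2(x-a)}{(b-a)(c-a)} & a \le x \le c \\[2ex]
\dfrac{2(b-x)}{(b-a)(b-c)} & c < x \le b \\[2ex]
0 & \text{otherwise}
\end{cases}
\qquad
F(x) = \int_{a}^{x} f(t)\,dt =
\begin{cases}
0 & x \le a \\[1ex]
\dfrac{(x-a)^2}{(b-a)(c-a)} & a < x \le c \\[2ex]
1 - \dfrac{(b-x)^2}{(b-a)(b-c)} & c < x < b \\[2ex]
1 & x \ge b
\end{cases}
```

Both branches of F give (c-a)/(b-a) at x = c, so the CDF is continuous there.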

Jay
