
# st: RE: How to test whether data follows Exp distribution?

From: "Nick Cox" | Subject: st: RE: How to test whether data follows Exp distribution? | Date: Wed, 7 Jul 2010 11:33:18 +0100

```
You're meant to knit your own alternative cumulative for the right-hand
side. The equivalent would be

ksmirnov x = 1 - exp(-x/r(mean))

These tests loom large in mathematical statistics texts. (The prestige
of Kolmogorov as one of the giants of probability theory and the
generality and elegance of the underlying idea have, I guess, not
hindered their survival from text to text.) But in my view they are not
much use in practical data analysis:

1. Using parameters estimated from the data, as is typical, has worried
some statisticians in the past. The orthodox calculation presumes that
parameter values are somehow known. The manual entry makes light of
this, but it should be mentioned.

2. More importantly, and as the manual entry does make clear, these
tests are not much use for picking up deviations in the tails. (Observed
and expected cumulatives necessarily both converge to 0 and 1 in the two
tails.) For work with distributions like the exponential, what is going
on in the far tail is very likely to be of great concern both
scientifically and statistically.

3. A test result does not indicate exactly what is going on. Knowing the
reason for rejection -- or of failure to reject -- will be of more
guidance to your data analysis than getting a P-value. Graphs are
critical here, as Maarten flags.

There are plenty of alternatives, however. In addition to Maarten's
-hangroot-,

1. -qexp- and -pexp- from SSC offer canned Q-Q and P-P plots for the
exponential. Note that

SJ-7-2  gr0027  . .  Stata tip 47: Quantile-quantile plots without programming
        . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q2/07   SJ 7(2):275--279                           (no commands)
        tip on producing various quantile-quantile (Q-Q) plots

is now available to all, regardless of whether you subscribe to the SJ,
given the SJ's 3-year moving window. This short paper explains the logic
of Q-Q plots, gives references and includes the exponential as one of
its examples.

To help assess the lack of fit, you can easily produce a portfolio of
plots for random samples of the same size from an exponential:

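sysuse auto, clear

* Q-Q plot for the observed data
qexp price, saving(price)

* Q-Q plots for 24 exponential samples of the same size
forval i = 1/24 {
    gen exp`i' = -ln(runiform())
    qexp exp`i', saving(g`i')
    local names `names' "g`i'"
}

* observed plot plus the 24 reference plots in one display
graph combine "price" `names'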

2. -dpplot- (SJ) is another graphical approach.

3. You can -stset- your variable as if it were a survival time and
follow with -streg, d(e)- specifying just the response, and no
predictors. The information given bears indirectly on the question, but
this is not a formal test of exponentiality, as I understand it. Survival
experts will be able to expand (or to rebut).

Nick
n.j.cox@durham.ac.uk

Maarten L. Buis

You can use -hangroot- to check an empirical distribution against, among
others, an exponential distribution. To install it type in Stata -ssc
install hangroot-.

Jabr, Wael M

I am trying to find out whether the variable I have follows an
exponential distribution. I tried to locate some goodness-of-fit tests
but wasn't successful.
After a long search I found the command ksmirnov. However, the help
doesn't offer much on how to use it. It has an illustration for
testing whether a variable x follows a normal distribution:

ksmirnov x = normal((x-r(mean))/r(sd))

```
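The -ksmirnov- call in the reply relies on `r(mean)` being in memory, so it needs a -summarize- immediately before it, as in the manual's normal example quoted by the original poster. A minimal sketch of that sequence, followed by the -stset- and -streg- route from point 3 of the reply, might look like the following. It assumes the auto dataset shipped with Stata and its variable `price` as stand-ins for the poster's data and variable `x`; neither choice comes from the thread beyond Nick's own use of `price` in the Q-Q example.

```stata
* Sketch only: auto.dta and price stand in for the poster's data and variable x.
sysuse auto, clear

* ksmirnov picks up r(mean) left behind by summarize.
summarize price, meanonly
ksmirnov price = 1 - exp(-price / r(mean))

* Keep the combined statistic and P-value before r() is overwritten.
local D = r(D)
local p = r(p)
display "D = " %6.4f `D' "   P = " %6.4f `p'

* Point 3 of the reply: treat the variable as a survival time and fit an
* exponential model with the response only and no predictors.
stset price
streg, distribution(exponential)
```

The P-value shown by -ksmirnov- here is computed as if the mean were known in advance, which is the caveat raised in point 1 of the reply, so it is best read alongside the Q-Q plots rather than on its own.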