[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]
RE: st: computation of normal probabilities at extreme values

From	"David M. Drukker, StataCorp" <[email protected]>
To	[email protected]
Subject	RE: st: computation of normal probabilities at extreme values
Date	Mon, 26 Sep 2005 16:06:34 -0500
Stephen P. Jenkins <[email protected]> asked a pair of questions about
computing normal probabilities at extreme values.  The questions were
inspired by Zeileis and Kleiber (2005).  (A working paper version is
available at
http://epub.wu-wien.ac.at/dyn/virlib/wp/mediate/epub-wu-01_7c5.pdf?ID=epub-wu-01_7c5 )

Zeileis and Kleiber encountered some problems with how Gauss computed the
normal probability for very small numbers.  (Zeileis and Kleiber report that
the problem was subsequently fixed.)  Comparing the results from the do-file
that appears below my signature with those in table II on page 6 of the
working paper demonstrates that Stata's normal() function has no such
problem.

Stata's normal(y) function computes accurate values for very small y and
returns zero when y is so small that normal(y) will not fit into a double
precision number.

So the answer to Stephen's first question,

>         are these issues concerning computation at extreme values of
>         the standard normal CDF handled gracefully in Stata (cf.
>         function -normal(z)- in Stata 9)?

is yes, Stata handles extreme values gracefully.

One can compute a function that directly approximates the ln(normal(y)) for
much larger absolute values of y than for which one can compute the natural
log of function that approximates normal(y).  Zeileis and Kleiber argue that
they have an application in which the larger domain is of practical
importance.

So Stephen asked,

>         is there much to be gained, from a computational accuracy
>         point of view, from having a (new) Stata function that would
>         compute "normal log probabilities" directly?  [In -ml-
>         evaluation programs, I often compute normal probabilities
>         and then take logs to get a likelihood contribution term.]

Although the difference in domain has no practical importance in most
applications, we find Zeileis and Kleiber's case very interesting and we
have been looking into adding a function that directly approximates
ln(normal(y)).


        --David
          [email protected]



References 
----------
Achim Zeileis and Christian Kleiber.  (2005).  "Validating multiple
structural change models -- a case study" Journal of Applied Econometrics,
20: 685-690. 


---------------------------------- BEGIN --- n.do --- CUT HERE -------
clear
set obs 21
gen double  x = 80 + (_n-1)/2
gen double  y = -4.08*sqrt(x)
gen double p = normal(y)
gen double lnp = ln(p)
list , sep(1)
---------------------------------- End   --- n.do --- CUT HERE -------

---------------------------------- BEGIN --- n.log --- CUT HERE -------
. do n

. clear

. 
. set obs 21
obs was 0, now 21

. gen double  x = 80 + (_n-1)/2

. gen double  y = -4.08*sqrt(x)

. gen double p = normal(y)

. gen double lnp = ln(p)
(10 missing values generated)

. list , sep(1)

     +--------------------------------------------+
     |    x            y           p          lnp |
     |--------------------------------------------|
  1. |   80   -36.492629   7.26e-292    -670.3728 |
     |--------------------------------------------|
  2. | 80.5   -36.606491   1.13e-293   -674.53751 |
     |--------------------------------------------|
  3. |   81       -36.72   1.75e-295    -678.7022 |
     |--------------------------------------------|
  4. | 81.5   -36.833159   2.72e-297   -682.86687 |
     |--------------------------------------------|
  5. |   82   -36.945971   4.23e-299   -687.03153 |
     |--------------------------------------------|
  6. | 82.5    -37.05844   6.57e-301   -691.19616 |
     |--------------------------------------------|
  7. |   83   -37.170569   1.02e-302   -695.36078 |
     |--------------------------------------------|
  8. | 83.5    -37.28236   1.58e-304   -699.52538 |
     |--------------------------------------------|
  9. |   84   -37.393818   2.46e-306   -703.68996 |
     |--------------------------------------------|
 10. | 84.5   -37.504944   3.83e-308   -707.85452 |
     |--------------------------------------------|
 11. |   85   -37.615741   5.94e-310   -712.01907 |
     |--------------------------------------------|
 12. | 85.5   -37.726214           0            . |
     |--------------------------------------------|
 13. |   86   -37.836363           0            . |
     |--------------------------------------------|
 14. | 86.5   -37.946193           0            . |
     |--------------------------------------------|
 15. |   87   -38.055707           0            . |
     |--------------------------------------------|
 16. | 87.5   -38.164905           0            . |
     |--------------------------------------------|
 17. |   88   -38.273793           0            . |
     |--------------------------------------------|
 18. | 88.5   -38.382371           0            . |
     |--------------------------------------------|
 19. |   89   -38.490643           0            . |
     |--------------------------------------------|
 20. | 89.5   -38.598611           0            . |
     |--------------------------------------------|
 21. |   90   -38.706279           0            . |
     +--------------------------------------------+
---------------------------------- End   --- n.log --- CUT HERE -------


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
Prev by Date: st: RE: Labelling part of graph axis?
Next by Date: Re: st: standard error of the odds ratio
Previous by thread: st: computation of normal probabilities at extreme values
Next by thread: st: RE: ordering dates
Index(es):
- Date
- Thread