
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: interpreting output of kdensity command
Date: Thu, 7 Aug 2003 15:25:18 +0100

Kimberley Tran

> To build kernel density graphs in Stata, I created a Do-file for purpose of
> generating a variable within which density measures are taken. This variable
> contained 100 points. From my understanding, the distance between each point
> is the bandwidth. I ran this Do-file prior to using the kdensity command.
> In the resulting kernel density graphs, there are points on the y-axis which
> are greater than 1. How should the y-axis of the resulting kernel density
> graphs be interpreted? Is it the frequency of the distribution?

First off, grid mesh is not the same as bandwidth. The mesh is the spacing of the points at which the density is evaluated; the bandwidth is the width of the kernel used for smoothing.

-kdensity- produces a smoothed estimate of the probability density function. The units of probability density are the reciprocal of the units of the variable whose distribution you are examining. If that variable is measured in metres, the units are 1 / m; if in years, the units are 1 / yr.

The density cannot be negative; beyond that, the only constraint is that the area under the probability density function should be 1. It is perfectly possible for individual ordinates to exceed 1.

For example,

    . use auto
    . gen gpm = 1 / mpg
    . kdensity gpm

I see a density estimate which averages about 15 over a range of about 0.09 - 0.02 = 0.07. Roughly, 15 * 0.07 is about 1, and I am confident that a closer estimate would be nearer 1. (There is usually some small loss in the extreme tails with default choices.)

The units of the density are

    1 / gallons per mile OR miles per gallon

and the units of the variable are by construction

    gallons per mile

Area under the curve has no units, as can be seen by cancelling down

     miles     gallons
    ------- * -------
    gallons     miles

There is a note on this at [R] p.227.

David Finney wrote a very nice paper on "Dimensions of statistics" in Applied Statistics 26, 285-289 (1977).
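The rough area check above can also be done numerically. What follows is only a sketch under some assumptions: it loads the data with -sysuse auto- rather than -use auto-, it saves the grid and the density estimate in two new variables with the illustrative names x and d, and it passes them to -integ-, which integrates one variable with respect to another.

```stata
* Sketch only: x and d are illustrative variable names, not from the post above
sysuse auto, clear
generate gpm = 1 / mpg

* Save the evaluation grid (x) and the density estimate (d) without drawing the graph
kdensity gpm, generate(x d) nograph

* Largest ordinate of the estimate: it can exceed 1 because its units are miles per gallon
summarize d

* Numerical integral of d with respect to x: the unit-free area under the curve
integ d x
display "area under the estimated density = " r(integral)
```

The area should come out close to 1, but probably a little short of it, because the default grid cuts off a small part of the extreme tails.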
Nick
n.j.cox@durham.ac.uk

**Follow-Ups**: **Re: st: RE: interpreting output of kdensity command** *From:* Roger Harbord <roger.harbord@bristol.ac.uk>

**References**: **st: interpreting output of kdensity command** *From:* Kimberley Tran <ktran2@dal.ca>

