# RE: st: kdensity

From: "Nick Cox"
Subject: RE: st: kdensity
Date: Mon, 25 Apr 2005 21:07:27 +0100

```
Let's compare like with like.

-cumul- (official Stata) produces a cumulative distribution,
leaves it in memory as a variable,
but does not plot it.

-distplot- [sic] (SJ) produces a cumulative
distribution on the fly, and does plot it.
(It can do more than that, which makes it
more useful, but that is a side issue.)

The underlying calculations are the same.

If we take your example, suppress the -nograph-
and insert the -sort- which is needed, then
it is clearer what is going on:

sysuse auto

(1)
kdensity mpg, g(a b)
cumul b, g(cb)
line cb b, sort

(2)
distplot line mpg

What you are doing is comparing

(1) the integral of a smoothed density function

(2) an unsmoothed cumulative distribution function.

(1) is indeed smoother than (2). It would be
surprising if it were not. But this is nothing
to do with -cumul- and everything to do with
what you did with -kdensity-.

That said, I prefer to get smoother cumulative
distribution functions directly from
estimated quantiles.

Nick
n.j.cox@durham.ac.uk

> The cumul command provides a smoother plot than displot.
>
> e.g.:
>
> sysuse auto
> kdensity mpg, g(a b) nograph
> cumul b, g(cb)
> line cb b
> displot line mpg

Nick Cox

> If you want a plot of a (?smoothed) distribution function,
> this is at best a rather indirect route.
>
> Note first that -distplot- is a program dedicated
> to plotting distribution functions. -search distplot-
> points to locations. It is smart enough that you can
> go directly to something like
>
> . distplot line Y X
>
> without doing overlays.
>
> If the results are not smooth enough, an alternative is
> to base a plot on estimated rather than observed quantiles.
>
> One command for quantile estimation is -hdquantile-
> from SSC.
>
> Nick
> n.j.cox@durham.ac.uk
>
>
> > I used to:
> > kdensity X, gen(aa bb) nogr
> > cumul bb, g(cum_bb)
> > ksm cum_bb bb
> > ?
> > If you want the density at each point
> > you could :
> > qui cou
> > local n = r(N)
> > kdensity X, gen(aa bb) nogr n(`n')
> > etc...
>
> Branko Milanovic
>
> > When you do kdensity X, STATA charts a kernel density fct of X's. Now,
> > is there a command that would allow me to take the density function thus
> > generated and chart a cumulative density (or distribution) function?
> > Ideally, I would like to do that for both densities, that is
> > to go from an overlay graph
> >
> > twoway (kdensity X) (kdensity Y)
> >
> > To a similar overlay graph of two cumulative density functions.
> >
> > Or is the only way to use:
> >
> > kdensity X, gen(aa bb)
> >
> > And then generate a cumulative function of aa? By the way, I
> > tried that but the graph did not turn out well.

```
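The contrast drawn in the reply, between the integral of a smoothed density and an unsmoothed cumulative distribution function, can be illustrated outside Stata. The following is a minimal sketch, not a reimplementation of -kdensity-, -cumul-, or -distplot-. The sample values and the bandwidth `h` are hypothetical, and a Gaussian kernel is assumed only because its integral has a closed form; it need not match the kernel that -kdensity- uses.

```python
import math

def ecdf(sample, x):
    """Unsmoothed empirical CDF: fraction of observations <= x."""
    return sum(1 for v in sample if v <= x) / len(sample)

def smoothed_cdf(sample, x, h):
    """Integral up to x of a Gaussian kernel density estimate with bandwidth h."""
    def phi(z):
        # Standard normal CDF, written in terms of the error function.
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sum(phi((x - v) / h) for v in sample) / len(sample)

def max_jump(values):
    """Largest increase between consecutive grid points."""
    return max(b - a for a, b in zip(values, values[1:]))

# Hypothetical data for illustration only (not the auto dataset).
sample = [12, 14, 14, 17, 18, 19, 21, 22, 25, 30]
h = 2.0  # bandwidth picked by hand, an assumption

grid = [10 + 0.5 * i for i in range(45)]  # 10.0, 10.5, ..., 32.0
ecdf_vals = [ecdf(sample, x) for x in grid]
smooth_vals = [smoothed_cdf(sample, x, h) for x in grid]

print("largest step, unsmoothed CDF:", max_jump(ecdf_vals))
print("largest step, smoothed CDF:  ", max_jump(smooth_vals))
```

The unsmoothed function rises in steps at each observed value, while the integrated kernel estimate rises gradually. The extra smoothness comes from the kernel smoothing step, not from the way the cumulative function is computed afterwards.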