Stata The Stata listserver

RE: st: kdensity


From   adiallo5@worldbank.org
To   statalist@hsphsun2.harvard.edu
Subject   RE: st: kdensity
Date   Mon, 25 Apr 2005 16:45:38 -0400

Nick,
You're right. The smoothing comes from
kdensity, not from cumul.
By "cumul", I meant my method of combining
the kdensity and cumul commands.
I also agree that the sort option is needed in the line command.
Finally, I think both methods answer Branko's question.
distplot, which I have just discovered, is a useful and fast way to do
this, and I will use it in the future for quick inference on the
distribution of variables.
Again, apologies for any confusion.
Amadou DIALLO.




                                                                                                                                               
                      "Nick Cox"                                                                                                               
                      <n.j.cox@durham.ac.uk>           To:       <statalist@hsphsun2.harvard.edu>                                              
                      Sent by: owner-statalist@hsphsun2.harvard.edu
                      cc:
                      Subject: RE: st: kdensity
                      04/25/2005 04:07 PM
                      Please respond to statalist




Let's compare like with like.

-cumul- (official Stata) produces a cumulative distribution,
leaves it in memory as a variable,
but does not plot it.

-distplot- [sic] (SJ) produces a cumulative
distribution on the fly, and does plot it.
(It can do more than that, which makes it
more useful, but that is a side issue.)

The underlying calculations are the same.
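
As a minimal sketch of that point, using official commands only: -cumul-
stores the cumulative distribution in a new variable, and a separate
-line- call plots it. The variable name cmpg is just an illustrative
choice, not something from the thread.

```stata
* Sketch: unsmoothed cumulative distribution from official Stata commands
sysuse auto, clear

* cumul leaves the cumulative distribution in memory as a new variable
cumul mpg, generate(cmpg)

* sort is needed so that points are connected in order of mpg
line cmpg mpg, sort
```

This should trace out essentially the same unsmoothed curve that
-distplot- draws on the fly.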

If we take your example, suppress the -nograph-
and insert the -sort- which is needed, then
it is clearer what is going on:

sysuse auto

(1)
kdensity mpg, g(a b)
cumul b, g(cb)
line cb b, sort

(2)
distplot line mpg

What you are doing is comparing

(1) the integral of a smoothed density function

(2) an unsmoothed cumulative distribution function.

(1) is indeed smoother than (2). It would be
surprising if it were not. But this is nothing
to do with -cumul- and everything to do with
what you did with -kdensity-.
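
If the aim is literally the integral of the kernel density estimate, one
possible sketch is to integrate the density numerically over its grid of
evaluation points with the official -integ- command. This is an
assumption about what is wanted, and it is a different calculation from
applying -cumul- to the density values. The variable names x, fx and Fx
are illustrative only.

```stata
* Sketch: smoothed distribution function by integrating a kernel density
sysuse auto, clear

* x holds the evaluation points, fx the estimated density at those points
kdensity mpg, generate(x fx) nograph

* numerically integrate fx with respect to x; Fx is the running integral
integ fx x, generate(Fx)

* plot the integrated density against the evaluation points
line Fx x, sort
```

How close the curve gets to 1 at the right-hand end depends on the grid
and bandwidth that -kdensity- uses.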

That said, I prefer to get smoother cumulative
distribution functions directly from
estimated quantiles.

Nick
n.j.cox@durham.ac.uk

adiallo5@worldbank.org

> The cumul command provides a smoother plot than displot.
>
> e.g.:
>
> sysuse auto
> kdensity mpg, g(a b) nograph
> cumul b, g(cb)
> line cb b
> displot line mpg

Nick Cox

> If you want a plot of a (?smoothed) distribution function,
> this is at best a rather indirect route.
>
> Note first that -distplot- is a program dedicated
> to plotting distribution functions. -search distplot-
> points to locations. It is smart enough that you can
> go directly to something like
>
> . distplot line Y X
>
> without doing overlays.
>
> If the results are not smooth enough, an alternative is
> to base a plot on estimated rather than observed quantiles.
>
> One command for quantile estimation is -hdquantile-
> from SSC.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Amadou Diallo
>
> > I used to:
> > kdensity X, gen(aa bb) nogr
> > cumul bb, g(cum_bb)
> > ksm cum_bb bb
> > ?
> > If you want the density at each point
> > you could :
> > qui cou
> > local n = r(N)
> > kdensity X, gen(aa bb) nogr n(`n')
> > etc...
>
> Branko Milanovic
>
> > When you do kdensity X, STATA charts a kernel density fct of X's.
> > Now, is there a command that would allow me to take the density
> > function thus generated and chart a cumulative density (or
> > distribution) function?
> > Ideally, I would like to do that for both densities, that is
> > to go from a overlay graph
> >
> > twoway (kdensity X) (kdensity Y)
> >
> > To a similar overlay graph of two cumulative density functions.
> >
> > Or is the only way to use:
> >
> > kdensity X, gen(aa bb)
> >
> > And then generate a cumulative function of aa? By the way, I
> > tried that but the graph did not turn out well.


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/





