
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: How to plot cdf after corrected kernel density


From   Nick Cox <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: How to plot cdf after corrected kernel density
Date   Fri, 4 Oct 2013 11:03:15 +0100

-akdensity- from Philippe Van Kerm (SJ) is an excellent command, but
I don't see options to respect lower and upper bounds, as Monica's
problem evidently requires. Philippe will correct me if I am wrong.

However, her post does not dwell on this aspect and she uses an
accessible example (-mpg- in the auto dataset), for which this problem
does not bite.

In practice, -akdensity- appears to estimate the density over a range
wider than the observed data, which might entail projecting beyond the
natural support of the variable.

The advice here depends a little on what the aim is, which could range
from just wanting a nicer graph for display (because you don't trust
the irregularities that are visible) to wanting numerical estimates
too for some later purpose.

Clearly there is no such thing as "the" smoothed cdf, as it is easy to
think of several ways to get a cdf, either directly or indirectly.

Also, for most purposes it would be expected that you might have to
explain how you got a smoothed cdf. In principle, naturally, the cdf
is just the integral of the pdf, but any method that is smart about
calculating the pdf but crude about integrating it may not be optimal.
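
As a minimal sketch of the "integrate the pdf" route, the following uses
official -kdensity- and -integ- on the auto data. It is an illustration
under stated assumptions, not a recommended method: it uses the default
kernel, bandwidth and evaluation grid, applies no boundary correction, and
the trapezoid rule is exactly the kind of crude integration mentioned
above. The variable names x, d and cdf are arbitrary choices. Note that
-cumul- applied to the density values gives the empirical cdf of those
values, not the integral of the density over mpg.

```stata
sysuse auto, clear

* Kernel density of mpg on the default evaluation grid:
* x holds the grid points, d the estimated density at each point.
kdensity mpg, generate(x d) nograph

* Running integral of d over x by the trapezoid rule.
* The total need not equal 1 exactly, as the grid may not cover
* the whole support of the estimated density.
integ d x, generate(cdf) trapezoid

* Integrated density against the values of the variable.
line cdf x, sort ytitle("Integrated density") xtitle("Mileage (mpg)")
```

If the final value of cdf is noticeably below 1, that is a sign that some
of the estimated density lies outside the grid.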

I am fond of kernel density methods and often use them, but their
emergence as a default or standard method seems a little accidental.
As they are essentially local methods, they don't place a high premium
(or indeed any at all) on global smoothness. For visualization they
can be a little conservative, which is usually an excellent thing, as
researchers should always be on the lookout for quirky details of
their distributions.

Other methods (including logspline density estimation) work well, but
on a quick search I can't find a Stata implementation.

All that said, I still prefer estimating quantiles; it's really the
same problem, as graphically you are just exchanging axes.
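
To make the point about exchanging axes concrete, here is a small sketch
using official -cumul- on the auto data. It shows only the raw
(unsmoothed) empirical version; how to smooth the quantiles is left open,
and the variable name F is an arbitrary choice.

```stata
sysuse auto, clear

* Empirical cdf: F is the cumulative proportion at each value of mpg;
* -equal- gives tied values the same cumulative proportion.
cumul mpg, generate(F) equal

* cdf view: cumulative probability against the variable.
line F mpg, sort connect(J) ytitle("Cumulative probability")

* Quantile view: the same points with the axes exchanged.
line mpg F, sort xtitle("Cumulative probability")
```

Any smoothing applied to the points in the second graph carries over to
the cdf by swapping the axes back.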

Nick
[email protected]


On 3 October 2013 23:20, Alfonso S <[email protected]> wrote:

> I suggest you download the package akdensity (st0037_3). It computes an adaptive kernel density estimate and generates the cdf variable as well. Use the code below to check it out.
>
> sysuse auto
> akdensity mpg, g(a b) cdf(cb)
> line cb a
>
> Let me know if that is what you were looking for.

From: Nick Cox <[email protected]>

> The bottom line in the post you cite advises
>
> "I prefer to get smoother cumulative distribution functions directly from
> estimated quantiles."
>
> I agree with that.

On 3 October 2013 21:45, Jain, Monica (HarvestPlus) <[email protected]> wrote:

>> I am using -kdens- and I do not know how to plot the cumulative distribution function. I am using Stata 13 for Windows.
>>
>> I am using -kdens- to estimate kernel densities, correcting for bounded variables using the linear combination method. I want to plot the cumulative distribution function for the estimated kernel densities. On one of the Statalist threads (http://www.stata.com/statalist/archive/2005-04/msg00798.html), the following method has been suggested to plot them:
>>
>> sysuse auto
>> _kdens mpg, g(b a)
>> cumul b, g(cb)
>> line cb b, sort
>>
>> With the above command, I get the densities on the x-axis, rather than the [x]. I looked all over the web to check if I can find how to do it, but I have not been successful. If I use the following command:
>>
>> line cb a, sort
>>
>> I get a weird triangle-shaped graph.


