Statalist The Stata Listserver


RE: st: RE: aweight option in kdensity


From   "Ben Jann" <ben.jann@soz.gess.ethz.ch>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: aweight option in kdensity
Date   Fri, 15 Sep 2006 22:23:17 +0200

Vora wrote:
> If I do:
> 
> [1]
> .kdensity income
> 
> Then the default kernel is Epanechnikov
> and the default bandwidth is the "optimal"
> bandwidth (Silverman).
> 
> ----------------------
> If I do:
> 
> [2]
> .kdens income
> 
> Then the default kernel is Epan2
> and the default bandwidth is Silverman.
> [In your paper, equation 28 and equation 29
> are actually the same thing?]
> 
> ----------------------
> So if I want kdens to provide the same graph
> as the default kdensity then I should do:
> 
> [3]
> .kdens income, k(e)
> 
> ----------------------

[Formula 28 is a special case of formula 29: it is formula 29
evaluated for the Gaussian kernel.]

Graphs [1] and [3] should look different, but not very different. There
are several differences between -kdensity- and -kdens-, among them:

1. -kdens- uses binned data whereas -kdensity- uses the raw data. This
causes slight differences in the estimates. 

2. -kdens- uses a version of Silverman's "optimal" bandwidth that takes
into account the canonical bandwidth of the kernel function. -kdensity-
does not make this adjustment (it always uses the Gaussian scaling). The
Gaussian and Epanechnikov kernels have different canonical bandwidths,
so the bandwidths used in [1] and [3] will differ, though not by much.
In the example below, -kdens- with the Gaussian kernel reproduces the
-kdensity- bandwidth exactly, while the Epanechnikov bandwidth is about
1% smaller:

. sysuse auto, clear
(1978 Automobile Data)

. kdensity price, nograph

. di r(width)
605.64238

. kdens price, k(epanechnikov) nograph
(n2() set to 74)

. di r(width)
599.61225

. kdens price, k(gaussian) nograph
(n2() set to 74)

. di r(width)
605.64238

3. The default in -kdens- is to return 512 estimation points (or _N if
_N<512). -kdensity- returns 50.

4. The range of estimation points is set differently.
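
The bandwidths in point 2 can be reproduced by hand. The following is
only a sketch under two assumptions: that -kdensity- uses the rule of
thumb 0.9*min(sd, IQR/1.349)*n^(-1/5), and that the adjustment in
-kdens- amounts to rescaling by the ratio of canonical bandwidths,
which for variance-one kernels is the fifth root of the ratio of the
integrals of K(u)^2. The scalar names are made up for the illustration.

```stata
sysuse auto, clear
quietly summarize price, detail

* Rule of thumb with Gaussian scaling (assumed -kdensity- default)
scalar h_gauss = 0.9 * min(r(sd), (r(p75) - r(p25)) / 1.349) / r(N)^(1/5)

* Integral of K(u)^2 for the variance-one Gaussian and
* Epanechnikov (support +/- sqrt(5)) kernels
scalar R_gauss = 1 / (2 * sqrt(_pi))
scalar R_epan  = 3 / (5 * sqrt(5))

* Assumed -kdens- adjustment: ratio of canonical bandwidths
scalar h_epan = h_gauss * (R_epan / R_gauss)^(1/5)

display h_gauss
display h_epan
```

If these assumptions hold, the two displayed values should come out
close to the 605.64 and 599.61 shown above.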

If you feel that these points do not explain the differences in your
graphs, try something like the following, which forces both commands to
use the same bandwidth and the same evaluation points:

. kdensity income, gen(at d1) nograph
. kdens income, k(e) bw(`r(width)') at(at) gen(d2) nograph
. line d1 d2 at 

Do the two estimates still look different?
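
If the graphs are hard to judge by eye, the two estimates can also be
compared numerically. A minimal sketch, assuming d1, d2 and at were
created by the commands above; absdiff is a made-up variable name:

```stata
* Absolute difference between the two density estimates at each point
generate double absdiff = abs(d1 - d2)
summarize absdiff
```

A maximum that is small relative to the density values themselves would
suggest that the binning in -kdens- accounts for what is left.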

ben
