Stata 15 help for kdensity

[R] kdensity -- Univariate kernel density estimation


Syntax

    kdensity varname [if] [in] [weight] [, options]

options                        Description
-------------------------------------------------------------------------
Main
  kernel(kernel)               specify kernel function; default is
                                 kernel(epanechnikov)
  bwidth(#)                    half-width of kernel
  generate(newvar_x newvar_d)  store the estimation points in newvar_x and
                                 the density estimate in newvar_d
  n(#)                         estimate density using # points; default is
                                 min(N, 50)
  at(var_x)                    estimate density using the values specified
                                 by var_x
  nograph                      suppress graph

Kernel plot
  cline_options                affect rendition of the plotted kernel
                                 density estimate

Density plots
  normal                       add normal density to the graph
  normopts(cline_options)      affect rendition of normal density
  student(#)                   add Student's t density with # degrees of
                                 freedom to the graph
  stopts(cline_options)        affect rendition of the Student's t density

Add plots
  addplot(plot)                add other plots to the generated graph

Y axis, X axis, Titles, Legend, Overall
  twoway_options               any options other than by() documented in
                                 [G-3] twoway_options
-------------------------------------------------------------------------

kernel                         Description
-------------------------------------------------------------------------
  epanechnikov                 Epanechnikov kernel function; the default
  epan2                        alternative Epanechnikov kernel function
  biweight                     biweight kernel function
  cosine                       cosine trace kernel function
  gaussian                     Gaussian kernel function
  parzen                       Parzen kernel function
  rectangle                    rectangle kernel function
  triangle                     triangle kernel function
-------------------------------------------------------------------------

fweights, aweights, and iweights are allowed; see weight.


Menu

    Statistics > Nonparametric analysis > Kernel density estimation


Description

kdensity produces kernel density estimates and graphs the result.


Options

    +------+
----+ Main +-------------------------------------------------------------

kernel(kernel) specifies the kernel function for use in calculating the kernel density estimate. The default kernel is the Epanechnikov kernel (epanechnikov).

bwidth(#) specifies the half-width of the kernel, the width of the density window around each point. If bwidth() is not specified, the "optimal" width is calculated and used; see [R] kdensity. The optimal width is the width that would minimize the mean integrated squared error if the data were Gaussian and a Gaussian kernel were used, so it is not optimal in any global sense. In fact, for multimodal and highly skewed densities, this width is usually too wide and oversmooths the density (Silverman 1986).
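As a rough check on the default described above, the width chosen when bwidth() is omitted can be reproduced by hand. The sketch below assumes the rule of thumb h = 0.9*m/n^(1/5), with m = min(sd, IQR/1.349), as given in the Methods and formulas section of [R] kdensity, and unweighted data. The local macro h is an illustrative name. The two displayed values should agree, but treat the formula as an assumption to verify against your installation.

    . sysuse auto, clear
    . * width chosen by kdensity when bwidth() is not specified
    . quietly kdensity length, nograph
    . local h = r(bwidth)
    . display `h'
    . * assumed rule of thumb: 0.9*min(sd, IQR/1.349)/n^(1/5)
    . quietly summarize length, detail
    . display 0.9*min(r(sd), (r(p75)-r(p25))/1.349)/r(N)^(1/5)
    . * replot with half the default width to look for oversmoothing
    . kdensity length, bwidth(`=`h'/2')

Because the default tends to oversmooth multimodal or skewed data, comparing the default graph with one drawn at a narrower width is a quick way to see whether features are being hidden.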

generate(newvar_x newvar_d) stores the results of the estimation. newvar_x will contain the points at which the density is estimated. newvar_d will contain the density estimate.

n(#) specifies the number of points at which the density estimate is to be evaluated. The default is min(N,50), where N is the number of observations in memory.

at(var_x) specifies a variable that contains the values at which the density should be estimated. This option allows you to more easily obtain density estimates for different variables or different subsamples of a variable and then overlay the estimated densities for comparison.

nograph suppresses the graph. This option is often used with the generate() option.
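The generate(), at(), and nograph options are often combined to compare subsamples on a common grid. The following sketch uses the auto dataset; the variable names x, fx, fx0, and fx1 are arbitrary. It assumes that, when at() supplies the evaluation points, generate() is given only the new variable that will hold the density.

    . sysuse auto, clear
    . * build the evaluation grid from the full sample
    . kdensity weight, nograph generate(x fx)
    . * estimate each subsample's density at the same grid points
    . kdensity weight if foreign==0, nograph generate(fx0) at(x)
    . kdensity weight if foreign==1, nograph generate(fx1) at(x)
    . label variable fx0 "Domestic cars"
    . label variable fx1 "Foreign cars"
    . * overlay the two estimates
    . line fx0 fx1 x, sort ytitle(Density)

Each kdensity call selects its own bandwidth unless bwidth() is specified, so the two curves may be smoothed by different amounts.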

    +-------------+
----+ Kernel plot +------------------------------------------------------

cline_options affect the rendition of the plotted kernel density estimate. See [G-3] cline_options.

    +---------------+
----+ Density plots +----------------------------------------------------

normal requests that a normal density be overlaid on the density estimate for comparison.

normopts(cline_options) specifies details about the rendition of the normal curve, such as the color and style of line used. See [G-3] cline_options.

student(#) specifies that a Student's t density with # degrees of freedom be overlaid on the density estimate for comparison.

stopts(cline_options) affects the rendition of the Student's t density. See [G-3] cline_options.
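A minimal illustration of the density plot options, assuming the auto dataset; the choice of 5 degrees of freedom and the line patterns are arbitrary.

    . sysuse auto, clear
    . * overlay a dashed normal curve and a dotted Student's t curve (5 df)
    . kdensity length, normal normopts(lpattern(dash)) student(5) stopts(lpattern(dot))

Distinct line patterns keep the reference curves distinguishable from the kernel estimate when the graph is printed in monochrome.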

    +-----------+
----+ Add plots +--------------------------------------------------------

addplot(plot) provides a way to add other plots to the generated graph. See [G-3] addplot_option.
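One possible use of addplot(), sketched with the auto dataset, is to draw a second kernel density on the same graph. This assumes that the twoway plottype kdensity is accepted inside addplot(); the legend labels are illustrative.

    . sysuse auto, clear
    . * first curve from kdensity itself, second from the twoway kdensity plottype
    . kdensity weight if foreign==0, addplot(kdensity weight if foreign==1) legend(on order(1 "Domestic" 2 "Foreign"))

The two curves are estimated separately, so their bandwidths may differ unless each is given the same bwidth() value.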

    +-----------------------------------------+
----+ Y axis, X axis, Titles, Legend, Overall +--------------------------

twoway_options are any of the options documented in [G-3] twoway_options, excluding by(). These include options for titling the graph (see [G-3] title_options) and for saving the graph to disk (see [G-3] saving_option).


Examples

Setup
    . sysuse auto

Graph kernel density estimates for length
    . kdensity length

Same as above, but use 20 for the half-width of the kernel
    . kdensity length, bw(20)

Obtain kernel density estimates for weight using the Parzen kernel function, store the estimation points in x2 and the density estimates in parzen, and suppress the graph
    . kdensity weight, kernel(parzen) gen(x2 parzen) nograph

Stored results

kdensity stores the following in r():

Scalars
  r(bwidth)    kernel bandwidth
  r(n)         number of points at which the estimate was evaluated
  r(scale)     density bin width

Macros
  r(kernel)    name of kernel
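The stored results can be inspected immediately after estimation. A short sketch using the auto dataset; the local macro name bw is illustrative.

    . sysuse auto, clear
    . kdensity length, kernel(gaussian) nograph
    . * list everything left behind in r()
    . return list
    . * r(kernel) is a macro, so it is referenced in macro quotes
    . display "kernel: `r(kernel)'"
    . * copy the bandwidth before another r-class command replaces r()
    . local bw = r(bwidth)
    . display "bandwidth: `bw'"

Results in r() are replaced by the next r-class command, so copy any value needed later into a macro or scalar first.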


Reference

Silverman, B. W. 1986. Density Estimation for Statistics and Data Analysis. London: Chapman & Hall.
