## Stata 15 help for kdensity

```
[R] kdensity -- Univariate kernel density estimation

Syntax

kdensity varname [if] [in] [weight] [, options]

options                        Description
-------------------------------------------------------------------------
Main
  kernel(kernel)               specify kernel function; default is
                                 kernel(epanechnikov)
  bwidth(#)                    half-width of kernel
  generate(newvar_x newvar_d)  store the estimation points in newvar_x
                                 and the density estimate in newvar_d
  n(#)                         estimate density using # points; default
                                 is min(N, 50)
  at(var_x)                    estimate density using the values
                                 specified by var_x
  nograph                      suppress graph

Kernel plot
  cline_options                affect rendition of the plotted kernel
                                 density estimate

Density plots
  normal                       add normal density to the graph
  normopts(cline_options)      affect rendition of normal density
  student(#)                   add Student's t density with # degrees of
                                 freedom to the graph
  stopts(cline_options)        affect rendition of the Student's t
                                 density

Add plots
  addplot(plot)                add other plots to the generated graph

Y axis, X axis, Titles, Legend, Overall
  twoway_options               any options other than by() documented in
                                 [G-3] twoway_options
-------------------------------------------------------------------------

kernel                         Description
-------------------------------------------------------------------------
epanechnikov                   Epanechnikov kernel function; the default
epan2                          alternative Epanechnikov kernel function
biweight                       biweight kernel function
cosine                         cosine trace kernel function
gaussian                       Gaussian kernel function
parzen                         Parzen kernel function
rectangle                      rectangle kernel function
triangle                       triangle kernel function
-------------------------------------------------------------------------

fweights, aweights, and iweights are allowed; see weight.

Menu

Statistics > Nonparametric analysis > Kernel density estimation

Description

kdensity produces kernel density estimates and graphs the result.

Options

+------+
----+ Main +-------------------------------------------------------------

kernel(kernel) specifies the kernel function for use in calculating the
kernel density estimate.  The default kernel is the Epanechnikov
kernel (epanechnikov).

bwidth(#) specifies the half-width of the kernel, the width of the
density window around each point.  If bwidth() is not specified, the
"optimal" width is calculated and used; see [R] kdensity.  The
optimal width is the width that would minimize the mean integrated
squared error if the data were Gaussian and a Gaussian kernel were
used, so it is not optimal in any global sense.  In fact, for
multimodal and highly skewed densities, this width is usually too
wide and oversmooths the density (Silverman 1986).

generate(newvar_x newvar_d) stores the results of the estimation.
newvar_x will contain the points at which the density is estimated.
newvar_d will contain the density estimate.

n(#) specifies the number of points at which the density estimate is to
be evaluated.  The default is min(N,50), where N is the number of
observations in memory.

at(var_x) specifies a variable that contains the values at which the
density should be estimated.  This option allows you to more easily
obtain density estimates for different variables or different
subsamples of a variable and then overlay the estimated densities for
comparison.

nograph suppresses the graph.  This option is often used with the
generate() option.

+-------------+
----+ Kernel plot +------------------------------------------------------

cline_options affect the rendition of the plotted kernel density
estimate. See [G-3] cline_options.

+---------------+
----+ Density plots +----------------------------------------------------

normal requests that a normal density be overlaid on the density estimate
for comparison.

normopts(cline_options) specifies details about the rendition of the
normal curve, such as the color and style of line used. See [G-3]
cline_options.

student(#) specifies that a Student's t density with # degrees of freedom
be overlaid on the density estimate for comparison.

stopts(cline_options) affects the rendition of the Student's t density.
See [G-3] cline_options.

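+-----------+
----+ Add plots +--------------------------------------------------------

addplot(plot) provides a way to add other plots to the generated graph.
See [G-3] addplot_option.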

+-----------------------------------------+
----+ Y axis, X axis, Titles, Legend, Overall +--------------------------

twoway_options are any of the options documented in [G-3] twoway_options,
excluding by().  These include options for titling the graph (see
[G-3] title_options) and for saving the graph to disk (see [G-3]
saving_option).

Examples

Setup
. sysuse auto

Graph kernel density estimates for length
. kdensity length

Same as above, but use 20 for the half-width of the kernel
. kdensity length, bw(20)

Obtain kernel density estimates for weight using the Parzen kernel
function, store the estimation points in x2 and the density estimates
in parzen, and suppress the graph
. kdensity weight, kernel(parzen) gen(x2 parzen) nograph

Stored results

kdensity stores the following in r():

Scalars
r(bwidth)      kernel bandwidth
r(n)           number of points at which the estimate was evaluated
r(scale)       density bin width

Macros
r(kernel)      name of kernel

Reference

Silverman, B. W. 1986.  Density Estimation for Statistics and Data
Analysis.  London: Chapman & Hall.

```
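
The at(), generate(), and bwidth() descriptions above suggest two common patterns, sketched below as an illustration rather than as part of the official help text. The first overlays subsample densities on a shared evaluation grid; the second reuses the default half-width from r(bwidth) to try a narrower window when the default may oversmooth. The variable names x, fx, fx0, and fx1 and the local macro half are illustrative choices, and the sketch assumes none of those variables already exist in the dataset.

```stata
sysuse auto, clear

* Pattern 1: compare subsamples on one grid.
* First pass stores the evaluation points in x (fx is the full-sample density).
kdensity weight, nograph generate(x fx)

* With at(), generate() takes only the name of the density variable.
kdensity weight if foreign == 0, nograph generate(fx0) at(x)
kdensity weight if foreign == 1, nograph generate(fx1) at(x)

label variable fx0 "Domestic cars"
label variable fx1 "Foreign cars"
line fx0 fx1 x, sort ytitle("Density")

* Pattern 2: inspect the default half-width, then halve it.
* r(bwidth) is overwritten by the next kdensity call, so save it first.
kdensity weight, nograph
display "default half-width = " r(bwidth)
local half = r(bwidth) / 2

* Narrower window, with a normal density overlaid for reference.
kdensity weight, bwidth(`half') normal
```

Because both subsample estimates are evaluated at the same points in x, the two curves can be drawn with a single line command. Halving the half-width is an arbitrary starting point; the appropriate amount of smoothing depends on the data.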