Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

st: strange kdensity behavior

From   Rebecca Pope <>
Subject   st: strange kdensity behavior
Date   Fri, 31 May 2013 11:15:48 -0500

Hi all,
I'm seeing something strange when I'm using -kdensity- and I'm
wondering if anyone else can duplicate this.

When I plot a variable using -kdensity-, I get densities that do not
make sense. However, if I graph that _same_ variable with the
-addplot((kdensity))- option, the densities are correct. Here is an

clear all

set seed 72114
set obs 10000

local mu = 0
local sig = 1
di "Mode of Lognormal is " %5.4f exp(`mu'-`sig'^2)
di "Mean of Lognormal is " %5.4f exp(`mu'+`sig'^2/2)
di "Variance of Lognormal is " %5.4f exp(2*`mu'+`sig'^2)*(exp(`sig'^2)-1)

/* pdf of Lognormal is f(x) =
1/(x*sqrt(2*pi*sigma^2))*exp(-((ln(x)-mu)^2)/(2*sigma^2)) */
local x = exp(`mu'-`sig'^2) /* density reaches maximum at modal value */
di "max f(x) = " %4.3f

/* use runiform() to get a random p-value */
/* use invnormal() to convert random p-value to a standard z-score */
/* make adjustments so that X is from ~N(mu,sigma^2) rather than
standard normal */

gen lognorm = exp(`mu'+`sig'*(invnormal(runiform())))
label var lognorm "Lognormal: mu=`mu', sigma=`sig'"

kdensity lognorm, lwidth(thick) addplot((kdensity lognorm, lpattern(dash))) ///
legend(order(1 "Default Plot" 2 "Added Plot"))

When I run this, the first plot peaks about 0.45 and the second peaks
about 0.6, which is much closer to the true maximum of the pdf with
these parameters.

As an add-on question, is there a way to "zoom in" on the lower end of
the plot? If I use -if-, -kdensity- calculates the density over that
range (which I don't want).
To see this:
kdensity lognorm if lognorm<=10, addplot((kdensity lognorm if
lognorm<=5, lpattern(dash))) ///
legend(order(1 "Default Plot" 2 "Added Plot")) name(plot2)

You'll notice that the greatest change is in Default Plot.

If I use the x-axis options, it just changes the labeling without
excluding the higher values of X.

I'm running Stata 12.1 on Windows 7. Stata is updated to 20 Mar 13 &
when I checked for updates, nothing new showed up.

*   For searches and help try:

© Copyright 1996–2017 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index