The Stata listserver

st: RE: Truncating x-axis in kernel density graphs


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Truncating x-axis in kernel density graphs
Date   Thu, 27 Nov 2003 13:52:51 -0000

If I understand you correctly, you can do this 
in two steps: first generate the smoothed pdf 
from all the data, then graph it for just the 
interval of interest. As you imply, you should 
not estimate the pdf on just part of the support. 
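
A minimal sketch of that two-step, assuming hypothetical 
variables -pcexp- (per capita consumption) and -year- 
(coded 1, 2, 3) and an arbitrary cutoff of 5000; none of 
these names or values come from the original question. 
Each density is estimated from all the observations for 
that year on a common grid, and only the graph is 
restricted: 

```stata
* Step 1: estimate the densities from ALL observations, without graphing.
* The first call creates a common grid of evaluation points in x.
kdensity pcexp, nograph generate(x fx)
kdensity pcexp if year == 1, nograph generate(fx1) at(x)
kdensity pcexp if year == 2, nograph generate(fx2) at(x)
kdensity pcexp if year == 3, nograph generate(fx3) at(x)

* Step 2: graph only the interval of interest (5000 is an arbitrary cutoff).
twoway line fx1 fx2 fx3 x if x <= 5000, sort ytitle(Density)
```

With a very long tail, the default grid may leave few 
points inside the plotted interval; the n() option of 
-kdensity- asks for more evaluation points. 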

Other options in nearby territory: 

1. Transform the variable and work entirely on the 
transformed scale. 

2. Do the kernel estimation on a log scale 
and back-transform the density 
function estimate to the raw scale. The manipulations 
are easy but a little beyond what Stata 8's -kdensity- 
does by itself. The calculus and Stata code can be 
worked out by looking within -mdensity- from SSC. 
The effort is well worth it: with highly skewed data 
it can be difficult to choose a single kernel width 
that is sensible both near the mode(s) and in the 
extreme tail, and usually the tail looks lumpier than 
it should. Back-transformation of this kind has been 
suggested as an extension of -kdensity-. 
There is no public port of -mdensity- to Stata 8. 
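
The calculus behind option 2 is the change-of-variables 
rule: if y = ln(x) has density g(y), then x has density 
f(x) = g(ln x) / x. A rough sketch of one way to do this 
with -kdensity- alone, assuming a hypothetical, strictly 
positive variable -pcexp- and an arbitrary cutoff of 5000. 
This is not the code inside -mdensity-, just an 
illustration of the calculation: 

```stata
* Work on the log scale (requires pcexp > 0).
generate double lnc = ln(pcexp)

* Kernel estimate g(y) of the density of y = ln(pcexp), on a grid y.
kdensity lnc, nograph generate(y gy)

* Back-transform: grid points exp(y), density f(x) = g(y) / x.
generate double xraw = exp(y)
generate double fraw = gy / xraw

* Graph on the raw scale, showing only part of the range.
twoway line fraw xraw if xraw <= 5000, sort ytitle(Density)
```

Graphing gy against y directly, without the 
back-transformation, would be option 1. 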

Nick 
n.j.cox@durham.ac.uk 

P.S. "skewed to the right" is, I think, the more common 
terminology when the right-hand tail is long. 
Skewness takes its name from whichever tail is 
longer, not from where the main hump is. 

Ramani Gunatilaka wrote:

> I am plotting kernel density functions of per capita 
> household consumption for three years on the same graph.
> All three distributions are skewed to the left and have 
> very long tails extending to the right.
> I would like to truncate the x axis at a certain 
> consumption value so that the long tail is dropped and the 
> area where all the action is, stretched across so that the 
> changes are more visible.
> I have explored the graph scale options and Stata list 
> archives but couldn't find anything that related to my 
> particular problem. 
> If I were to specify a shorter range of consumption values 
> than the entire data set, wouldn't it distort results?
> Would anyone be able to advise of any alternative?
