Min specified a need to work with only
two variables, in which case
official Stata alone should work fine.
sysuse auto, clear
gen touse = (mpg < .) & (weight < .)
egen fractile = rank(weight) if touse
count if touse
replace fractile = (fractile - 0.5) / r(N)
label var fractile "fraction of data"
lowess mpg fractile
or, if the user-written -locpoly- is installed
(use -search- or -findit-),
locpoly mpg fractile
For "mpg" substitute your y variable and for
"weight" substitute your x variable.
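As a minimal sketch of what the (rank - 0.5) / n step produces, here is a
toy dataset of four values (invented for illustration; the variable names
x, rk and f are arbitrary). By default -egen, rank()- gives tied values the
mean of their ranks, so tied observations share a fractile.

```stata
clear
input x
10
20
20
40
end
* mean ranks for ties: 1, 2.5, 2.5, 4
egen rk = rank(x)
count
* plotting position: (rank - 0.5) / number of observations
gen double f = (rk - 0.5) / r(N)
list x rk f
```

The fractiles lie strictly between 0 and 1 and are symmetric about 0.5,
which is why the 0.5 offset is convenient.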
My program was more ambitious and allows
several predictors simultaneously.
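For a rough idea of what several predictors on one fractile scale looks
like without installing anything, the recipe can be looped by hand. This is
only a sketch using official commands: the names f_weight and f_length are
made up here, and the -///- continuation lines assume it is run from a
do-file rather than typed interactively.

```stata
sysuse auto, clear
* use only observations complete on the response and both predictors
gen byte touse = !missing(mpg, weight, length)
count if touse
local n = r(N)
foreach v of varlist weight length {
    * rank each predictor, then rescale ranks to fractions of the data
    egen f_`v' = rank(`v') if touse
    replace f_`v' = (f_`v' - 0.5) / `n'
}
* overlay one lowess smooth per predictor on the common fractile axis
twoway lowess mpg f_weight, lp(solid)                ///
    || lowess mpg f_length, lp(dash)                 ///
    ||, legend(order(1 "weight" 2 "length"))         ///
    ytitle("mpg") xtitle("fraction of data")
```

Each smooth is drawn against the cumulative fraction of its own predictor,
so the two curves share the same horizontal scale.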
Nick
n.j.cox@durham.ac.uk
Nick Cox
> I think I understand this, if for "kernel density
> function" one can substitute "kernel-like smoothing".
>
> The idea seems to go back to Prasanta C. Mahalanobis circa
> 1960. There is a nice example at
>
> http://www.pnas.org/cgi/reprint/0509842103v1.pdf
>
> in which the term "kernel density function" is also
> apparently abused, even though the author is a member of the
> U.S. National Academy of Sciences.
>
> There are lots of slightly different recipes here.
> The easiest way of getting the smooths in Stata
> is probably through -lowess-. (-locpoly- (for which
> use -search- or -findit-) is not much more difficult.)
>
> For fractiles, I wire in (rank - 0.5) / #ranks, but
> see e.g. http://www.stata.com/support/faqs/stat/pcrank.html
> for excruciatingly pedantic discussion of alternatives.
>
> Here is a fudge-kludge for experimentation:
>
> --------------------------------------- fractileplot.ado
> *! NJC 1.0.0 4 Oct 2006
> program fractileplot
> version 8
> syntax varlist(numeric) [if] [in] [, lowess(str asis) *]
>
> marksample touse
> qui count if `touse'
> if r(N) == 0 error 2000
>
> local n = r(N)
> tokenize `varlist'
> local y "`1'"
> local Y : variable label `y'
> if `"`Y'"' == "" local Y "`y'"
> mac shift
> local x "`*'"
> local menu "solid dash dot dash_dot shortdash"
> local menu "`menu' shortdash_dot longdash longdash_dot"
> tokenize `menu'
>
> local j = 1
> qui foreach v of local x {
> tempvar f
> egen `f' = rank(`v') if `touse'
> replace `f' = (`f' - 0.5) / `n'
> * cycle through the 8 line patterns in menu
> local J = mod(`j' - 1, 8) + 1
> local call "`call' lowess `y' `f', lp(``J'') `lowess' ||"
> local V : variable label `v'
> if `"`V'"' == "" local V "`v'"
> local order `order' `j' `"`V'"'
> local ++j
> }
>
> twoway `call', ///
> legend(order(`order')) ytitle(`"`Y'"') ///
> xtitle(fraction of data) `options'
> end
> ----------------------------------------- cut here
>
>
> sysuse auto, clear
>
> after which
>
> fractileplot mpg weight length displacement, yla(, ang(h))
>
> works quite (vlw: American sense, not British) nicely.
>
> fractileplot mpg weight length displacement, lowess(bw(0.2)) yla(, ang(h))
>
> shows how you can tune the smoothing, in this case to ill effect.
>
> The syntax is regression-like: the first variable is the response;
> others are predictors, and each relationship is smoothed and
> shown in relation not to values of predictor, but to cumulative
> probabilities. Thus different bivariate relationships can be
> shown on the same scale.
Min
> > I want a fractile plot over two variables. I would like to
> > see my primary variable on Y-axis and fractile on the X-axis,
> > and a line in the graph to fit a kernel density function line
> > representing the bivariate relationship between Y and my
> > other variable.
> > Clear enough?