Re: st: Multivariate kernel regression

From   Josh Hyman
Subject   Re: st: Multivariate kernel regression
Date   Fri, 19 Oct 2012 12:20:41 -0400

Thank you so much Austin and Shan.

Shan - I very much appreciate your pointing out the .ado files on
Manski's webpage, in particular kernreg.ado and gridgen.ado . These
will be a great place for me to start, and seem to be very
similar to what Austin recommended I try starting with. Ideally I
would like to use slightly more than 4 covariates, but this is
terrific for now, and I will see if I can augment the code to accept a
few more.

Austin - Thanks a lot for your suggestions. I met with John DiNardo
recently about this project, but haven't asked him about the
multivariate kernel regression. I sent him an email yesterday to see
if he will discuss this with me. I will begin by coding up your
suggestion to help me understand. Your explanation was very helpful
for me in understanding how the multivariate kernel regression is

Thanks again to you both! This was my first time posting a question to
the Stata listserve, and I found it incredibly helpful.
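
Austin's suggestion quoted below (take 100 values of a one-dimensional X and compute the mean of Y at 10 points in a couple of different ways) can be sketched roughly as follows. This is only an illustrative sketch in Python, not code from the thread: the function names, the triangle kernel, the bandwidth `h`, and the simulated data are all assumptions. `local_constant` is the kernel-weighted mean (a degree-0 local polynomial), and `local_linear` is a weighted regression on X in deviation form from X0, whose intercept is the prediction at X = X0.

```python
import math
import random

def triangle(u):
    """Triangle kernel: weight falls linearly to zero at |u| = 1."""
    return max(0.0, 1.0 - abs(u))

def local_constant(x, y, x0, h):
    """Kernel-weighted mean of y near x0 (regression on a constant only)."""
    w = [triangle((xi - x0) / h) for xi in x]
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def local_linear(x, y, x0, h):
    """Weighted least squares of y on (x - x0); the intercept is the fit at x0."""
    w = [triangle((xi - x0) / h) for xi in x]
    d = [xi - x0 for xi in x]                      # X in deviation form from x0
    s0 = sum(w)
    s1 = sum(wi * di for wi, di in zip(w, d))
    s2 = sum(wi * di * di for wi, di in zip(w, d))
    t0 = sum(wi * yi for wi, yi in zip(w, y))
    t1 = sum(wi * di * yi for wi, di, yi in zip(w, d, y))
    # Solve the 2x2 weighted normal equations for the intercept.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)

# Hypothetical data: 100 evenly spaced X values and a noisy smooth Y.
random.seed(1)
x = [i / 99 for i in range(100)]
y = [math.sin(3 * xi) + random.gauss(0, 0.1) for xi in x]
h = 0.15                                           # assumed bandwidth

# Compare the two estimates of E[Y | X = x0] at 10 points.
for j in range(10):
    x0 = j / 9
    print(f"{x0:.2f}  {local_constant(x, y, x0, h):.3f}  {local_linear(x, y, x0, h):.3f}")
```

The two columns tend to differ most at the endpoints, where the kernel window is one-sided and the plain weighted mean cannot adjust for the slope of Y within the window.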

On Wed, Oct 17, 2012 at 2:25 PM, Austin Nichols wrote:
> Josh Hyman:
> Taking the mean of Y for values of X near X0 *is* a regression; you
> are calculating the conditional mean of Y. What you describe is a
> zero-degree local polynomial regression in -lpoly- (a regression on
> just a constant), which is inadvisable (though -lpoly- default
> behavior) for the reasons given in the -lpoly- manual entry. Better to
> regress on X and interactions (all in deviation form from point X0)
> and predict at X=X0.  I recommend you start with a simple example with
> say 100 values of a one-dimensional X and try calculating the means of
> Y at (say) 10 values using a couple different approaches, to get a
> sense of what you are doing.  Then generalize to 100*100 values of X1
> and X2 and calculate mean Y at (say) 100 points on that grid.
> Did you look at
> (multivariate kernel density estimation)?
> Ask John DiNardo if you have conceptual questions--if he is currently
> accessible to you at the Ford school--the big ideas may be easier to
> explain in person.
> On Wed, Oct 17, 2012 at 1:04 PM, Josh Hyman wrote:
> > Hi Austin (and others),
> >
> > Thank you very much for your reply. Sorry about my delayed response -
> > I wanted to investigate more to make sure I understood your
> > suggestion.
> >
> > I'm not sure your suggestion gets me exactly what I was looking for,
> > and I want to clarify. My reference to -lpoly- in my initial post may
> > have been confusing. I don't actually want to do kernel-weighted local
> > regressions. I want to estimate "multivariate kernel regression",
> > which to my understanding, doesn't actually involve any regressions at
> > all. It takes the weighted average of Y for all observations near to
> > the particular value of X, weighted using the kernel function. And
> > where X represents more than 2 variables. So, this actually seems the
> > same to me as multivariate kernel density estimation, which I also
> > don't see any user-written commands for in Stata. What I am looking
> > for, I guess is like a version of -kdens2- that allows for more than
> > one "xvar", and wouldn't output a graph (since it would be in greater
> > than 3 dimensions), but rather would output the fitted or predicted
> > values of the Y (like -predict, xb-) for each observation.
> >
> > Regardless, it sounds like given your suggestion, one way to do this
> > is to loop over all possible combinations of the values of the X
> > variables and calculate the weighted Y for each combination using the
> > kernel of my choice? Please let me know if this would be your
> > suggestion, or whether, given my further clarification, you know of any
> > user-written commands in Stata to do this, or have any other
> > suggestions.
> >
> > Thanks a lot for your help, and sorry again for the delayed response.
> > Josh
> >
> >
> > On Fri, Oct 12, 2012 at 3:31 PM, Austin Nichols wrote:
> >> Josh Hyman:
> >> If you know the multivariate kernel you want to use, and the grid you
> >> want to smooth over, it is straightforward to loop over the grid and
> >> compute the regressions.  To program a general estimator for a wide
> >> class of kernels would be substantially more work.  See e.g. -kdens-
> >> on SSC and
> >>
> >>
> >>
> >> A simple conic (triangle) kernel in 2 dimensions is easiest, see e.g.
> >>
> >>
> >> On Fri, Oct 12, 2012 at 1:49 PM, Josh Hyman wrote:
> >>> Dear Statalist users,
> >>>
> >>> I am trying to figure out if there is a way in Stata to perform
> >>> multivariate kernel regression. I have investigated online and on the
> >>> Statalist, but with no success. What I am looking for would be similar
> >>> conceptually to the -lpoly- command, but with the ability to enter more
> >>> than one "xvar".
> >>>
> >>> If there are no Stata commands to do this (user-written or otherwise), then
> >>> do you recommend coding up a program to do this manually? I have used Stata
> >>> for many years, and written programs before, but have never had to code up
> >>> a regression manually. If you have suggestions on how to do this, or
> >>> resources to consult, that would be greatly appreciated.
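
The approach discussed in this thread (loop over a grid of X values and take a kernel-weighted average of Y at each point) might look roughly like the sketch below for several covariates. It is written in Python purely for illustration and is not code from the thread or from any Stata command: the names `cone_weight` and `kernel_mean`, the bandwidths, and the simulated data are all hypothetical. The kernel is one reading of the "conic (triangle) kernel" mentioned above, namely a weight that declines linearly with scaled Euclidean distance and reaches zero at distance 1.

```python
import math
import random

def cone_weight(xi, x0, h):
    """Conic kernel: 1 - (scaled Euclidean distance), floored at zero."""
    dist = math.sqrt(sum(((a - b) / hk) ** 2 for a, b, hk in zip(xi, x0, h)))
    return max(0.0, 1.0 - dist)

def kernel_mean(X, y, x0, h):
    """Kernel-weighted average of y around x0; None if no observation gets weight."""
    num = den = 0.0
    for xi, yi in zip(X, y):
        w = cone_weight(xi, x0, h)
        num += w * yi
        den += w
    return num / den if den > 0 else None

# Hypothetical data: 400 observations on two covariates in the unit square.
random.seed(2)
X = [(random.random(), random.random()) for _ in range(400)]
y = [a + b + random.gauss(0, 0.1) for a, b in X]
h = (0.25, 0.25)                                   # assumed bandwidth per covariate

# Smooth over a 5 x 5 grid of (X1, X2) values.
for g1 in range(5):
    for g2 in range(5):
        x0 = (g1 / 4, g2 / 4)
        print(x0, kernel_mean(X, y, x0, h))

# Fitted value at each observation's own X, analogous to a predicted value.
fitted = [kernel_mean(X, y, xi, h) for xi in X]
```

Adding a covariate only means lengthening each tuple in `X` and the bandwidth tuple `h`. The number of grid points grows multiplicatively with each added dimension, and fewer observations fall inside any one window, so evaluating at the observed X values rather than on a full grid may be more practical with more than a few covariates.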