Re: st: Multivariate kernel regression
Austin Nichols <firstname.lastname@example.org>
Fri, 19 Oct 2012 18:03:48 -0400
Josh Hyman <email@example.com>:
Just looked at
briefly, but it does not seem to have been written by someone well
versed in Stata programming.
It seems to compute the univariate ROT bandwidth for kernel density
estimates (see p.892 of the Stata manual entry on -kdensity- for the
same formula), not conditional mean (polynomial order zero) estimates
(see p.1009 of the Stata manual entry on -lpoly- for the very
different formula), in each dimension completely separately, which
seems like a terrible idea. You would be better off computing
Mahalanobis distance and using a conic kernel, then doing some kind of
cross validation to get a good bandwidth. Plus, that kernreg.ado just
computes zero-order polynomial regressions, so you are much better off
writing your own program that estimates a linear surface (hyperplane)
at each point.
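
A minimal sketch of the approach suggested above, written in Python rather than Stata purely as a language-neutral illustration and not taken from any existing package. It assumes exactly two covariates, uses the sample covariance for the Mahalanobis distance, weights observations with a conic (triangle) kernel, fits a weighted plane in deviations from the evaluation point, and scores a bandwidth by leave-one-out cross validation. The names make_mahalanobis, solve, local_fit, and loo_cv are hypothetical, and the bandwidth h is measured in Mahalanobis units.

```python
import math
import random


def make_mahalanobis(X):
    """Build a Mahalanobis distance function from the sample covariance of 2-D points X."""
    n = len(X)
    m1 = sum(p[0] for p in X) / n
    m2 = sum(p[1] for p in X) / n
    s11 = sum((p[0] - m1) ** 2 for p in X) / (n - 1)
    s22 = sum((p[1] - m2) ** 2 for p in X) / (n - 1)
    s12 = sum((p[0] - m1) * (p[1] - m2) for p in X) / (n - 1)
    det = s11 * s22 - s12 ** 2
    a, b, c = s22 / det, -s12 / det, s11 / det  # entries of the inverse covariance

    def dist(p, q):
        u, v = p[0] - q[0], p[1] - q[1]
        return math.sqrt(max(0.0, a * u * u + 2 * b * u * v + c * v * v))

    return dist


def solve(A, b):
    """Solve a small system A z = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular (too few points in the window).
    """
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z


def local_fit(X, y, x0, h, dist, degree=1):
    """Kernel-weighted estimate of E[y | x = x0].

    degree=0 is the weighted mean (zero-order fit); degree=1 fits a plane.
    """
    # Conic (triangle) kernel: weight falls linearly to zero at distance h.
    w = [max(0.0, 1.0 - dist(p, x0) / h) for p in X]
    if sum(w) == 0.0:
        raise ValueError("no observations within bandwidth h of x0")
    if degree == 0:
        return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Regress y on a constant and the covariates in deviation form from x0;
    # the fitted intercept is then the prediction at x0.
    Z = [(1.0, p[0] - x0[0], p[1] - x0[1]) for p in X]
    A = [[sum(wi * z[j] * z[k] for wi, z in zip(w, Z)) for k in range(3)] for j in range(3)]
    b = [sum(wi * z[j] * yi for wi, z, yi in zip(w, Z, y)) for j in range(3)]
    return solve(A, b)[0]


def loo_cv(X, y, h, dist):
    """Mean squared leave-one-out prediction error of the local linear fit for bandwidth h."""
    sse = 0.0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        sse += (y[i] - local_fit(rest_X, rest_y, X[i], h, dist)) ** 2
    return sse / len(X)


if __name__ == "__main__":
    # Simulated data (an assumption for illustration only).
    random.seed(2012)
    X = [(random.random(), random.random()) for _ in range(150)]
    y = [math.sin(3 * p[0]) + p[1] ** 2 + random.gauss(0, 0.2) for p in X]
    dist = make_mahalanobis(X)
    for h in (2.0, 3.0, 4.0):
        print(h, round(loo_cv(X, y, h, dist), 4))
```

The candidate bandwidths in the demo are arbitrary; in practice one would search a finer grid and keep the h with the smallest cross-validation score. Passing degree=0 gives the zero-order (weighted mean) estimate for comparison, which is visibly biased near the edges of the data even when the true surface is an exact plane.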
On Fri, Oct 19, 2012 at 12:20 PM, Josh Hyman <firstname.lastname@example.org> wrote:
> Thank you so much Austin and Shan.
> Shan - I very much appreciate your pointing out the .ado files on
> Manski's webpage, in particular kernreg.ado and gridgen.ado . These
> will be a great place for me to start, and seem to be very
> similar to what Austin recommended I try starting with. Ideally I
> would like to use slightly more than 4 covariates, but this is
> terrific for now, and I will see if I can augment the code to accept a
> few more.
> Austin - Thanks a lot for your suggestions. I met with John DiNardo
> recently about this project, but haven't asked him about the
> multivariate kernel regression. I sent him an email yesterday to see
> if he will discuss this with me. I will begin by coding up your
> suggestion to help me understand. Your explanation was very helpful
> for me in understanding how the multivariate kernel regression is
> Thanks again to you both! This was my first time posting a question to
> the Stata listserve, and I found it incredibly helpful.
> On Wed, Oct 17, 2012 at 2:25 PM, Austin Nichols <email@example.com> wrote:
>> Josh Hyman <firstname.lastname@example.org>:
>> Taking the mean of Y for values of X near X0 *is* a regression; you
>> are calculating the conditional mean of Y. What you describe is a
>> zero-degree local polynomial regression in -lpoly- (a regression on
>> just a constant), which is inadvisable (though -lpoly- default
>> behavior) for the reasons given in the -lpoly- manual entry. Better to
>> regress on X and interactions (all in deviation form from point X0)
>> and predict at X=X0. I recommend you start with a simple example with
>> say 100 values of a one-dimensional X and try calculating the means of
>> Y at (say) 10 values using a couple different approaches, to get a
>> sense of what you are doing. Then generalize to 100*100 values of X1
>> and X2 and calculate mean Y at (say) 100 points on that grid.
>> Did you look at http://fmwww.bc.edu/repec/bocode/t/tddens
>> (multivariate kernel density estimation)?
>> Ask John DiNardo if you have conceptual questions--if he is currently
>> accessible to you at the Ford school--the big ideas may be easier to
>> explain in person.
>> On Wed, Oct 17, 2012 at 1:04 PM, Josh Hyman <email@example.com> wrote:
>> > Hi Austin (and others),
>> > Thank you very much for your reply. Sorry about my delayed response -
>> > I wanted to investigate more to make sure I understood your
>> > suggestion.
>> > I'm not sure your suggestion gets me exactly what I was looking for,
>> > and I want to clarify. My reference to -lpoly- in my initial post may
>> > have been confusing. I don't actually want to do kernel-weighted local
>> > regressions. I want to estimate "multivariate kernel regression",
>> > which to my understanding, doesn't actually involve any regressions at
>> > all. It takes the weighted average of Y for all observations near to
>> > the particular value of X, weighted using the kernel function. And
>> > where X represents more than 2 variables. So, this actually seems the
>> > same to me as multivariate kernel density estimation, which I also
>> > don't see any user-written commands for in Stata. What I am looking
>> > for, I guess is like a version of -kdens2- that allows for more than
>> > one "xvar", and wouldn't output a graph (since it would be in greater
>> > than 3 dimensions), but rather would output the fitted or predicted
>> > values of the Y (like -predict, xb-) for each observation.
>> > Regardless, it sounds like given your suggestion, one way to do this
>> > is to loop over all possible combinations of the values of the X
>> > variables and calculate the weighted Y for each combination using the
>> > kernel of my choice? Please let me know if this would be your
>> > suggestion, or if given my further clarification, if you know of any
>> > user-written commands in Stata to do this, or if you have any other
>> > suggestions.
>> > Thanks a lot for your help, and sorry again for the delayed response.
>> > Josh
>> > On Fri, Oct 12, 2012 at 3:31 PM, Austin Nichols <firstname.lastname@example.org> wrote:
>> >> Josh Hyman <email@example.com>:
>> >> If you know the multivariate kernel you want to use, and the grid you
>> >> want to smooth over, it is straightforward to loop over the grid and
>> >> compute the regressions. To program a general estimator for a wide
>> >> class of kernels would be substantially more work. See e.g. -kdens-
>> >> on SSC and
>> >> http://fmwww.bc.edu/repec/bocode/m/mf_mm_kern
>> >> http://fmwww.bc.edu/RePEc/bocode/k/kdens.pdf
>> >> A simple conic (triangle) kernel in 2 dimensions is easiest, see e.g.
>> >> http://fmwww.bc.edu/repec/bocode/t/tddens
>> >> On Fri, Oct 12, 2012 at 1:49 PM, Josh Hyman <firstname.lastname@example.org> wrote:
>> >>> Dear Statalist users,
>> >>> I am trying to figure out if there is a way in Stata to perform
>> >>> multivariate kernel regression. I have investigated online and on the
>> >>> Statalist, but with no success. What I am looking for would be similar
>> >>> conceptually to the -lpoly- command, but with the ability to enter more
>> >>> than one "xvar".
>> >>> If there are no Stata commands to do this (user-written or otherwise), then
>> >>> do you recommend coding up a program to do this manually? I have used Stata
>> >>> for many years, and written programs before, but have never had to code up
>> >>> a regression manually. If you have suggestions on how to do this, or
>> >>> resources to consult, that would be greatly appreciated.