
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Get fitted values after locpoly (follow-up)


From   Partho Sarkar <[email protected]>
To   [email protected]
Subject   Re: st: Get fitted values after locpoly (follow-up)
Date   Wed, 21 Sep 2011 20:55:04 +0530

Tania

I think I see where you are coming from, and so just a quick pointer:

 You are probably thinking in terms of "kernel regression" (or local
polynomial regression) as usually understood in the machine learning
literature, in which the bandwidth is *optimally* selected (or
"tuned") from an available "training set" or "memory set" of (xi,yi)
points, and *this bandwidth, together with the training set data*, can
then be used to "predict" the y0 value at some previously unseen "query"
point x0 outside the training set.  [In a sense, you could say that
the training set together with the bandwidth constitutes the "model".]
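
To make that idea concrete, here is a minimal sketch in Python (standard
library only) of a local linear fit whose bandwidth is tuned on the
training set and then used to predict at a query point.  The function
names, the Gaussian kernel, the leave-one-out criterion, the candidate
bandwidths and the toy data are all assumptions chosen for illustration;
none of this is what locpoly does internally.

```python
import math


def gaussian_kernel(u):
    """Unnormalised Gaussian kernel weight (an illustrative choice)."""
    return math.exp(-0.5 * u * u)


def local_linear_predict(x_train, y_train, x0, bandwidth):
    """Predict y at x0 from the training points and a fixed bandwidth.

    Solves the kernel-weighted least-squares line centred at x0 in
    closed form and returns its intercept, i.e. the fitted value at x0.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, yi in zip(x_train, y_train):
        d = xi - x0
        w = gaussian_kernel(d / bandwidth)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    if s0 == 0.0:
        raise ValueError("no training point has positive weight at x0")
    denom = s0 * s2 - s1 * s1
    if abs(denom) < 1e-12:
        # Too little spread to fit a slope: fall back to a weighted mean.
        return t0 / s0
    return (s2 * t0 - s1 * t1) / denom


def select_bandwidth(x_train, y_train, candidates):
    """Pick the candidate with the smallest leave-one-out squared error."""
    best_h, best_err = None, float("inf")
    for h in candidates:
        err = 0.0
        for i in range(len(x_train)):
            x_rest = x_train[:i] + x_train[i + 1:]
            y_rest = y_train[:i] + y_train[i + 1:]
            pred = local_linear_predict(x_rest, y_rest, x_train[i], h)
            err += (y_train[i] - pred) ** 2
        if err < best_err:
            best_h, best_err = h, err
    return best_h


# Hypothetical toy data: the "training set" of (xi, yi) points.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y_train = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]

# The tuned bandwidth plus the training data act as the "model".
h = select_bandwidth(x_train, y_train, [0.5, 1.0, 2.0])

# Predict at a query point x0 that is not in the training set.
y0 = local_linear_predict(x_train, y_train, 2.5, h)
```

Note that there are no stored coefficients here: predicting at a new x0
means going back to the training data with the chosen bandwidth.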

But this is clearly not how locpoly is set up.  The bandwidth is
fixed, either by default or by your choice.  And I am not sure, having
only tried a canned example with the program once very briefly, if
there is any scope to meaningfully partition the data into training
and query sets, as I think you might have in mind.  The user interface
certainly does not *explicitly* give the user such a choice. [But this
can be clarified by those more familiar with this command.]  There may
possibly be a roundabout way to get an approximation to what I
think you have in mind. But if I wanted to do the kind of kernel
regression I mention above, I would (without knowing what other Stata
programs may be available for this) go to R's CRAN archives.  I worked
on this a few years ago, so let me know and I could try to dig up
some of the sources, or just search CRAN.
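
One roundabout approximation of the kind alluded to above might be to
smooth on a coarse grid and then linearly interpolate the smoothed
values back to every observation.  The sketch below, in Python with
only the standard library, shows just the interpolation step.  The
function name, the clamping at the ends of the grid and the toy numbers
are assumptions for illustration, not a feature of locpoly.

```python
from bisect import bisect_right


def interpolate_from_grid(grid_x, grid_y, x):
    """Linearly interpolate a smoothed curve, known on a grid, at x.

    grid_x must be strictly increasing.  Values of x outside the grid
    are clamped to the nearest end point (an illustrative choice).
    """
    if x <= grid_x[0]:
        return grid_y[0]
    if x >= grid_x[-1]:
        return grid_y[-1]
    j = bisect_right(grid_x, x)          # grid_x[j-1] <= x < grid_x[j]
    x_lo, x_hi = grid_x[j - 1], grid_x[j]
    frac = (x - x_lo) / (x_hi - x_lo)
    return grid_y[j - 1] + frac * (grid_y[j] - grid_y[j - 1])


# Hypothetical smoothed values on a coarse grid of three points.
grid_x = [0.0, 1.0, 2.0]
grid_y = [0.0, 10.0, 20.0]

# Hypothetical observed x values, one per observation in the data.
observations = [0.25, 0.5, 1.0, 1.75]
fitted = [interpolate_from_grid(grid_x, grid_y, x) for x in observations]
```

How good the approximation is depends on how fine the grid is relative
to the curvature of the smooth.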

Hope this helps

Partho



On Wed, Sep 21, 2011 at 4:28 PM, Tania Treibich
<[email protected]> wrote:
> Dear Stata List users
>
> I could get fitted values for my kernel regression using the at()
> option of lpoly instead of the n() option:
>
> locpoly inv_rate l_kap, at(l_kap) generate (yfitted) degree(3)
> width(1.5) noscatter
>
> This indeed computes the smoothing and creates the fitted value
> yfitted for all the values of l_kap. However,  it gives too much
> weight to outliers.
>
> Instead, I would like the kernel regression to be computed only on a
> limited number of points (as in the option n(50) ) BUT get the fitted
> (approximated) value for ALL my observations.
>
> Thanks again for your help!!
>
> Tania
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


