# RE: st: Plotting a Local Polynomial Regression with CIs Accounting for Clustering

From: "Nick Cox" | Subject: RE: st: Plotting a Local Polynomial Regression with CIs Accounting for Clustering | Date: Tue, 1 Dec 2009 12:35:48 -0000

```
I don't get a good sense of the statistical strategy here. You mix
statistical (and perhaps scientific) judgment in choosing what is
qualitatively correct and a wish to quantify that exactly with
confidence intervals that pay attention to clustering. The latter
exactness seems dubious in light of the former inexactness. I wonder how
that all is to be reported. I suppose we all do something like this much
of the time, however.

It seems that there is only one predictor. Given that, a scatter plot
with data and smooth goes most of the way to conveying variability
around the smooth.

In addition, I don't see why you couldn't just use -mkspline- to get
cubic splines and then use -regress- directly on the created variables.
-mkspline- allows only frequency weights, but as long as you use
-regress- with whatever weights you wish, you should get something like
what you want.

Nick
n.j.cox@durham.ac.uk

L S

I've been playing around with the fracpoly graphs for a couple days
now.  Compared to the local polynomial regression lines, they do not
look quite right.  The main thing is that the picture will often
depend fairly strongly on the number of degrees for the
fractional polynomial.  If you specify a number too small, the graph
will appear oversmoothed.  If you specify a number of degrees too
large, then the 95% CIs will often get very large.

fracpoly reg y x, cluster(id) degree(2)
fracplot, msymbol(none) addplot((function y=x))
fracpoly reg y x, cluster(id) degree(6)
fracplot, msymbol(none) addplot((function y=x))

In the toy data above this is not so bad, but it is more of an issue
with the real data.

I realize that choosing the degrees is a necessary choice.  It seems
though that lpoly (local polynomial) regression produces a graph for
my data that seems more reasonable.

Thus, though I said I was flexible with respect to which form of
nonparametric regression is used, I was wondering if there might be a
way to return to local polynomial regression or perhaps
another form of nonparametric regression (besides fracpoly) that will
allow me to plot 95% CIs accounting for clustering, e.g. something
like

twoway (lpolyci y x, cluster(id)) (line x x)

```
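The -mkspline- plus -regress- suggestion in the reply could be sketched roughly as follows. This is only an illustration under stated assumptions, not a worked answer from the thread: the variable names `y`, `x`, and `id` are taken from the quoted commands, the choice of 5 knots is arbitrary, and `xs`, `yhat`, `se`, `lo`, and `hi` are hypothetical names introduced here. Note that `mkspline` with the `cubic` option creates a restricted cubic spline basis, so this is a parametric spline fit rather than a local polynomial smooth, and the number of knots still has to be chosen by judgment.

```stata
* Restricted cubic spline basis for x (5 knots is an assumption)
mkspline xs = x, cubic nknots(5)

* Linear regression on the spline terms with cluster-robust standard errors
regress y xs*, vce(cluster id)

* Fitted values and the standard error of the linear prediction
predict yhat, xb
predict se, stdp

* Pointwise 95% limits using the t distribution on e(df_r) degrees of freedom
generate lo = yhat - invttail(e(df_r), 0.025) * se
generate hi = yhat + invttail(e(df_r), 0.025) * se

* Plot the band, the fitted curve, and the y = x reference line
sort x
twoway (rarea lo hi x, color(gs13)) (line yhat x) (line x x, lpattern(dash))
```

The band here is pointwise, not simultaneous, and it reflects only the clustered sampling variability of the spline coefficients given the chosen knots; it does not account for uncertainty in choosing the number of knots.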