
# Re: st: Query about bootstrapping and R-squared

From: Nick Cox
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Query about bootstrapping and R-squared
Date: Wed, 7 Dec 2011 09:56:55 +0000

```
As you say, bootstrap samples can be expected to differ in R^2 and
-bootstrap- is not in denial on this.

You can -bootstrap- R^2 by

bootstrap e(r2) , reps(10000) : regress x y z

which will give you a standard error and confidence interval. I'm not
clear how this would provide a compensation for overfitting. (100
reps these days is measly.)

I'd argue back against the reviewers. What do they recommend instead?
One-predictor models? Zero-predictor models? Of course, anyone can
agree that a bigger sample size would be better.

Nick

On Wed, Dec 7, 2011 at 9:35 AM, Nick Riches <nick.riches@newcastle.ac.uk> wrote:

> This may be more of a general stats query than a Stata query, but any pointers would be most appreciated.
>
> I'm running a commonality analysis (identifying unique and shared contributions to R-squared). The reviewers have commented on the lack of power / overfitting (2 predictors, 23 observations), so I thought I'd run a bootstrapped version to compensate for overfitting. The trouble is, the R-squared from the bootstrapped regression is identical to the R-squared in the non-bootstrapped regression. In addition, the value of R-squared is completely unaffected by the number of reps.
>
> I can't see why this is the case. Surely each time the program randomly resamples, a different regression line is drawn, parameters and residuals will vary and therefore R-squared will vary?
>
> P.S. Syntax = bootstrap, reps(100): regress x y z
>
>
> Dr Nick Riches
> Lecturer in Speech and Language Pathology
> Education Communication and Language Sciences (ECLS)
> King George VI building
> Queen Victoria Road
> Newcastle University
> NE1 7RU

```
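
A minimal sketch of the contrast discussed in the thread, offered as an illustration rather than as either poster's actual code. The original variables x, y and z are not available, so the shipped `auto` dataset and the variables `price`, `weight` and `length` stand in for them; the seed, the number of reps and the name `r2` are likewise arbitrary assumptions. With no expression list, -bootstrap- resamples the coefficients, and the R-squared shown in the output header is, as far as I understand, the one from the original-sample fit, which would explain why it never changes with the number of reps.

```stata
* Assumption: auto.dta stands in for the poster's data (x, y, z are unavailable)
sysuse auto, clear
set seed 12345

* No expression list: -bootstrap- resamples the coefficients (_b).
* The R-squared in the header comes from the fit to the original sample.
bootstrap, reps(1000): regress price weight length

* Naming e(r2) makes R-squared itself the bootstrapped statistic,
* so a bootstrap standard error and confidence interval are reported for it.
bootstrap r2=e(r2), reps(1000): regress price weight length

* Optional: percentile and bias-corrected intervals for the same statistic
estat bootstrap, percentile bc
```

The second call gives a measure of how variable R-squared is across resamples; as the reply points out, that describes uncertainty and does not by itself correct for overfitting.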