- Comparison of means
  - Cohen's *d*
  - Hedges's *g*
  - Glass's Δ
  - Point-biserial correlation
  - Estimated from data or published summary statistics
- Variance explained by regression and ANOVA
  - Eta-squared and partial eta-squared (η^{2})
  - Epsilon-squared and partial epsilon-squared (ε^{2})
  - Partial statistics estimated from data
  - Overall statistics from data or published summary statistics
- With confidence intervals

**esize**, **esizei**, and **estat esize** calculate measures of effect
size for (1) the difference between two means and (2) the
proportion of variance explained.

Say we have data on mothers and their infants' birthweights. We want to calculate the effect size on birthweight of smoking during pregnancy:

. esize twosample bwt, by(smoke) all

Effect size based on mean comparison

Obs per group: Nonsmoker = 115, Smoker = 74

| Effect size | Estimate | 95% CI lower | 95% CI upper |
|---|---|---|---|
| Cohen's d | .3938497 | .0985333 | .6881322 |
| Hedges's g | .3922677 | .0981375 | .685368 |
| Glass's Delta 1 | .3756723 | .0787487 | .6709925 |
| Glass's Delta 2 | .4283965 | .1267939 | .7272194 |
| Point-Biserial r | .1897497 | .0482935 | .3199182 |

We find that the difference in average birthweight is about 0.4 standard deviations.
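The point-biserial *r* in the last row can be recovered from Cohen's *d* and the two group sizes. The following is a minimal Python sketch, assuming the conversion r = d / sqrt(d² + N(N − 2)/(n1·n2)). The function name is hypothetical, and the formula is an assumption that reproduces the reported value, not a statement of how **esize** computes it internally.

```python
import math

def point_biserial_from_d(d, n1, n2):
    """Convert Cohen's d to a point-biserial correlation (assumed conversion)."""
    n = n1 + n2
    return d / math.sqrt(d**2 + n * (n - 2) / (n1 * n2))

# Cohen's d and the group sizes are copied from the esize output above.
r = point_biserial_from_d(0.3938497, 115, 74)
print(round(r, 7))
```

The result should agree with the reported point-biserial *r* of .1897497 to within rounding.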

We can reasonably assume birthweight is normally distributed; thus the reported confidence intervals are appropriate in this case.

In many cases, normality cannot reasonably be assumed. In such cases, we can obtain bootstrapped confidence intervals:

. bootstrap r(d) r(g), reps(200) nowarn seed(111): esize twosample bwt, by(smoke)

(running esize on estimation sample)

Bootstrap replications (200)

Bootstrap results: Number of obs = 189, Replications = 200

Command: esize twosample bwt, by(smoke)

_bs_1: r(d)

_bs_2: r(g)

| | Observed coefficient | Bootstrap std. err. | z | P>\|z\| | Normal-based 95% CI lower | Normal-based 95% CI upper |
|---|---|---|---|---|---|---|
| _bs_1 | .3938497 | .1391761 | 2.83 | 0.005 | .1210697 | .6666298 |
| _bs_2 | .3922677 | .138617 | 2.83 | 0.005 | .1205833 | .663952 |
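The normal-based interval in this table is the observed coefficient plus or minus a standard normal critical value times the bootstrap standard error. The following is a minimal Python sketch of that arithmetic, using the figures reported for `_bs_1`. The helper name `normal_ci` is hypothetical, and the sketch does not rerun the bootstrap itself.

```python
from statistics import NormalDist

def normal_ci(estimate, std_err, level=0.95):
    """Normal-based confidence interval: estimate +/- z_crit * std_err."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z_crit * std_err, estimate + z_crit * std_err

# Observed coefficient and bootstrap standard error reported for _bs_1 (Cohen's d).
estimate, std_err = 0.3938497, 0.1391761
lower, upper = normal_ci(estimate, std_err)
z = estimate / std_err  # z statistic reported in the same row

print(round(lower, 4), round(upper, 4), round(z, 2))
```

Because the standard error is itself rounded in the output, the recomputed limits can differ from the reported ones in the last displayed digit.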

When you have summary statistics but not the underlying data, as you might when reading a journal article, you can use the immediate command **esizei**. Suppose our birthweight example had been published. The hypothetical article reported that for the 115 mothers who did not smoke, the average birthweight was 3,054.957 grams (sd = 752.409) and that for the 74 smokers, the average was 2,772.297 grams (sd = 659.8075). We type

. esizei 115 3054.957 752.409 74 2772.297 659.807541

Effect size based on mean comparison

Obs per group: Group 1 = 115, Group 2 = 74

| Effect size | Estimate | 95% CI lower | 95% CI upper |
|---|---|---|---|
| Cohen's d | .3938508 | .0985343 | .6881333 |
| Hedges's g | .3922687 | .0981385 | .685369 |
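These point estimates can be checked by hand from the six summary statistics. The following is a minimal Python sketch, assuming the usual definitions: Cohen's *d* divides the mean difference by the pooled standard deviation, Hedges's *g* multiplies *d* by a gamma-function small-sample correction, and Glass's Δ divides by the standard deviation of one group only. The function names are hypothetical, and the exact correction used by **esizei** is an assumption here.

```python
import math

def cohen_d(n1, m1, s1, n2, m2, s2):
    """Mean difference divided by the pooled standard deviation."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)

def hedges_g(d, n1, n2):
    """Cohen's d times a gamma-function bias correction (assumed form)."""
    m = n1 + n2 - 2
    log_ratio = math.lgamma(m / 2) - math.lgamma((m - 1) / 2)
    return d * math.exp(log_ratio) / math.sqrt(m / 2)

def glass_delta(m1, m2, s_reference):
    """Mean difference divided by the SD of a single reference group."""
    return (m1 - m2) / s_reference

# Summary statistics quoted from the hypothetical article.
n1, m1, s1 = 115, 3054.957, 752.409   # nonsmokers
n2, m2, s2 = 74, 2772.297, 659.8075   # smokers

d = cohen_d(n1, m1, s1, n2, m2, s2)
g = hedges_g(d, n1, n2)
delta1 = glass_delta(m1, m2, s1)
delta2 = glass_delta(m1, m2, s2)
print(round(d, 6), round(g, 6), round(delta1, 6), round(delta2, 6))
```

The values should agree closely with the reported estimates. Small differences in the last digits are expected because the published means and standard deviations are rounded.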

We can use the **estat esize** postestimation command to calculate
effect sizes after fitting ANOVA models.

We fit a full factorial model of newborn birthweight on mother's smoking status and whether the mother saw a doctor during her first trimester:

. anova bwt smoke##drvisit

Number of obs = 189, R-squared = 0.0471, Root MSE = 717.382, Adj R-squared = 0.0317

| Source | Partial SS | df | MS | F | Prob>F |
|---|---|---|---|---|---|
| Model | 4707585.5 | 3 | 1569195.2 | 3.05 | 0.0299 |
| smoke | 3275249.7 | 1 | 3275249.7 | 6.36 | 0.0125 |
| drvisit | 612385.43 | 1 | 612385.43 | 1.19 | 0.2768 |
| smoke#drvisit | 248303.95 | 1 | 248303.95 | 0.48 | 0.4882 |
| Residual | 95207713 | 185 | 514636.29 | | |
| Total | 99915299 | 188 | 531464.35 | | |

We can obtain the proportion of variability explained (effect
sizes) measured by η^{2}, ε^{2}, or ω^{2}. Here is the
default η^{2} measure:

. estat esize

Effect sizes for linear models

| Source | Eta-squared | df | 95% CI lower | 95% CI upper |
|---|---|---|---|---|
| Model | .0471158 | 3 | . | .1062782 |
| smoke | .033257 | 1 | .0014433 | .0975557 |
| drvisit | .006391 | 1 | . | .0474531 |
| smoke#drvisit | .0026012 | 1 | . | .0361357 |

Reported are full and partial η^{2} values *along with their
confidence intervals*. We could have added the **epsilon** or **omega** option
to instead request the ε^{2} or ω^{2} measure.
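The η^{2} point estimates follow directly from the sums of squares in the ANOVA table: each is the term's partial sum of squares divided by that sum plus the residual sum of squares. The following is a minimal Python sketch of this arithmetic, with a hypothetical helper name. It covers only the point estimates, not the confidence intervals.

```python
def partial_eta_squared(ss_effect, ss_residual):
    """Share of (effect + residual) variability attributed to the effect."""
    return ss_effect / (ss_effect + ss_residual)

# Partial sums of squares copied from the anova output.
ss_residual = 95207713
partial_ss = {
    "Model": 4707585.5,
    "smoke": 3275249.7,
    "drvisit": 612385.43,
    "smoke#drvisit": 248303.95,
}

eta2 = {term: partial_eta_squared(ss, ss_residual) for term, ss in partial_ss.items()}
for term, value in eta2.items():
    print(f"{term:15s} {value:.7f}")
```

For the Model row, the effect and residual sums of squares add up to the total, so the value coincides with the R-squared of 0.0471.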

We can also use the **estat esize** postestimation command to calculate
effect sizes after fitting linear models.

We replace the statistically nonsignificant **drvisit** variable with the continuous
variable **age** and fit the model using linear regression:

. regress bwt smoke##c.age

| Source | SS | df | MS |
|---|---|---|---|
| Model | 6859112.22 | 3 | 2286370.74 |
| Residual | 93056186.4 | 185 | 503006.413 |
| Total | 99915298.6 | 188 | 531464.354 |

Number of obs = 189, F(3, 185) = 4.55, Prob > F = 0.0042, R-squared = 0.0686, Adj R-squared = 0.0535, Root MSE = 709.23

| bwt | Coefficient | Std. err. | t | P>\|t\| | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| smoke: Smoker | 797.9369 | 484.3249 | 1.65 | 0.101 | -157.5731 | 1753.447 |
| age | 27.60058 | 12.14868 | 2.27 | 0.024 | 3.632806 | 51.56835 |
| smoke#c.age: Smoker | -46.51558 | 20.44641 | -2.28 | 0.024 | -86.85368 | -6.177479 |
| _cons | 2408.383 | 292.1796 | 8.24 | 0.000 | 1831.951 | 2984.815 |

This time, we request the ω^{2} estimates of effect size:

. estat esize, omega

Effect sizes for linear models

| Source | Omega-squared | df |
|---|---|---|
| Model | .0532781 | 3 |
| smoke | .0090843 | 1 |
| age | -.0044019 | 1 |
| smoke#c.age | .0218418 | 1 |

Reported are full and partial ω^{2} values.

If we did not have the data to estimate this model but instead
found the regression fit published in a journal, we could still
estimate the overall η^{2}, ε^{2}, and ω^{2} from the model's
degrees of freedom and the summary statistic that F(3, 185) = 4.55.
We could type

. esizei 3 185 4.55

Effect sizes for linear models

| Effect size | Estimate | 95% CI lower | 95% CI upper |
|---|---|---|---|
| Eta-squared | .0687138 | .0079234 | .1364187 |
| Epsilon-squared | .0536119 | | |
| Omega-squared | .0533434 | | |

The ω^{2} agrees to three decimal places. Had we typed
4.5454107 rather than 4.55, we would have had full agreement to
the shown eight decimal places.
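The three overall measures depend only on the F statistic and its two degrees of freedom, which is why published summary statistics are enough. The following is a minimal Python sketch, assuming the standard textbook identities for η^{2}, ε^{2}, and ω^{2}. The function name is hypothetical, and the confidence interval for η^{2} is not reproduced.

```python
def effect_sizes_from_f(df_model, df_resid, f_stat):
    """Overall eta-, epsilon-, and omega-squared from an F statistic."""
    num = f_stat * df_model            # proportional to the model sum of squares
    adj = (f_stat - 1) * df_model      # same, less df_model times the residual MS
    eta2 = num / (num + df_resid)
    epsilon2 = adj / (num + df_resid)
    omega2 = adj / (num + df_resid + 1)
    return eta2, epsilon2, omega2

# Rounded F statistic as it would appear in a published table.
eta2, epsilon2, omega2 = effect_sizes_from_f(3, 185, 4.55)
print(round(eta2, 7), round(epsilon2, 7), round(omega2, 7))

# Unrounded F statistic mentioned in the text.
_, _, omega2_full = effect_sizes_from_f(3, 185, 4.5454107)
print(round(omega2_full, 7))
```

With the unrounded F statistic, the ω^{2} value should match the .0532781 that **estat esize, omega** reports for the Model row.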

See the manual entry.