Financial regression

Highlights

  • Capital asset pricing model (CAPM)

  • Single-factor or multifactor models

  • Gibbons–Ross–Shanken test

  • Fama–MacBeth procedure

  • Shanken correction for errors in variables


Want to explore how asset returns relate to the market or estimate the price of risk? With the new finregress command, you can fit a capital asset pricing model (CAPM) or a Fama–MacBeth regression. This feature is part of StataNow™.

One goal of financial statistics is to understand how asset returns are related to independent variables. For example, one might ask whether the returns on a collection of assets tend to move with or against the returns on the stock market as a whole. An investor may wish to know whether a particular asset is likely to move with the market or move against the market and act as a hedge. We can investigate these types of asset behaviors by using financial regression models.

Let's see it work

To demonstrate, we have data on the monthly stock prices of 25 fictional firms, the monthly price of a market index, and the annualized risk-free rate.

. webuse finex
(Fictional stock price data)

First, we use finreturns to transform the asset price data into returns. Prices tend to drift over time, but returns tend to be stationary. Hence, most financial regressions are performed on returns rather than prices themselves.

. finreturns acme-tks, nopreview log(lnr_) multiply(100)
(all returns multiplied by 100)

Log returns generated for variables acme, bat, iron, dune, tyr, glo, spa,
wgt, bar, yum, aaa, afh, ard, cph, das, dil, ege, epg, goa, jml, khc, krg,
kth, nhb, and tks.

. finreturns sp500, log(lnr_mkt) multiply(100)
(all returns multiplied by 100)

Log returns generated for variable sp500:

          datem   sp500      lnr_mkt
  1.     1955m1    35.6            .
  2.     1955m2   36.79    3.2880431
  3.     1955m3    36.5   -.79138085
  4.     1955m4   37.76    3.3938081
  5.     1955m5    37.6   -.42462909
  6.     1955m6   39.78    5.6360223
  7.     1955m7   42.69    7.0600427
  8.     1955m8   42.43   -.61090416
  9.     1955m9   44.34    4.4031545
 10.    1955m10   42.11   -5.1601962

The finreturns command created 25 new variables containing the log monthly returns for each of our 25 asset prices. The multiply() option scales the resulting log returns by the specified amount. When returns are small, log returns approximate simple returns, and multiplying by 100 allows us to interpret them as approximate percentage changes.
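The arithmetic behind a scaled log return can be sketched in a few lines of Python. This is only an illustration of the formula 100 × ln(p_t / p_{t−1}), not finreturns itself; the function name log_returns and its multiply argument are our own. The prices are the first four sp500 values from the listing above.

```python
import math

def log_returns(prices, multiply=1.0):
    """Return log returns ln(p_t / p_{t-1}) scaled by `multiply`.

    The first element is None because no prior price exists,
    mirroring the missing value (.) in the first row of the listing.
    """
    returns = [None]
    for prev, curr in zip(prices, prices[1:]):
        returns.append(multiply * math.log(curr / prev))
    return returns

# First four sp500 prices from the listing (1955m1 to 1955m4).
sp500 = [35.6, 36.79, 36.5, 37.76]
lnr_mkt = log_returns(sp500, multiply=100)

# Simple percentage return for 1955m2, for comparison with the log return.
simple_1955m2 = 100 * (sp500[1] / sp500[0] - 1)

print(lnr_mkt)
print(simple_1955m2)
```

The 1955m2 log return is about 3.29, while the simple return is about 3.34, which shows how closely the two agree when returns are small.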

Let's see how the returns on the first seven assets are related to the market return.

. finregress capm lnr_acme-lnr_spa = lnr_mkt

Capital asset pricing model

Sample: 1955m2 thru 2019m12                                Number of obs = 779

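------------------------------------------------------------------------------
             |               Robust
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
lnr_acme     |
     lnr_mkt |   .1282271   .0110202    11.64   0.000     .1066279    .1498264
       _cons |   .3697878   .0400891     9.22   0.000     .2912145    .4483611
-------------+----------------------------------------------------------------
lnr_bat      |
     lnr_mkt |   1.533844   .0114511   133.95   0.000       1.5114    1.556288
       _cons |  -.1308898   .0422271    -3.10   0.002    -.2136534   -.0481261
-------------+----------------------------------------------------------------
lnr_iron     |
     lnr_mkt |   1.876732   .0124702   150.50   0.000     1.852291    1.901173
       _cons |  -.2841646   .0460956    -6.16   0.000    -.3745103    -.193819
-------------+----------------------------------------------------------------
lnr_dune     |
     lnr_mkt |   1.884469   .0126562   148.90   0.000     1.859663    1.909275
       _cons |  -.2533514   .0470278    -5.39   0.000    -.3455242   -.1611786
-------------+----------------------------------------------------------------
lnr_tyr      |
     lnr_mkt |   1.087829   .0116772    93.16   0.000     1.064942    1.110716
       _cons |  -.0057813   .0443976    -0.13   0.896    -.0927989    .0812364
-------------+----------------------------------------------------------------
lnr_glo      |
     lnr_mkt |   .5175029   .0164595    31.44   0.000     .4852429    .5497629
       _cons |   .3950725   .0615303     6.42   0.000     .2744754    .5156696
-------------+----------------------------------------------------------------
lnr_spa      |
     lnr_mkt |  -.0973818   .0116994    -8.32   0.000    -.1203123   -.0744514
       _cons |   .4421334   .0403738    10.95   0.000     .3630021    .5212647
------------------------------------------------------------------------------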

The output has one block for each asset. The slope coefficients are just regression coefficients. For instance, when the log market return rises by one point (about one percentage point), the expected log return on acme tends to be up by 0.13 points (about 0.13 percentage points), whereas the expected log return on bat tends to be up by 1.53 points. The expected log returns on iron and dune are highly sensitive to the market, rising about 1.88 points whenever the log market return is up one point. Such stocks are said to be aggressive in that their returns swing more than one for one with the market. The expected log return for tyr, by contrast, moves almost one for one with the log return on the market as a whole. spa shows an altogether different pattern: its expected log return tends to fall modestly when the log market return is up, falling by about 0.10 whenever the log market return is up one point. Such an asset is said to be a hedge because it moves against the market and hence tends to pay out when the market is down.
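To make the reading of the slopes concrete, here is a minimal Python sketch of the point estimates in a one-regressor least-squares fit. It is not how finregress capm is implemented and it does not reproduce the robust standard errors; the function name and the data are hypothetical, with the data built so that one asset is aggressive (slope 1.5) and the other is a hedge (slope −0.1).

```python
def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = alpha + beta * x for one regressor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta

# Hypothetical market log returns in percent (not taken from finex).
market = [3.0, -1.0, 2.0, -4.0, 5.0]
# Hypothetical aggressive asset: moves 1.5 points per market point.
aggressive = [0.2 + 1.5 * m for m in market]
# Hypothetical hedge: moves against the market.
hedge = [0.4 - 0.1 * m for m in market]

alpha_a, beta_a = ols_slope_intercept(market, aggressive)
alpha_h, beta_h = ols_slope_intercept(market, hedge)
print(alpha_a, beta_a)
print(alpha_h, beta_h)
```

A slope above 1 marks an asset that swings more than one for one with the market, and a negative slope marks a hedge.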

Often, researchers and practitioners are interested in the returns an asset can provide over and above a risk-free rate. Incorporating a risk-free rate into the financial regression above is easy with the rfrate() option.

The fedfunds variable is an annualized risk-free rate. We first transform it to be on the same scale as the asset returns.

. generate rf = 100*ln(1+fedfunds/100)/12
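As a quick check on the scale of this transformation, the same formula can be evaluated in Python. The 3 percent annual rate is a hypothetical input, not a value from finex, and the function name is ours.

```python
import math

def monthly_log_rate(annual_percent):
    """Convert an annualized percentage rate to a monthly log rate in percent.

    Same arithmetic as the generate command: 100 * ln(1 + rate/100) / 12.
    """
    return 100 * math.log(1 + annual_percent / 100) / 12

# Hypothetical annualized rate of 3 percent.
rf_example = monthly_log_rate(3.0)
print(rf_example)
```

A 3 percent annual rate becomes roughly 0.25 percent per month in log terms, which is on the same scale as the monthly log returns multiplied by 100.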

Now we use rf as our risk-free rate, adjusting both the left-hand side and the right-hand side of the regression. We run adjusted regressions for the first seven assets.

. finregress capm lnr_acme-lnr_spa = lnr_mkt, rfrate(rf) adjust

Capital asset pricing model

Sample: 1955m2 thru 2019m12                                Number of obs = 779

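------------------------------------------------------------------------------
             |               Robust
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
lnr_acme     |
     lnr_mkt |   .1316993   .0102147    12.89   0.000      .111679    .1517197
       _cons |   .0322367   .0378174     0.85   0.394     -.041884    .1063575
-------------+----------------------------------------------------------------
lnr_bat      |
     lnr_mkt |   1.530731   .0114377   133.83   0.000     1.508314    1.553148
       _cons |   .0760026   .0416309     1.83   0.068    -.0055925    .1575977
-------------+----------------------------------------------------------------
lnr_iron     |
     lnr_mkt |   1.871524    .012273   152.49   0.000     1.847469    1.895579
       _cons |   .0556327   .0454103     1.23   0.221    -.0333698    .1446352
-------------+----------------------------------------------------------------
lnr_dune     |
     lnr_mkt |   1.878887   .0123726   151.86   0.000     1.854637    1.903137
       _cons |   .0895071    .046235     1.94   0.053    -.0011119    .1801261
-------------+----------------------------------------------------------------
lnr_tyr      |
     lnr_mkt |   1.087576   .0117232    92.77   0.000     1.064599    1.110553
       _cons |   .0282078   .0443328     0.64   0.525    -.0586828    .1150984
-------------+----------------------------------------------------------------
lnr_glo      |
     lnr_mkt |    .520314   .0162644    31.99   0.000     .4884364    .5521917
       _cons |   .2080802   .0603508     3.45   0.001     .0897947    .3263657
-------------+----------------------------------------------------------------
lnr_spa      |
     lnr_mkt |  -.0923812   .0104152    -8.87   0.000    -.1127946   -.0719678
       _cons |   .0171066   .0376464     0.45   0.650    -.0566789    .0908922
------------------------------------------------------------------------------
Notes: Dependent variables adjusted for risk-free rate rf.
       Independent variable lnr_mkt adjusted for risk-free rate rf.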

The dependent variables are now the log asset returns net of the risk-free rate, and the independent variable (or factor) is the log market return in excess of the risk-free rate. The slope coefficients report the response of excess log asset returns to movements in the excess log market return. The intercepts can now be interpreted as average excess log returns when the excess log market return is 0. Thus, these intercepts reflect the average excess log return on each asset that is not attributable to the market.
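In symbols, the adjusted regression for each asset can be written as below. The notation is our own shorthand rather than Stata's: r_{i,t} is the log return on asset i, r_{m,t} is the log market return (lnr_mkt), and r_{f,t} is the risk-free rate (rf).

```latex
r_{i,t} - r_{f,t} = \alpha_i + \beta_i \,\bigl(r_{m,t} - r_{f,t}\bigr) + \varepsilon_{i,t},
\qquad i = 1, \dots, 7
```

Here \beta_i corresponds to the reported coefficient on lnr_mkt and \alpha_i to the reported _cons for asset i.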

Under some asset pricing models, if the model is correctly specified, then all variation in excess returns is captured by the explanatory factors (the excess log market return in this case). The key implication of those models is that the intercepts in the regressions are jointly 0. estat grstest tests this implication; the test is known as the Gibbons–Ross–Shanken (1989) test.

. estat grstest, finite

Gibbons–Ross–Shanken test
H0: All intercept terms are zero

  No. of dependent vars. =      7
No. of independent vars. =      1
     No. of time periods =    779

               F(7, 771) =  2.118
                Prob > F = 0.0396

The null hypothesis is that all asset return regressions have intercepts of 0. The alternative is that at least one intercept is not 0. Either result is interesting. If we fail to reject the null hypothesis, then average returns are fully captured by the explanatory factors. If we reject the null, at least one asset delivers excess returns that are not explained by the factors. In the test above, we reject the null at the 0.05 significance level, so average returns are not fully explained by lnr_mkt.
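For reference, the finite-sample statistic is usually written in textbooks in the form below. We show it as a sketch of the standard formula, on the assumption that the finite option corresponds to this F version; the symbols are ours. T is the number of time periods, N the number of dependent variables, K the number of factors, \hat{\alpha} the vector of estimated intercepts, \hat{\Sigma} the estimated residual covariance matrix, \bar{f} the vector of factor means, and \hat{\Omega} the estimated factor covariance matrix.

```latex
\mathrm{GRS} = \frac{T - N - K}{N}\,
\frac{\hat{\alpha}^{\top}\hat{\Sigma}^{-1}\hat{\alpha}}
     {1 + \bar{f}^{\top}\hat{\Omega}^{-1}\bar{f}}
\;\sim\; F(N,\; T - N - K)
```

With T = 779, N = 7, and K = 1, the degrees of freedom are 7 and 779 − 7 − 1 = 771, which agrees with the F(7, 771) reported in the output.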

A multifactor model explains the movements of individual assets through multiple independent variables. Our fictional dataset includes the variable vol, a volatility index. We explore now whether asset returns are related to volatility movements.

. finregress capm lnr_acme-lnr_spa = lnr_mkt vol, rfrate(rf) adjust(lnr_mkt)

Capital asset pricing model

Sample: 1955m2 thru 2019m12                                Number of obs = 779

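------------------------------------------------------------------------------
             |               Robust
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
lnr_acme     |
     lnr_mkt |   .1483932   .0119328    12.44   0.000     .1250054     .171781
         vol |  -.0966061   .0356362    -2.71   0.007    -.1664517   -.0267604
       _cons |  -.0086329   .0419574    -0.21   0.837    -.0908679    .0736021
-------------+----------------------------------------------------------------
lnr_bat      |
     lnr_mkt |    1.52412   .0131144   116.22   0.000     1.498417    1.549824
         vol |   .0382555   .0400207     0.96   0.339    -.0401837    .1166947
       _cons |   .0921868   .0450832     2.04   0.041     .0038254    .1805482
-------------+----------------------------------------------------------------
lnr_iron     |
     lnr_mkt |   1.851205   .0145758   127.01   0.000     1.822636    1.879773
         vol |   .1175864   .0443886     2.65   0.008     .0305864    .2045865
       _cons |   .1053782   .0479793     2.20   0.028     .0113406    .1994159
-------------+----------------------------------------------------------------
lnr_dune     |
     lnr_mkt |   1.872238    .014166   132.16   0.000     1.844473    1.900003
         vol |   .0384778   .0399057     0.96   0.335     -.039736    .1166915
       _cons |   .1057853    .049148     2.15   0.031     .0094571    .2021135
-------------+----------------------------------------------------------------
lnr_tyr      |
     lnr_mkt |   1.083229   .0131535    82.35   0.000     1.057448    1.109009
         vol |   .0251582   .0392261     0.64   0.521    -.0517235    .1020398
       _cons |   .0388511   .0483987     0.80   0.422    -.0560086    .1337109
-------------+----------------------------------------------------------------
lnr_glo      |
     lnr_mkt |   .5050067   .0184809    27.33   0.000     .4687848    .5412287
         vol |   .0885819    .051256     1.73   0.084     -.011878    .1890419
       _cons |   .2455552   .0639172     3.84   0.000     .1202798    .3708307
-------------+----------------------------------------------------------------
lnr_spa      |
     lnr_mkt |  -.0898467   .0120459    -7.46   0.000    -.1134562   -.0662372
         vol |  -.0146668   .0342318    -0.43   0.668      -.08176    .0524264
       _cons |   .0109018   .0401828     0.27   0.786    -.0678551    .0896586
------------------------------------------------------------------------------
Notes: Dependent variables adjusted for risk-free rate rf.
       Independent variable lnr_mkt adjusted for risk-free rate rf.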

As before, the dependent variables are log returns in excess of the risk-free rate. We now specify adjust(lnr_mkt) to adjust only the log market return for the risk-free rate; the volatility variable remains unadjusted. The coefficients report the sensitivity of each asset's excess log return to each independent variable. Most asset returns respond only weakly to the volatility variable: only acme and iron have coefficients on vol that differ significantly from 0 at the 0.05 level.

We run estat grstest again to see if the addition of volatility as an independent variable has shrunk the intercept terms.

. estat grstest, finite

Gibbons–Ross–Shanken test
H0: All intercept terms are zero

  No. of dependent vars. =      7
No. of independent vars. =      2
     No. of time periods =    779

               F(7, 770) =  2.531
                Prob > F = 0.0141

There is still evidence of nonzero intercepts. The search for explanatory factors could continue.

What we have done so far is explore how asset returns respond to explanatory factors. Next we will look at how average returns vary with the degree of responsiveness.

The idea is to understand how average returns across assets vary or do not vary with riskiness, as measured by the regression coefficients estimated previously. Assets can have high or low coefficients. But is it true that the assets with high coefficients also tend to have higher average returns? If coefficients are a measure of risk, do risky assets pay off more on average?

The Fama–MacBeth procedure is one way to answer this question. It proceeds in two steps. First, we run a regression like the previous one: for each asset, we obtain a coefficient estimate on each factor that is a measure of risk. Second, we regress the cross-section of asset returns on the risk measures. We run this cross-sectional regression for each time period and average the resulting coefficients. The result is a single number for each factor, called the price of risk, which measures how average returns change when an asset's riskiness changes.
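The two steps can be sketched in Python for the single-factor case. This is a simplified illustration under our own assumptions, not the finregress fmb implementation: it handles one factor only, computes point estimates without standard errors or the Shanken correction, and uses hypothetical data and function names. The data are constructed without noise so the result is easy to check by hand.

```python
def ols(x, y):
    """OLS fit of y = a + b * x for one regressor; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

def fama_macbeth(returns, factor):
    """Two-pass estimate for one factor.

    returns: dict mapping asset name -> list of excess returns over time
    factor:  list of factor realizations for the same periods
    """
    names = list(returns)
    # Pass 1: a time-series regression per asset gives its beta.
    betas = {name: ols(factor, returns[name])[1] for name in names}
    beta_vec = [betas[name] for name in names]
    # Pass 2: one cross-sectional regression per period on the betas.
    intercepts, slopes = [], []
    for t in range(len(factor)):
        cross_section = [returns[name][t] for name in names]
        a_t, lam_t = ols(beta_vec, cross_section)
        intercepts.append(a_t)
        slopes.append(lam_t)
    # Average the period-by-period coefficients over time.
    periods = len(factor)
    return sum(intercepts) / periods, sum(slopes) / periods, betas

# Hypothetical factor realizations and noise-free asset returns.
factor = [2.0, -1.0, 3.0, 0.5]
true_betas = {"a": 0.5, "b": 1.0, "c": 1.5}
returns = {k: [0.1 + b * f for f in factor] for k, b in true_betas.items()}

intercept, price_of_risk, betas = fama_macbeth(returns, factor)
print(intercept, price_of_risk, betas)
```

Because each period's cross-section here is exactly 0.1 plus the factor realization times beta, the averaged slope equals the mean of the factor (1.125) and the averaged intercept equals 0.1.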

To demonstrate, we use finregress fmb and consider all 25 of the fictional stocks at once.

. finregress fmb lnr_acme-lnr_tks = lnr_mkt vol, rfrate(rf) adjust(lnr_mkt)

Fama–MacBeth regression

Sample: 1955m2 thru 2019m12                            Number of obs     = 779
Method: OLS                                            Number of depvars =  25

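------------------------------------------------------------------------------
depvar_means | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
beta         |
     lnr_mkt |   .1298006    .127215     1.02   0.308    -.1195363    .3791375
         vol |   1.383778   .3676425     3.76   0.000     .6632122    2.104344
       _cons |   .0907121   .0199399     4.55   0.000     .0516305    .1297936
------------------------------------------------------------------------------
Dependent variables: lnr_acme lnr_bat lnr_iron lnr_dune lnr_tyr lnr_glo lnr_spa lnr_wgt lnr_bar lnr_yum lnr_aaa ... lnr_tks
Notes: Dependent variables adjusted for risk-free rate rf.
       Independent variable lnr_mkt adjusted for risk-free rate rf.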

The coefficient on lnr_mkt in the output above captures how average excess log returns vary with market risk. As an asset's first-stage coefficient on lnr_mkt (its beta risk) rises by one unit, its expected average excess log return rises by 0.13 points, although this estimate is not significantly different from 0 (p = 0.308). The coefficient on vol measures how expected average excess log return rises with volatility risk. Even though most assets had coefficients on vol near 0 in the first stage, it turns out that across the 25 stocks, the ones that have high covariance with vol also have high returns. By contrast, assets with a high covariance with the market (lnr_mkt) do not systematically show much higher average returns.

We have fit two types of models that address two types of questions. The CAPM fit by finregress capm captures the degree to which asset returns vary with explanatory factors. The Fama–MacBeth regression fit by finregress fmb captures the degree to which average asset returns vary with risk.

Reference

Gibbons, M. R., S. A. Ross, and J. Shanken. 1989. A test of the efficiency of a given portfolio. Econometrica 57: 1121–1152. https://doi.org/10.2307/1913625.

Tell me more

Read more about financial regression in [FIN] finregress capm and [FIN] finregress fmb in the Stata Financial Statistics Reference Manual.
