Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

Re: st: Extract and save R2 as a variable from looping regressions

From   Steve Samuels
Subject   Re: st: Extract and save R2 as a variable from looping regressions
Date   Tue, 1 Oct 2013 12:43:19 -0400


The FAQ asks that you show exactly the code you ran and what
Stata typed in response. The code you show would have stopped with errors.

• . gen gr_compIDmofd=group (companyID mofd)

is illegal: group() is an -egen- function, so you must have used -egen-.

• . clear e(r2)

is also illegal. There is no need to clear e(r2): each estimation
command overwrites the previous e() results.

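• A variable is not a proper list for use with -foreach-. Written as
-foreach i in gr_compIDmofd-, the loop runs exactly once, with `i' holding
the variable name itself, so the condition gr_compIDmofd==`i' is true in
every observation. That accounts for the fact that you have R2 only for
the first group encountered in the data set.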

• Your -bysort- line ran the regression for every group at each pass in
the loop. You would have seen results of K^2 regressions, where K is the
number of groups. 
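
To see why a variable name fails as a list, here is a minimal sketch on the auto data (not the poster's data; the local name levels is my own choice). -foreach ... in- walks over the words typed after -in-, whereas -levelsof- collects the distinct values so that -foreach ... of local- can walk over them.

```stata
sysuse auto, clear

* -foreach ... in- treats what follows as literal words:
* this loop runs once and displays the word "foreign"
foreach i in foreign {
    display "`i'"
}

* -levelsof- stores the distinct values of foreign in a local macro;
* this loop runs once per value and displays 0, then 1
levelsof foreign, local(levels)
foreach i of local levels {
    display "`i'"
}
```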

-statsby- offers a much simpler solution. I extract adjusted R2,
as ordinary R2 is dependent on sample size.

 sysuse auto, clear
 statsby r2a = e(r2_a), by(rep78 foreign) ///
   saving(new, replace): reg mpg weight, robust
 use new, clear
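
If R2 is wanted as a variable alongside the original observations, the -statsby- results can be merged back onto them. A sketch only, assuming a file named r2_by_group (my choice of name) may be written to the working directory:

```stata
sysuse auto, clear

* one row per foreign-by-rep78 group, holding adjusted R2
statsby r2a = e(r2_a), by(foreign rep78) ///
    saving(r2_by_group, replace): regress mpg weight, vce(robust)

* with saving(), the auto data are still in memory, so the
* group-level file can be merged many-to-one onto the cars
merge m:1 foreign rep78 using r2_by_group, nogenerate

list make foreign rep78 r2a in 1/5
```

Every car in a group then carries that group's adjusted R2, which is the layout shown in the original question.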

Here's a loop that works.

sysuse auto, clear
* group() numbers each foreign-by-rep78 combination 1, 2, ...
egen grp = group(foreign rep78)
levelsof grp, local(g)
gen r2a = .
foreach x of local g {
    qui reg mpg weight if grp==`x', robust
    replace r2a = e(r2_a) if grp==`x'
}
sort foreign rep78
egen first = tag(foreign rep78)
list rep78 foreign r2a if first

Note that running many regressions without checking model fit
can be dangerous, so I recommend a plot such as

. scatter mpg weight || lfit mpg weight, by(foreign rep78)
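
For the company-by-month setup in the question, the -statsby- approach might look like the sketch below. The data are simulated purely for illustration: the variable names follow the question, but the seed, the two companies, the 120 days and the return parameters are all my own assumptions.

```stata
clear
set seed 12345
set obs 240

* two hypothetical companies with 120 consecutive daily observations each
gen companyID = ceil(_n/120)
bysort companyID: gen date = mdy(1,1,1990) + _n - 1
format date %td

* simulated returns (made-up parameters)
gen market_return = rnormal(0, 0.01)
gen stock_return  = 0.8*market_return + rnormal(0, 0.01)

* monthly date, as in the question
gen mofd = mofd(date)
format mofd %tm

* one regression per company-month; keep R2, adjusted R2 and sample size
statsby r2 = e(r2) r2a = e(r2_a) n = e(N), by(companyID mofd) clear: ///
    regress stock_return market_return, vce(robust)

list, sepby(companyID)
```

Keeping e(N) next to R2 makes it easy to spot company-months with too few trading days to take the fit seriously.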


On Sep 30, 2013, at 1:24 PM, Aleksej Rechytskyi wrote:

Hello everyone,

I have problems extracting and storing R2 as a variable from looping regressions. My dataset is a panel dataset including company ID, date, daily stock return and daily market return. It is sorted by company ID (permno), and for each company ID there is a running time series.

Company ID || date       || stock return || daily market return
1          || 01.01.1990 || y            || x
1          || 02.01.1990 || y            || x
1          || 03.01.1990 || y            || x
2          || 01.01.1990 || y            || x
2          || 02.01.1990 || y            || x
2          || 03.01.1990 || y            || x

I need to regress the stock return on the market return in a loop for every company ID and month on the basis of the daily returns. I constructed the following:

gen R2=.
gen mofd = mofd(date)
gen gr_compIDmofd=group (companyID mofd)
local i=1
foreach i in gr_compIDmofd {
bysort gr_compIDmofd: regress stock_return market_return, robust
replace R2_restri=e(r2) if gr_compIDmofd ==`i'
clear e(r2)
local i=`i'+1

Instead of replacing the missing value in R2 by the e(r2) for EACH DISTINCT gr_compIDmofd, the code replaces ALL missing values with the R2 from the first regression irrespective of the gr_compIDmofd:

Company ID || date       || stock return || daily market return || gr_compIDmofd || R2
1          || 01.01.1990 || y            || x                   || 1             || 0.5
1          || 02.01.1990 || y            || x                   || 1             || 0.5
1          || 03.01.1990 || y            || x                   || 1             || 0.5
2          || 01.01.1990 || y            || x                   || 2             || 0.5
2          || 02.01.1990 || y            || x                   || 2             || 0.5
2          || 03.01.1990 || y            || x                   || 2             || 0.5

Thank you very much in advance for your help.

Best regards,
Aleksej Rechytskyi
