RE: AW: st: RE: AW: RE: AW: Regressing and storing residuals in one line.

From   "Nick Cox" <>
To   <>
Subject   RE: AW: st: RE: AW: RE: AW: Regressing and storing residuals in one line.
Date   Mon, 28 Jun 2010 18:35:54 +0100

I can answer one of these questions; otherwise I am not clear what you
are puzzled about as I can't see any problem with the code suggested. 

The number of observations for the composite residuals variable should
be the sum of the numbers of observations included in the separate
regressions. If any observation was excluded from a regression, the
corresponding residual should be missing. 

That would be a consequence of your data, which we can't see. 

Minima and maxima should match, as I understand it. 
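
A minimal sketch of how one might check both claims, assuming the auto data and, purely for illustration, -foreign- as the grouping variable (the names res0, res1, residual, rmin and the macros total and min_sep are made up for this example). The separate variables are listed explicitly because a wildcard like res* would also pick up a variable named residual.

```stata
sysuse auto, clear
quietly levelsof foreign, local(levels)
generate double residual = .
local total = 0

foreach lev of local levels {
    quietly regress price weight length if foreign == `lev'
    * one variable per level, restricted to the estimation sample
    predict double res`lev' if e(sample), residuals
    * the same values gathered into a single variable
    quietly replace residual = res`lev' if e(sample)
    local total = `total' + e(N)
}

* number of non-missing combined residuals vs. sum of regression sample sizes
quietly count if !missing(residual)
display "combined: " r(N) "   sum of e(N): " `total'

* smallest residual across the separate variables vs. the combined variable
egen double rmin = rowmin(res0 res1)
quietly summarize rmin
local min_sep = r(min)
quietly summarize residual
display "min separate: " `min_sep' "   min combined: " r(min)
```

If the two counts or the two minima disagree on your own data, the first thing to look at would be which observations each -predict- call actually filled in.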


Dani Tilley

Sorry, I completely missed that. 

I also tried a loop structurally similar to the one you suggested, but
noticed the summarize res* output is different from the summarize
residuals output from NJC's suggestion. I understand that your loop
stores the residuals in separate variables (one for each category),
while NJC creates an empty variable and populates it on the fly, but
shouldn't, say, the minimum or maximum residuals from the two outputs
match? Shouldn't the smallest value from the min column of summarize
res* (MW) output be the same as the Min from summarize residuals (NJC)?
In addition shouldn't the sum of the obs column from the summarize res*
(MW) output be _N? 

I'm very new to Stata, so I don't really know if this makes sense at all
but I think this is the correct way to get the residuals using the loop
you suggested:

predict res`lev' if country == `lev', res
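
One possible source of the mismatch, offered as a guess since the data are not shown: -predict- without an -if- qualifier fills in residuals for every observation with non-missing regressors, not only those used in the regression. A small sketch on the auto data illustrates the difference (r_all and r_in are hypothetical names, and -foreign- stands in for the category variable):

```stata
sysuse auto, clear
quietly regress price weight length if foreign == 1

* no qualifier: residuals are also computed out of sample
predict double r_all, residuals

* restricted to the observations the regression actually used
predict double r_in if e(sample), residuals

count if !missing(r_all)
count if !missing(r_in)
display "observations used in the regression: " e(N)
```

With the qualifier, each res`lev' variable holds values only for its own category, so the obs column of summarize should then add up to the number of observations used across all the regressions.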

From: Martin Weiss <>

Having -drop-ped it, you cannot access it anymore. But NJC's strategy is
that the results you are interested in are gathered inside the permanent
"residual" variable, so this is not a drawback.
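
A short sketch of that point, assuming the auto data (the variable name keep is made up for the example): once the temporary variable is dropped, the local macro still holds its name, but the variable itself is gone, so anything needed later has to be copied into a permanent variable first.

```stata
sysuse auto, clear
tempvar foo
generate double `foo' = price / 1000

* copy what is needed into a permanent variable before dropping
generate double keep = `foo'
drop `foo'

* the temporary variable no longer exists ...
capture confirm variable `foo'
display "return code for the tempvar: " _rc

* ... but the permanent copy does
confirm variable keep
summarize keep
```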

Dani Tilley

If I define a tempvar and drop it at the end of the loop, can I still
refer to it elsewhere in the program (i.e. outside the loop)?

From: Nick Cox <>

If you are doing this lots of times for real, you could end up with
storage problems with dozens of temporary variables. If that doesn't
bite, then OK. 

Martin Weiss


The

drop `foo'

line could be safely omitted, btw. Stata just makes up new tempnames and
discards them all at the conclusion of the do-file.

Nick Cox

Such residuals have rather poorly defined properties, but let's set that
on one side. 

A single variable can be obtained through a minor variation on Martin's code:

sysuse auto, clear
qui levelsof rep78
gen residual = . 

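foreach lev in `r(levels)'{
    tempvar foo 
    qui regress price weight length if rep78==`lev'
    * residuals from this level's regression go into a temporary variable
    predict `foo', res
    * copy them into the permanent variable for this level only
    replace residual = `foo' if rep78 == `lev'
    drop `foo' 
}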


Martin Weiss

Just loop through the thing:

sysuse auto, clear
qui levelsof rep78

foreach lev in `r(levels)'{
    qui regress price weight length if rep78==`lev'
    predict res`lev', res
}

Dani Tilley

I'm trying to run several regressions (one for each level of a
variable) and store the residuals from each regression in a local macro
variable I could later manipulate. I figured I could use:

bysort category: regress y x1 x2 

to run the regressions, but I need a second line of code (predict name,
residuals) to get the residuals when bysort allows only one. Is there a
way around this? 
