Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: AW: st: RE: AW: RE: AW: Regressing and storing residuals in one line.

From: Dani Tilley
To: statalist@hsphsun2.harvard.edu
Subject: Re: AW: st: RE: AW: RE: AW: Regressing and storing residuals in one line.
Date: Mon, 28 Jun 2010 10:50:54 -0700 (PDT)

```Thanks for your response. I also think they should match, and the number in the Obs column should be the number of observations used in the regression, not the total number of observations. The MW snippet is missing a condition in the "predict res`lev', res" line. If you compare the residuals from these two, you'll notice the discrepancy.

* MW
sysuse auto, clear
qui levelsof rep78

foreach lev in `r(levels)'{
qui regress price weight length if rep78==`lev'
predict res`lev', res
}

* NJC
sysuse auto, clear
qui levelsof rep78
gen residual = .

foreach lev in `r(levels)'{
tempvar foo
qui regress price weight length if rep78==`lev'
predict `foo', res
replace residual = `foo' if rep78 == `lev'
drop `foo'
}

If, in MW, we say "predict res`lev' if rep78==`lev', res", the problem is fixed. This is all I meant.

Thanks a lot to Martin and you for the help.

Best,
DF Tilley

----- Original Message ----
From: Nick Cox <n.j.cox@durham.ac.uk>
To: statalist@hsphsun2.harvard.edu
Sent: Mon, June 28, 2010 1:35:54 PM
Subject: RE: AW: st: RE: AW: RE: AW: Regressing and storing residuals in one line.

I can answer one of these questions; otherwise I am not clear what you
are puzzled about as I can't see any problem with the code suggested.

The number of observations for the composite residuals variable should
be the sum of the numbers of observations included in the separate
regressions. If any observation was excluded from a regression, the
corresponding residual should be missing.

That would be a consequence of your data, which we can't see.

Minima and maxima should match, as I understand it.

Nick
n.j.cox@durham.ac.uk

From: Dani Tilley

Sorry, I completely missed that.

I also tried a loop structurally similar to the one you suggested, but
noticed that the summarize res* output is different from the summarize
residuals output from NJC's suggestion. I understand that your loop
stores the residuals in separate variables (one for each category),
while NJC creates an empty variable and populates it on the fly, but
shouldn't, say, the minimum or maximum residuals from the two outputs
match? Shouldn't the smallest value from the min column of summarize
res* (MW) output be the same as the Min from summarize residuals (NJC)?
In addition, shouldn't the sum of the obs column from the summarize res*
(MW) output be _N?

I'm very new to Stata, so I don't really know if this makes sense at all,
but I think this is the correct way to get the residuals using the loop
you suggested:

predict res`lev' if country == `lev', res

From: Martin Weiss <martin.weiss1@gmx.de>

Having -drop-ped it, you cannot access it anymore. But NJC's strategy is
that the results you are interested in are gathered inside the permanent
"residual" variable, so this is not a drawback.

From: Dani Tilley

If I define a tempvar and drop it at the end of the loop, can I still refer
to it elsewhere in the program (i.e. outside the loop)?

From: Nick Cox <n.j.cox@durham.ac.uk>

If you are doing this lots of times for real, you could end up with
storage problems with dozens of temporary variables. If that doesn't
bite, then OK.

From: Martin Weiss

The

*************
drop `foo'
*************

line could be safely omitted, btw. Stata just makes up new tempnames, and
discards them all at the conclusion of the do-file.

From: Nick Cox

Such residuals have rather poorly defined properties, but let's set that
on one side.

A single variable can be obtained through a minor variation on Martin's
recipe:

sysuse auto, clear
qui levelsof rep78
gen residual = .

foreach lev in `r(levels)'{
tempvar foo
qui regress price weight length if rep78==`lev'
predict `foo', res
replace residual = `foo' if rep78 == `lev'
drop `foo'
}

Nick
n.j.cox@durham.ac.uk

From: Martin Weiss

Just loop through the thing:

*************
sysuse auto, clear
qui levelsof rep78

foreach lev in `r(levels)'{
qui regress price weight length if rep78==`lev'
predict res`lev', res
}
*************

From: Dani Tilley

I'm trying to run several regressions (one for each level of a categorical
variable) and store the residuals from each regression in a local macro or
new variable I could later manipulate. I figured I could use:

bysort category: regress y x1 x2

to run the regressions, but I need a second line of code (predict name,
residuals) to get the residuals when bysort allows only one. Is there a way
around this?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```
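
The point settled in the thread is that -predict- without an `if` condition fills in residuals for every observation using the most recent regression, so each `res`lev'` variable in the MW loop also contains out-of-sample values. A minimal sketch of the corrected loop follows, assuming the auto data used throughout the thread. The variable name `check`, the local macro `levels`, and the `double` storage type are illustrative choices that do not appear in the original messages.

```stata
* Sketch only: the MW and NJC loops from the thread, combined on the auto data.
sysuse auto, clear
quietly levelsof rep78, local(levels)
generate double residual = .

foreach lev of local levels {
    quietly regress price weight length if rep78 == `lev'
    * MW-style variable, with the condition that was missing in the thread
    predict double res`lev' if rep78 == `lev', residuals
    * NJC-style composite variable, filled in one level at a time
    quietly replace residual = res`lev' if rep78 == `lev'
}

* Each observation has at most one nonmissing res? value, so the row-wise
* minimum over res1, res2, ... should reproduce the composite variable.
* (check is a hypothetical helper name, not from the thread.)
egen double check = rowmin(res?)
assert check == residual

summarize residual res?
```

With the condition in place, the Obs counts for the separate `res?` variables should add up to the Obs count for `residual`, which is the number of observations actually used in the regressions rather than _N. Restricting -predict- with `if e(sample)` instead of `if rep78 == `lev'` is another way to limit the residuals to the estimation sample.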