The Stata listserver

Re: st: Averaging across variables

From   David Kantor <[email protected]>
To   [email protected]
Subject   Re: st: Averaging across variables
Date   Thu, 14 Jul 2005 10:42:34 -0400

I should add some warnings to what Ying wrote in reply to Kelly Johnson's inquiry.

The command
egen avginc = rmean(inc78 - inc80)

depends on having the desired variables ordered such that they are covered by the varlist
"inc78 - inc80". That is, it depends on the present order of the variables. (Stata will *not* try to infer that this includes inc79 based on the *form* of the varlist expression.) It is generally not a good idea to write code with such a dependence. It would be better to explicitly spell out the variables, or use star notation, such as rmean(inc*) -- if inc* covers exactly the variables you want.

I often put the varlist into a macro:
local incvars "inc78 inc79 inc80"
egen avginc = rmean(`incvars')
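
The order dependence described above can be checked on a toy dataset. This is a minimal sketch, not part of the original exchange: the data values are made up, and inc79 is deliberately stored after inc80. Newer Stata releases document this egen function as rowmean(); rmean() is kept here to match the post, so substitute rowmean() if rmean() is not recognized.

```stata
* Hypothetical toy data: inc79 is NOT stored between inc78 and inc80
clear
input inc78 inc80 inc79
10 30 26
40 60 56
end

* A range varlist follows storage order, so this covers only inc78 and inc80
egen avg_range = rmean(inc78-inc80)

* Spelling out the variables (here via a macro) ignores storage order
local incvars "inc78 inc79 inc80"
egen avg_explicit = rmean(`incvars')

* avg_range is 20 and 50; avg_explicit is 22 and 52
list
```

The two results differ only because of how the variables happen to be ordered, which is the reason to avoid the range form in code meant to be reused.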

(You can also put a spelled-out version of a star-notation varlist into a macro using -unab-, and then manipulate the resulting macro with macro list operations, such as removing specific variables.)
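
As a rough illustration of the -unab- approach, the sketch below expands inc* into explicit names and then removes one variable with a macro list operation. The variable incflag and the macro name exclude are hypothetical, invented only to give the star notation something unwanted to match.

```stata
* Hypothetical toy data: incflag matches inc* but should not enter the mean
clear
input inc78 inc79 inc80 incflag
10 20 30 1
40 50 60 0
end

* Expand the star-notation varlist into explicit variable names
unab incvars : inc*

* Remove the unwanted variable from the list
local exclude "incflag"
local incvars : list incvars - exclude

* Shows: inc78 inc79 inc80
display "`incvars'"

* Row mean over the remaining variables: 20 and 50
egen avginc = rmean(`incvars')
list
```

Displaying the macro before using it is a cheap check that the list holds exactly the variables intended.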

Finally, be sure to use rmean, and not mean.
-egen ... rmean()- takes a varlist and computes a row mean.
-egen ... mean()- takes an expression and computes a column mean.

egen avginc = mean(inc78 - inc80)
looks almost the same as the rmean version, but is vastly different. It will take the difference inc78 - inc80 (the result of subtracting), and yield the column mean of that difference.
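
To make the contrast concrete, here is a small sketch on made-up data that runs both functions side by side. The variable names rowavg and diffavg are chosen only for this illustration.

```stata
* Hypothetical toy data
clear
input inc78 inc79 inc80
10 20 30
40 50 90
end

* Row mean of the three variables: 20 in row 1, 60 in row 2
egen rowavg = rmean(inc78 inc79 inc80)

* Here "inc78 - inc80" is an expression (a subtraction): -20 and -50.
* mean() averages it over observations, giving -35 in every row.
egen diffavg = mean(inc78 - inc80)

list
```

The mean() result is a single constant repeated in every observation, which is usually a quick visual sign that the wrong function was used.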

-- David

At 09:30 AM 7/14/2005 -0400, you wrote:


I am not sure whether this is what you want. Suppose you have income
variables named inc78, inc79 and inc80, and you want to create a new variable
called avginc which is the mean of inc78 through inc80. You can use this command.

egen avginc = rmean(inc78 - inc80)

Search in Stata for "egen".

Good luck.


[from Kelly Johnson]
> hi folks,
> suppose i have a data set with say, many variables.
> how can i create a new variable that is the (arithmetic) mean of all the
> variables starting with certain letters (i.e. using the '*' operator)?
> thanks again!
> kj
David Kantor
Institute for Policy Studies
Johns Hopkins University
[email protected]
