
From: Jacob Wegelin <jwegelin@vcu.edu>
To: statalist@hsphsun2.harvard.edu
Subject: st: row mean (mean across columns)
Date: Wed, 8 Oct 2008 14:38:53 -0400 (EDT)

Given any dataset of all numeric variables, I want to generate a new variable called myMean, which is the arithmetic mean (the average) across all the variables. The program below solves this problem. But surely there is a one-line command that will perform this task in Stata?

The post http://www.stata.com/statalist/archive/2008-09/msg00597.html appears to contain a bug, in the sense that the row total computed is not corrected as in my code below.

This should be done in a general manner:

(1) As in the current dataset, the variables will not necessarily be in a form like a1 to a100.

(2) The number of variables is arbitrary, so I cannot hard-code the denominator as when myMeanByHand is computed below.

(3) If any value in a row is missing (.), the mean computed must also be missing, since then the mean across all variables is not defined. (Thus egen rowtotal is not the answer.)

Here is the code:

/* Generate a toy dataset */
clear
set obs 5
gen x= _n
gen zoo = 20-x
gen whiskey=(_N - x) ^ 2
replace x = . in 2

/* First compute "by hand" with hard-coded denominator and variable names */
gen myMeanByHand= (x + zoo + whiskey ) / 3
sort x
save tmp, replace
list
drop myMeanByHand

capture program drop computeMeanAcrossColumns
program define computeMeanAcrossColumns
    /* Compute arithmetic mean across all columns */
    tempvar RowTotalTooMuch
    tempvar rowtotal
    scalar ncols=0
    gen `RowTotalTooMuch'=0
    foreach var of varlist * {
        quietly replace `RowTotalTooMuch'=`RowTotalTooMuch' + `var'
        scalar ncols=ncols + 1
    }
    scalar nOrigCols=ncols-1
    gen `rowtotal' = `RowTotalTooMuch' / 2
    gen `1'= `rowtotal' / nOrigCols
end

computeMeanAcrossColumns "myMean"

/* Check myMean against myMeanByHand */

assert _merge==3
drop _merge
assert myMean==myMeanByHand
drop myMeanByHand
list

/* An illustration with egen rowmean */
keep x zoo whiskey

/* The following works for rows with no missing values.
   It gives a misleading answer for a row that contains
   a missing value, since the average in that row
   is not defined. */
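To make the point about missing values concrete, here is a minimal sketch of the behavior described in the comment above. It rebuilds the toy dataset so that it runs on its own. The names rmean_demo and nmiss_demo are illustrative assumptions, not names from the original code. The egen functions rowmean() and rowmiss() are standard, but treat the combination below as one possible workaround rather than the definitive answer.

```stata
* Sketch only: rmean_demo and nmiss_demo are hypothetical names
clear
set obs 5
gen x = _n
gen zoo = 20 - x
gen whiskey = (_N - x)^2
replace x = . in 2

* rowmean() averages only the nonmissing values in each row,
* so row 2 gets (18 + 9)/2 = 13.5 rather than missing
egen rmean_demo = rowmean(x zoo whiskey)

* rowmiss() counts the missing values in each row;
* set the mean to missing wherever that count is positive
egen nmiss_demo = rowmiss(x zoo whiskey)
replace rmean_demo = . if nmiss_demo > 0

list
drop rmean_demo nmiss_demo
```

With the second step applied, row 2 shows a missing mean, which matches requirement (3); without it, the value 13.5 silently averages over two variables instead of three.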

drop junk

/* A related question: The following gives an incorrect answer.
   What in the world is it doing? */

Thanks for any insights

Jake

Jacob A. Wegelin
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032


Follow-Ups:
- st: RE: row mean (mean across columns), From: "Nick Cox" <n.j.cox@durham.ac.uk>
- Re: st: row mean (mean across columns), From: "Eva Poen" <eva.poen@gmail.com>
- Re: st: row mean (mean across columns), From: Maarten buis <maartenbuis@yahoo.co.uk>
