I agree with Maarten that this is generally a bad idea.

One way to get what you want is a two-step:

    egen max = max(var1), by(group)
    egen mean = mean(var1/(var1 < max)), by(group)
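As a minimal sketch of why this works, here is the two-step run on the example data from the original question. The only additions are -clear- and -list-, which are there for illustration. The expression (var1 < max) evaluates to 1 or 0, so var1/(var1 < max) is var1 itself for non-maximal values and missing (division by zero) for the maximum, and the mean() function of -egen- ignores missings.

```stata
clear
input group var1
1 4
1 5
1 81
2 2
2 3
2 3
2 72
end

* Step 1: largest value of var1 within each group
egen max = max(var1), by(group)

* Step 2: var1/(var1 < max) is var1 when var1 < max and missing otherwise,
* so the group maximum drops out of the mean
egen mean = mean(var1/(var1 < max)), by(group)

list, sepby(group)
```

On these data the mean should be 4.5 for group 1 and about 2.67 for group 2. Note that if several observations tie for the maximum within a group, all of them are excluded, not just one.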

Nick

On 19 Aug 2011, at 15:40, mcross@exemail.com.au wrote:

I too am confused regarding when _N is or isn't influenced by the -by:- prefix. I would like to remove a single outlier from each group within the following data set...

    input group var1
    1 4
    1 5
    1 81
    2 2
    2 3
    2 3
    2 72
    end

I would then like to calculate the mean for each group (with the outliers gone). I assumed that the following code would do the trick...

    by group (var1), sort: egen average = mean(var1) if var1 != var1[_N]

When the mean was calculated, it did so following the -by:- prefix (i.e. _N = 3 for group 1). But following the -if- option, _N was calculated from the whole data set (i.e. _N = 7).

I got around this problem by generating/sorting a byte tag; however, I still don't understand WHY and HOW Stata does this. Could I have dealt with the above using a single line of code?

Cheers,
Mike (beginner Stata 8)

* So _N, as it were, never sees the -by:- and is not influenced by it.

** If a Stata command has by-groups, it seems like _N is interpreted sometimes as the number of observations in the by-group and sometimes as the number of observations in the data set.

*** If you use the -by:- prefix it is always defined as the number of observations within each by-group. Stata would be a pretty lousy program if such a scalar randomly changed meaning...

