Re: st: _N in by-groups

From   Nick Cox <>
Subject   Re: st: _N in by-groups
Date   Fri, 19 Aug 2011 09:45:52 +0100

The example makes clear why you made your otherwise very puzzling claim.

There is no contradiction. In your second example,

1. you are running a program for each group of -foreign-

2. that program is in effect just "display the number of observations"
and makes no reference to the -by:- group.

There is a dissociation in this second example, but not in the first
example, which behaves as you expect.

The dissociation arises because "_N" is passed as text to the program.
_N is evaluated _within_ the program and not within the context of
-by:-. So _N, as it were, never sees the -by:- and is not influenced
by it. Declaring the program -byable(recall)- makes no difference.

It's a subtle example!
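
For comparison, here is a minimal sketch of a program that does report the
by-group size. It relies on -marksample-, which under -byable(recall)-
marks only the observations of the current by-group. The program name
-dispbyn- is hypothetical and used only for illustration.

```stata
* Hypothetical program: display the number of observations in each by-group
program dispbyn, byable(recall)
    syntax [if] [in]
    * with byable(recall), marksample restricts `touse' to the current by-group
    marksample touse
    quietly count if `touse'
    display r(N)
end

sysuse auto, clear
bysort foreign: dispbyn
```

Under that assumption this should display 52 for the first group and 22
for the second, whereas _N evaluated inside the program is still 74.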


On Fri, Aug 19, 2011 at 8:41 AM, Matthew White
<> wrote:
> So if I execute:
> sysuse auto
> bys foreign: drop if _n == _N
> Then two observations are dropped because _N is the number of
> observations in the by-group.
> But in this (admittedly silly) example, _N seems to be the number of
> observations in the data set:
> program dispby, byable(recall)
> disp `0'
> end
> sysuse auto
> bys foreign: dispby _N
> Both times 74 is displayed, instead of 52 in the first by-group and 22
> in the second.
> On Fri, Aug 19, 2011 at 10:07 AM, Maarten Buis <> wrote:
>> On Fri, Aug 19, 2011 at 2:21 AM, Matthew White wrote:
>>> If a Stata command has by-groups, it seems like _N is interpreted
>>> sometimes as the number of observations in the by-group and sometimes
>>> as the number of observations in the data set.
>> If you use the -by :- prefix it is always defined as the number of
>> observations within each by-group. Stata would be a pretty lousy
>> program if such a scalar randomly changed meaning...
>> If you want the total number of observations, then I would just do:
>> local Ntot = _N
>> or inside a program that previously used -marksample-:
>> count if `touse'
>> local Ntot = r(N)
>> later you do
>> by <somevar> `touse' : <something using _N and `Ntot'> if `touse'