



Re: st: _N in by-groups


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: _N in by-groups
Date   Fri, 19 Aug 2011 09:45:52 +0100

The example makes clear why you made your otherwise very puzzling claim.

There is no contradiction. In your second example,

1. you are running a program for each group of -foreign-

2. that program is in effect just "display the number of observations"
and makes no reference to the -by:- group.

There is dissociation in this second example, but not in the first
example, which behaves as you expect.

The dissociation arises because "_N" is passed as text to the program.
_N is evaluated _within_ the program and not within the context of
-by:-. So _N, as it were, never sees the -by:- and is not influenced
by it. Declaring the program -byable(recall)- makes no difference
here.
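
If the aim is for the program itself to report the size of each
by-group, one way is to count the marked observations inside the
program rather than to pass _N in as text. What follows is only a
minimal sketch, not a definitive fix: the program name -dispbyn- is
made up for illustration, and it relies on -marksample- restricting
its marker variable to the current by-group when the program is
declared -byable(recall)-.

```stata
* Hypothetical program name; counts observations in the current by-group.
program define dispbyn, byable(recall)
    syntax [if] [in]
    * marksample creates a 0/1 marker variable; in a byable(recall)
    * program it is limited to the by-group currently being processed
    marksample touse
    quietly count if `touse'
    display r(N)
end

sysuse auto, clear
bysort foreign: dispbyn
```

With the auto data this should display the two group sizes you were
expecting (52 and 22) rather than 74 twice.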

It's a subtle example!

Nick

On Fri, Aug 19, 2011 at 8:41 AM, Matthew White
<mwhite@poverty-action.org> wrote:
> So if I execute:
> sysuse auto
> bys foreign: drop if _n == _N
> Then two observations are dropped because _N is the number of
> observations in the by-group.
>
> But in this (admittedly silly) example, _N seems to be the number of
> observations in the data set:
> program dispby, byable(recall)
> disp `0'
> end
> sysuse auto
> bys foreign: dispby _N
> Both times 74 is displayed, instead of 52 in the first by-group and 22
> in the second.
>
>
> On Fri, Aug 19, 2011 at 10:07 AM, Maarten Buis <maartenlbuis@gmail.com> wrote:
>> On Fri, Aug 19, 2011 at 2:21 AM, Matthew White wrote:
>>> If a Stata command has by-groups, it seems like _N is interpreted
>>> sometimes as the number of observations in the by-group and sometimes
>>> as the number of observations in the data set.
>>
>> If you use the -by :- prefix it is always defined as the number of
>> observations within each by-group. Stata would be a pretty lousy
>> program if such a scalar randomly changed meaning...
>>
>> If you want the total number of observations then I would just do:
>>
>> local Ntot = _N
>>
>> or inside a program that previously used -marksample-:
>>
>> count if `touse'
>> local Ntot = r(N)
>>
>> later you do
>>
>> by <somevar> `touse' : <something using _N and `Ntot'> if `touse'
>>

