Matthew White <mwhite@poverty-action.org>

statalist@hsphsun2.harvard.edu

Re: st: _N in by-groups

Fri, 19 Aug 2011 11:20:39 +0300

Sorry, I think I'm still confused... Let's say I don't want dispby to
display just the number of observations in the by-group (which your
program does) but anything (including _N). So

    sysuse auto
    bys foreign: dispby "Hello world"

should display "Hello world" for each group, and

    sysuse auto
    bys foreign: dispby _N

should display 52 then 22. Something like this could work:

    program dispby, byable(recall)
        if _by() {
            marksample touse
            preserve
            qui keep if `touse'
        }
        disp `0'
        if _by() restore
    end

The two examples above work. But I can't imagine this being the best
way, especially with large data sets...

Thanks,
Matt

On Fri, Aug 19, 2011 at 11:02 AM, Phil Clayton <philclayton@internode.on.net> wrote:
> . program dispby, byable(recall)
>   1. marksample touse
>   2. count if `touse'
>   3. end
>
> . bys foreign: dispby
>
> ----------------------------------------------------------------------
> -> foreign = Domestic
>    52
>
> ----------------------------------------------------------------------
> -> foreign = Foreign
>    22
>
> This is documented in [P] byable
>
> Phil
>
> On 19/08/2011, at 5:41 PM, Matthew White wrote:
>
>> So if I execute:
>>     sysuse auto
>>     bys foreign: drop if _n == _N
>> Then two observations are dropped because _N is the number of
>> observations in the by-group.
>>
>> But in this (admittedly silly) example, _N seems to be the number of
>> observations in the data set:
>>     program dispby, byable(recall)
>>         disp `0'
>>     end
>>     sysuse auto
>>     bys foreign: dispby _N
>> Both times 74 is displayed, instead of 52 in the first by-group and 22
>> in the second.
>>
>> Thanks,
>> Matt
>>
>> On Fri, Aug 19, 2011 at 10:07 AM, Maarten Buis <maartenlbuis@gmail.com> wrote:
>>> On Fri, Aug 19, 2011 at 2:21 AM, Matthew White wrote:
>>>> If a Stata command has by-groups, it seems like _N is interpreted
>>>> sometimes as the number of observations in the by-group and sometimes
>>>> as the number of observations in the data set.
>>>
>>> If you use the -by :- prefix it is always defined as the number of
>>> observations within each by-group. Stata would be a pretty lousy
>>> program if such a scalar randomly changed meaning...
>>>
>>> If you want the total number of observations then I would just do:
>>>
>>> local Ntot = _N
>>>
>>> or inside a program that previously used -marksample-:
>>>
>>> count if `touse'
>>> local Ntot = r(N)
>>>
>>> later you do
>>>
>>> by <somevar> `touse' : <something using _N and `Ntot'> if `touse'
>>>
>>> Hope this helps,
>>> Maarten
>>>
>>> --------------------------
>>> Maarten L. Buis
>>> Institut fuer Soziologie
>>> Universitaet Tuebingen
>>> Wilhelmstrasse 36
>>> 72074 Tuebingen
>>> Germany
>>>
>>> http://www.maartenbuis.nl
>>> --------------------------
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/statalist/faq
>>> * http://www.ats.ucla.edu/stat/stata/
>>
>> --
>> Matthew White
>> Evaluation Coordinator
>> Urban Micro-Insurance Project
>> Innovations for Poverty Action
>>
>> +254 (0)701 025 276
>> mwhite@poverty-action.org
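
For reference, here is a minimal sketch of the pattern Maarten describes: save the data-set total in a local before using the by prefix, then combine it with the by-group _N. The variable name share and the choice to compute a group share are my own illustration, not something proposed in the thread.

```stata
* A sketch only: "share" is a hypothetical variable name chosen for illustration
sysuse auto, clear

* Outside any by-group, _N is the number of observations in the data set
local Ntot = _N

* Under the by prefix, _N is the number of observations in the current by-group
bysort foreign: generate double share = _N / `Ntot'

* One row per group, showing each group's share of all observations
tabstat share, by(foreign) statistics(mean)
```

This does not answer the question of evaluating an arbitrary expression inside a byable(recall) program without preserve; it only shows that, in a command run directly under by, the group size and the saved total can be used side by side without copying the data.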

