
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: _N in by-groups


From   Maarten Buis <[email protected]>
To   [email protected]
Subject   Re: st: _N in by-groups
Date   Fri, 19 Aug 2011 10:55:40 +0200

The real problem with your question is that you are not telling us
exactly what you want to do. With programming, the devil is in the
details. All we can do is answer your stand-in problem, which
apparently does not correspond well to what you really want to do.
It is no surprise, then, that our answers confuse you...

As I said before: you should not use _N in that situation. It does
exactly what it should, but not what you want. Instead, use
-marksample- and add -if `touse'- to every subsequent command that
needs it. If you want the number of observations within your by-group,
then we have already explained how to get it: use -count- together
with -if `touse'-. The result is stored in r(N), and you can save that
number in a local macro for later access. Do not start dropping
observations.

*----------- begin example ------------
program drop _all
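program define foo, byable(recall)
	// touse is 1 for observations in the current by-group, 0 otherwise
	marksample touse
	
	di as txt "This is correct:"
	sum `0' if `touse'
	
	di as txt "This is wrong:"
	// without -if `touse'- the whole data set is summarized
	sum `0'
end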

sysuse auto, clear
bys foreign: foo mpg
*--------- end example -----------
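
A minimal sketch of the -count- approach described above, assuming you only need the size of the current by-group. The program name -dispn- and the local macro name n are made up for illustration:

```stata
program drop _all
program define dispn, byable(recall)
	// mark the observations that belong to the current by-group
	marksample touse

	// count them quietly; -count- leaves the result in r(N)
	quietly count if `touse'
	local n = r(N)

	di as txt "Observations in this by-group: " as res `n'
	// inside a byable(recall) program _N is still the whole data set
	di as txt "Observations in the data set:  " as res _N
end

sysuse auto, clear
bys foreign: dispn
```

With the auto data this should report 52 and then 22 for the by-groups, while _N stays at 74 in both calls, matching the numbers reported earlier in this thread.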

-- Maarten

On Fri, Aug 19, 2011 at 10:20 AM, Matthew White
<[email protected]> wrote:
> Sorry, I think I'm still confused... Let's say I don't want dispby to
> display just the number of observations in the by-group (which your
> program does) but anything (including _N). So
> sysuse auto
> bys foreign: dispby "Hello world"
> should display "Hello world" for each group, and
> sysuse auto
> bys foreign: dispby _N
> should display 52 then 22.
>
> Something like this could work:
> program dispby, byable(recall)
>     if _by() {
>         marksample touse
>         preserve
>         qui keep if `touse'
>     }
>     disp `0'
>     if _by() restore
> end
>
> The two examples above work. But I can't imagine this being the best
> way, especially with large data sets...
>
> Thanks,
> Matt
>
> On Fri, Aug 19, 2011 at 11:02 AM, Phil Clayton
> <[email protected]> wrote:
>> . program dispby, byable(recall)
>>  1. marksample touse
>>  2. count if `touse'
>>  3. end
>>
>> . bys foreign: dispby
>>
>> ---------------------------------------------------------------------------------------------------------------
>> -> foreign = Domestic
>>   52
>>
>> ---------------------------------------------------------------------------------------------------------------
>> -> foreign = Foreign
>>   22
>>
>> This is documented in [P] byable
>>
>> Phil
>>
>> On 19/08/2011, at 5:41 PM, Matthew White wrote:
>>
>>> So if I execute:
>>> sysuse auto
>>> bys foreign: drop if _n == _N
>>> Then two observations are dropped because _N is the number of
>>> observations in the by-group.
>>>
>>> But in this (admittedly silly) example, _N seems to be the number of
>>> observations in the data set:
>>> program dispby, byable(recall)
>>> disp `0'
>>> end
>>> sysuse auto
>>> bys foreign: dispby _N
>>> Both times 74 is displayed, instead of 52 in the first by-group and 22
>>> in the second.
>>>
>>> Thanks,
>>> Matt
>>>
>>> On Fri, Aug 19, 2011 at 10:07 AM, Maarten Buis <[email protected]> wrote:
>>>> On Fri, Aug 19, 2011 at 2:21 AM, Matthew White wrote:
>>>>> If a Stata command has by-groups, it seems like _N is interpreted
>>>>> sometimes as the number of observations in the by-group and sometimes
>>>>> as the number of observations in the data set.
>>>>
>>>> If you use the -by :- prefix it is always defined as the number of
>>>> observations within each by-group. Stata would be a pretty lousy
>>>> program if such a scalar randomly changed meaning...
>>>>
>>>> If you want the total number of observations, then I would just do:
>>>>
>>>> local Ntot = _N
>>>>
>>>> or inside a program that previously used -marksample-:
>>>>
>>>> count if `touse'
>>>> local Ntot = r(N)
>>>>
>>>> later you do
>>>>
>>>> by <somevar> `touse' : <something using _N and `Ntot'> if `touse'
>>>>
>>>> Hope this helps,
>>>> Maarten
>>>>
>>>> --------------------------
>>>> Maarten L. Buis
>>>> Institut fuer Soziologie
>>>> Universitaet Tuebingen
>>>> Wilhelmstrasse 36
>>>> 72074 Tuebingen
>>>> Germany
>>>>
>>>>
>>>> http://www.maartenbuis.nl
>>>> --------------------------
>>>> *
>>>> *   For searches and help try:
>>>> *   http://www.stata.com/help.cgi?search
>>>> *   http://www.stata.com/support/statalist/faq
>>>> *   http://www.ats.ucla.edu/stat/stata/
>>>>
>>>
>>>
>>>
>>> --
>>> Matthew White
>>> Evaluation Coordinator
>>> Urban Micro-Insurance Project
>>> Innovations for Poverty Action
>>>
>>> +254 (0)701 025 276
>>> [email protected]
>>>
>>
>>
>>



-- 
--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany


http://www.maartenbuis.nl
--------------------------


