Statalist



RE: st: RE: RE: what does _N mean under by varlist in the -egen- and -gen-?


From   "Martin Weiss" <martin.weiss1@gmx.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: RE: what does _N mean under by varlist in the -egen- and -gen-?
Date   Sat, 8 Aug 2009 22:20:41 +0200




Well, you used -gen- twice and -egen- twice. -gen- behaved as you expected
both times. -egen- behaved as expected when fed an -expression- involving
_N, but failed you when you used the same -expression- in an -if-
qualifier. I am reluctant to call this a rule, but this is where our little
investigation stands. The entry in the data management manual ("[D]") seems
to confirm this suspicion.

As I showed in one of my posts today, it is easy to get what you want using
an -expression- like -(sex[_N]==0)*sex-. The part in parentheses evaluates
to one for every observation in a family whose last observation has sex
equal to zero, and to zero otherwise, so summing this -expression- adds up
sex only in the families where the condition holds.

BTW, you do not specify your sort order within families, so should anything
happen to it along the way, the "_N" observation would also change...
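
One way to guard against this, as a minimal sketch only: it assumes that the
order of the observations as first entered is the one you want to keep, and
that you start from the freshly -input- data. The variable names obsno,
last0 and tot0 are made up for illustration. The idea is to record the
order, let -gen- evaluate sex[_N] with that order pinned down, and then hand
-egen- a plain variable instead of a subscript.

```stata
* obsno, last0 and tot0 are hypothetical names, not part of the example above
* record the current order of the observations
gen long obsno = _n

* flag families whose last observation (in obsno order) has sex equal to zero
bysort family (obsno): gen byte last0 = (sex[_N] == 0)

* no explicit subscript inside -egen-, only ordinary variables
bysort family: egen tot0 = total(sex * last0)

list, noobs sepby(family)
```

As with the -expression- fed to -total()- directly, families that do not meet
the condition end up with a total of zero rather than missing.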



HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of gjhxmu@sina.com
Sent: Saturday, 8 August 2009 19:54
To: statalist
Subject: Re: st: RE: RE: what does _N mean under by varlist in the -egen-
and -gen-?

Martin, thank you for your help!

In my example, _N in -egen- sometimes refers to the total dataset and
sometimes does not.
Is there a rule?
Btw, where can I find "Explicit subscripting (using _N and _n), which is
commonly used with generate, should not be used with egen;" See [D], p. 145?
What is [D], p. 145?

Thank you very much!

Best regards,
Rose.



bys family: g x=sum(sex) if sex[_N]==0
bys family: egen xx=total(sex) if sex[_N]==0
bys family: g xxx=sex[_N]
bys family: egen xxxx=total(sex[_N]==0)
l,sepby(family) noobs
----- Original Message -----
From: Martin Weiss <martin.weiss1@gmx.de>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: RE: what does _N mean under by varlist in the -egen- and
-gen-?
Date: 2009-8-8 21:56:34




The manuals make the problem clear, though: "Explicit subscripting (using
_N and _n), which is commonly used with generate, should not be used with
egen;" See [D], p. 145

- di in smcl " {manpage D 153: Click here!}"-



HTH
Martin


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Martin Weiss
Sent: Saturday, 8 August 2009 15:30
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: what does _N mean under by varlist in the -egen- and -gen-?


You can of course form an expression that does yield the correct result:


***
clear
input family sex 
1 1 
1 1 
1 1 
2 0 
2 1 
2 0 
2 1 
2 0 
3 0 
3 1 
3 0 
3 1 
3 1 
3 1 

end

//complex expression fed to -total-
bys family: egen x=/*
*/total(sex*(sex[_N]==0))

//compare with -if- solution
bys family: egen xx=total(sex) /*
*/ if sex[_N]==0 

list, noobs sepby(fam)
***

HTH
Martin


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of gjhxmu@sina.com
Sent: Saturday, 8 August 2009 11:40
To: statalist
Subject: st: what does _N mean under by varlist in the -egen- and -gen-?

Dear statalists,

I know that under by varlist: _N is interpreted within each group of
observations, not for the whole dataset. 

However, I found that the meaning of _N in -egen- is a little different.

Take the following example:
clear
input family sex 
1 1 
1 1 
1 1 
2 0 
2 1 
2 0 
2 1 
2 0 
3 0 
3 1 
3 0 
3 1 
3 1 
3 1 

end
bys family: g x=sum(sex) if sex[_N]==0
bys family: egen xx=total(sex) if sex[_N]==0
bys family: g xxx=sex[_N]
bys family: egen xxxx=total(sex[_N]==0)
l,sepby(family) noobs

+------------------------------------+
| family   sex   x   xx   xxx   xxxx |
|------------------------------------|
|      1     1   .    .     1      0 |
|      1     1   .    .     1      0 |
|      1     1   .    .     1      0 |
|------------------------------------|
|      2     0   0    .     0      5 |
|      2     1   1    .     0      5 |
|      2     0   1    .     0      5 |
|      2     1   2    .     0      5 |
|      2     0   2    .     0      5 |
|------------------------------------|
|      3     0   .    .     1      0 |
|      3     1   .    .     1      0 |
|      3     0   .    .     1      0 |
|      3     1   .    .     1      0 |
|      3     1   .    .     1      0 |
|      3     1   .    .     1      0 |
+------------------------------------+

Based on the results, the meaning of _N is obviously different in the first
two commands, while it is the same in the last two.

Could anyone tell me why?

Thank you very much.

Best regards,
Rose.

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/





© Copyright 1996–2014 StataCorp LP