Statalist



Re: st: RE: egen first/lastnm


From   Devra Golbe <dgolbe@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: egen first/lastnm
Date   Sun, 24 Feb 2008 13:48:41 -0500

Thanks, Nick!

Devra

Nick Cox wrote:
The issue Devra raises can be answered by looking at the code. Here I
focus on the -egen- add-on -first()-. The same issue arises with -lastnm()-. You
could see the code below from within Stata by typing
. ssc type _gfirst.ado
What is wired into the code through the -marksample- statement is that
missings on the variable supplied are segregated throughout. Thus, as Devra reports,
missings are mapped to missings.
*! 1.0.0 NJC 31 May 2000
program define _gfirst
        version 6.0
        gettoken type 0 : 0
        gettoken g 0 : 0
        gettoken eqs 0 : 0
        syntax varname [if] [in] [, BY(varlist) ]
        marksample touse, strok
        tempvar order
        gen long `order' = _n
        sort `touse' `by' `order'
        * ignore user-supplied `type'
        local type : type `varlist'
        qui by `touse' `by' : gen `type' `g' = `varlist'[1] if `touse'
end
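
The effect of -marksample- described above can be seen in isolation. What follows is a minimal sketch, not part of -egenmore-: the program name -showtouse- and the three-observation dataset are made up for illustration. After -syntax varname-, -marksample- sets the marker to 0 in any observation where the variable is missing, which is why -first()- never assigns a value there.

```stata
* Hypothetical helper, for illustration only: count the observations
* that -marksample- excludes for a single variable.
capture program drop showtouse
program define showtouse
        version 8
        syntax varname
        marksample touse, strok
        quietly count if !`touse'
        display "observations excluded by marksample: " r(N)
end

* Made-up data with one missing value.
clear
input x
10
.
11
end

showtouse x
```

Adding the -novarlist- option to -marksample-, as the modified program does, stops the variable's missings from being marked out.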

Below is a hack that would populate observations with appropriate
non-missings whenever they exist.
*! 1.1.0 NJC Statalist 24 Feb 2008
* _gfirst 1.0.0 NJC 31 May 2000
program define _gfirst2
        version 6.0
        gettoken type 0 : 0
        gettoken g 0 : 0
        gettoken eqs 0 : 0
        syntax varname [if] [in] [, BY(varlist) ]
        marksample touse, strok novarlist
        tempvar order missing
        gen long `order' = _n
        gen byte `missing' = missing(`varlist')
        sort `touse' `by' `missing' `order'
        * ignore user-supplied `type'
        local type : type `varlist'
        qui by `touse' `by' : gen `type' `g' = `varlist'[1] if `touse'
end

Alternatively, surgery after Devra's example would be
bysort id (y) : replace y = y[1]
bysort id (z) : replace z = z[1]
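
A self-contained sketch of that surgery on Devra's data, assuming -egenmore- is installed (-ssc install egenmore-). It relies on numeric missings sorting after all nonmissing values: once y is sorted within id, y[1] is nonmissing whenever the group has any nonmissing value. The final -sort- and -assert- are additions for checking only.

```stata
clear
input n id x
1 1 10
2 1 9
3 1 11
4 2 12
5 2 .
6 2 11
7 3 .
8 3 .
9 3 10
end

* first() and lastnm() come from egenmore (assumed installed)
egen y = first(x), by(id)
egen z = lastnm(x), by(id)

* missings sort last within id, so [1] picks up a nonmissing value
bysort id (y) : replace y = y[1]
bysort id (z) : replace z = z[1]

* restore the original order and check that no gaps remain
sort n
assert !missing(y, z)
```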
Nick
n.j.cox@durham.ac.uk
Devra Golbe wrote:

-egen newvar = first(varname)-

(from the egenmore functions) produces missing values where I did not expect them: newvar is missing for observations in which varname is missing. The same is true for -egen newvar = lastnm(varname)-. Is that the behavior I should have expected? In contrast, -egen newvar = mean(varname)- populates newvar even where varname is missing. See the example below my signature.

input n id x

n id x
1. 1 1 10
2. 2 1 9
3. 3 1 11
4. 4 2 12
5. 5 2 .
6. 6 2 11
7. 7 3 .
8. 8 3 .
9. 9 3 10
10. end

. egen y = first(x), by(id)
(3 missing values generated)

. egen z = lastnm(x), by(id)
(3 missing values generated)

. egen m = mean(x), by(id)

. list

     +------------------------------+
     | n   id    x    y    z      m |
     |------------------------------|
  1. | 1    1   10   10   11     10 |
  2. | 2    1    9   10   11     10 |
  3. | 3    1   11   10   11     10 |
  4. | 4    2   12   12   11   11.5 |
  5. | 5    2    .    .    .   11.5 |
     |------------------------------|
  6. | 6    2   11   12   11   11.5 |
  7. | 7    3    .    .    .     10 |
  8. | 8    3    .    .    .     10 |
  9. | 9    3   10   10   10     10 |
     +------------------------------+
