Statalist



st: RE: egen first/lastnm


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: egen first/lastnm
Date   Sun, 24 Feb 2008 18:13:29 -0000

The issue Devra raises can be answered by looking at the code. Here I
focus on the -egen- add-on -first()-. The same issue arises with
-lastnm()-. You could see the code below from within Stata by typing

. ssc type _gfirst.ado 

What is wired into the code through the -marksample- statement is that
observations with missings on the variable supplied are marked as not
to be used, and so are segregated throughout. Thus, as Devra reports,
missings are mapped to missings.

*! 1.0.0 NJC 31 May 2000
program define _gfirst
        version 6.0
        gettoken type 0 : 0
        gettoken g 0 : 0
        gettoken eqs 0 : 0
        syntax varname [if] [in] [, BY(varlist) ]
        marksample touse, strok
        tempvar order
        gen long `order' = _n
        sort `touse' `by' `order'
        * ignore user-supplied `type'
        local type : type `varlist'
        qui by `touse' `by' : gen `type' `g' = `varlist'[1] if `touse'
end

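Below is a hack that would populate observations with appropriate
non-missings whenever they exist. The -novarlist- option stops
-marksample- from excluding missings on the variable supplied, and
sorting on an indicator for missing puts any non-missings first
within each group.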

*! 1.1.0 NJC Statalist 24 Feb 2008
* _gfirst 1.0.0 NJC 31 May 2000
program define _gfirst2
        version 6.0
        gettoken type 0 : 0
        gettoken g 0 : 0
        gettoken eqs 0 : 0
        syntax varname [if] [in] [, BY(varlist) ]
        marksample touse, strok novarlist
        tempvar order missing
        gen long `order' = _n
        gen byte `missing' = missing(`varlist')
        sort `touse' `by' `missing' `order'
        * ignore user-supplied `type'
        local type : type `varlist'
        qui by `touse' `by' : gen `type' `g' = `varlist'[1] if `touse'
end
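
A sketch of how this might be used, on the assumption that the program
has been saved as _gfirst2.ado somewhere along your adopath: -egen-
looks for a program _gfcn whenever it is asked for a function fcn(),
so the hack would be called as -first2()-. The variable name y2 is
made up for illustration. With Devra's data this should give 10
throughout id 1, 12 throughout id 2 and 10 throughout id 3.

* y2 is a hypothetical name; first2() exists only once
* _gfirst2.ado is installed
. egen y2 = first2(x), by(id)
. list n id x y2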

Alternatively, surgery after Devra's example would be 

bysort id (y) : replace y = y[1] 
bysort id (z) : replace z = z[1] 
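
Note that each -bysort- changes the sort order of the data. A minimal
sketch of getting the first non-missing value directly from x, without
any -egen- function, and then restoring the original order is below.
It assumes, as in Devra's example, that n records the original order;
miss and y2 are names made up for illustration.

* miss and y2 are hypothetical names
gen byte miss = missing(x)
* within id, non-missings come first, in the order given by n
bysort id (miss n) : gen y2 = x[1]
drop miss
* put the data back in the original order
sort n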

Nick
n.j.cox@durham.ac.uk 

Devra Golbe

-egen newvar = first(varname)-

(from the egenmore functions) produces missing values when I did not
expect that behavior. newvar is missing for observations in which
varname is missing. The same is true for -egen newvar =
lastnm(varname)-. Is that the behavior I should have expected? In
contrast, -egen newvar = mean(varname)- populates newvar even if
varname is missing. See the example below my signature.

. input n id x

             n         id          x
  1. 1 1 10
  2. 2 1 9
  3. 3 1 11
  4. 4 2 12
  5. 5 2 .
  6. 6 2 11
  7. 7 3 .
  8. 8 3 .
  9. 9 3 10
 10. end

. egen y = first(x), by(id)
(3 missing values generated)

. egen z = lastnm(x), by(id)
(3 missing values generated)

. egen m = mean(x), by(id)

. list

     +------------------------------+
     | n   id    x    y    z      m |
     |------------------------------|
  1. | 1    1   10   10   11     10 |
  2. | 2    1    9   10   11     10 |
  3. | 3    1   11   10   11     10 |
  4. | 4    2   12   12   11   11.5 |
  5. | 5    2    .    .    .   11.5 |
     |------------------------------|
  6. | 6    2   11   12   11   11.5 |
  7. | 7    3    .    .    .     10 |
  8. | 8    3    .    .    .     10 |
  9. | 9    3   10   10   10     10 |
     +------------------------------+





