
RE: st: RE: Creating missing data


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: Creating missing data
Date   Thu, 4 Nov 2004 08:54:56 -0000

I can't reproduce this. The key is, or should be, 
the condition 

_n == 1 

to restrict changes to the first record. 
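
A minimal sketch of what that condition does under -by:-, using a tiny made-up dataset (the variables x, x_all and x_first and the choice of id 2 are purely illustrative, not from the original question). Within each by-group, _n counts records in the sort order given in parentheses, so _n == 1 picks out only the earliest record of each case:

```stata
clear
input id time x
1 1 10
1 2 11
2 1 20
2 2 21
2 3 22
end

* Without _n == 1: every record of case 2 is set to missing
gen x_all = x
bysort id (time) : replace x_all = . if id == 2

* With _n == 1: only the first record (lowest time) of case 2 is set to missing
gen x_first = x
bysort id (time) : replace x_first = . if _n == 1 & id == 2

list, sepby(id)
```

If every record of a case ends up missing, it is worth checking that the _n == 1 condition and the -by:- prefix are on the same command.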

I did this to get a toy example: 

set obs 100
egen id = seq() , to(10) 
egen time = seq(), block(10) 
drop if inlist(id,3,5) & time > 1 
tab id
bysort id (time) : gen multiple = _N > 1
egen ID = group(id) if multiple 
tab ID 
gen somevar = uniform() 
bysort ID (time) : replace somevar = . if _n == 1 & mod(ID,5) == 0 
l 

and this works for me. 
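
One way to confirm that, sketched under the assumption that the toy data are built exactly as above: rerun the example and let -assert- check that missing values occur only in the first record of every fifth multiple-record case. The helper variables isfirst and nmiss are hypothetical names introduced for the check.

```stata
clear
set obs 100
egen id = seq(), to(10)
egen time = seq(), block(10)
* make cases 3 and 5 single-record
drop if inlist(id,3,5) & time > 1
bysort id (time) : gen multiple = _N > 1
* number the multiple-record cases 1, 2, ...; single-record cases get missing ID
egen ID = group(id) if multiple
* uniform() as in the post; runiform() is the name in newer Stata versions
gen somevar = uniform()
bysort ID (time) : replace somevar = . if _n == 1 & mod(ID,5) == 0

* check 1: any missing value sits in a first record of a case with ID divisible by 5
bysort ID (time) : gen byte isfirst = _n == 1
assert isfirst & mod(ID,5) == 0 if missing(somevar)

* check 2: each such case has exactly one missing value, all other cases none
bysort ID : egen nmiss = total(missing(somevar))
assert nmiss == 1 if mod(ID,5) == 0
assert nmiss == 0 if mod(ID,5) != 0
```

-assert- is silent when the condition holds and stops with an error otherwise, so a clean run means only first records were changed.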

Nick 
n.j.cox@durham.ac.uk 

celdjt@umich.edu
> 
> Thanks for the help on this but the procedure does not quite do it.
> Or, more precisely, it overdoes it. That is, instead of replacing with
> a missing value only the entry time for the first record for every
> fifth multiple-record case, it replaces all entry times with missing
> values for all records in every fifth multiple-record case. I am
> aiming to replace only the first entry time value in the first record
> for every fifth multiple-record case.
> 
> --On Thursday, November 04, 2004 1:45 AM +0000 Nick Cox 
> <n.j.cox@durham.ac.uk> wrote:
> 
> > Correct. I guess the -egen- should just select those cases.
> >
> > . bysort whateverisyourid : gen multiple = _N > 1
> > . egen id = group(whateverisyourid) if multiple
> > . bysort id (entrytime): replace somevar = . if _n == 1 & mod(id,5) == 0
> >
> > Keith Dear
> >
> >> But celdjt said he wanted to do this for every fifth
> >> *multiple-record* case. He could first drop the single-record
> >> cases, apply Nick's code, then append them back. Or is there a
> >> neater way?
> >> kd
> >>
> >> > Suppose your identifiers run 1, 2, ... . If this isn't true,
> >> >
> >> > . egen id = group(whateverisyourid)
> >> >
> >> > will make it so.
> >> >
> >> > Now, if I understand you correctly, you want to
> >> >
> >> > . bysort id (entrytime): replace somevar = . if _n == 1 & mod(id,5) == 0
> >> > celdjt@umich.edu
> >> >
> >> > > My question is about making an adjustment to the surrogate data
> >> > > set. The surrogate data set contains about 3500 cases, some of
> >> > > which have just one record per case, and others have multiple
> >> > > records. Each record has a variable recording an entry time and
> >> > > an exit time for that record. Hence, single-record cases have one
> >> > > entry time and one exit time; multiple-record cases have multiple
> >> > > entry and multiple exit times. I would like to convert the first
> >> > > entry time for every fifth multiple-record case in this data set
> >> > > into a missing value. Are there any suggestions for how this
> >> > > might be done?
