Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: preserve-restore


From   Ulrich Kohler <[email protected]>
To   [email protected]
Subject   Re: st: preserve-restore
Date   Thu, 14 Oct 2010 10:09:13 +0200

On Thursday, 14 Oct 2010, at 08:16 +0200, Grethe Søndergaard wrote:
> Dear Statalist
> 
> For some reason, the layout of the e-mail I sent to you a couple of
> days ago was quite messy, since a lot of symbols had sneaked into it
> after sending it. Therefore I am sending it again:
> 
> I have a couple of questions about the preserve-restore procedure and stset.
> 
> My dataset:
> id father-id mother-id death var ...
> 1 1 10 0 1
> 2 1 10 1 1
> 3 1 20 1 1
> 4 1 20 0 1
> 5 2 10 1 1
> 6 2 10 0 1
> 7 3 30 0 1
> 8 3 30 1 1
> ...
> save " \Temp\hs.dta", replace
> 
> 
> I want to compare all maternal half siblings within a family as well
> as all paternal siblings within a family. In order to do this, I start
> out by creating an id-variable for full siblings, paternal half
> siblings or maternal half siblings and afterwards I run preserve-restore:
> 
> egen gruppe = group(father-id mother-id)
> egen mgr = group(mother-id)
> egen fgr = group(father-id)
> 
> 
> forvalues x = 1/8{
> preserve
> 
> *MATERNAL HALF SIBLINGS*
> gen strata_mother = `x' if ((mgr==mgr[`x']) & gruppe != gruppe[`x']) |_n==`x'
> 
> *PATERNAL HALF SIBLINGS*
> gen strata_father = `x' if ((fgr==fgr[`x']) & gruppe != gruppe[`x']) | _n==`x'
> 
> drop if strata_mother==. & strata_father==.
> 
> if `x' == 1 {
> save " \Temp\hs.dta", replace
> }
> else {
> append using " \Temp\hs.dta"
> save " \Temp\hs.dta", replace
> }
> restore
> }
> 
> I have the following questions:
> 1. Is there any way to make preserve-restore run faster? (My dataset
> contains more than 2 million observations, so it takes about two days
> to run.)
> 2. Is it problematic to create strata_father after creating
> strata_mother in the same preserve-restore statement?
> 3. I want to use strata_father and strata_mother as strata
> variables in a Cox regression analysis, and I want to perform the
> analyses separately for females and males. Since preserve-restore runs
> slowly, I want to specify this after having run it. However, it seems
> that restricting stset to, e.g., males only (if sex==M) does not work:
> a male who experiences an event and has no brothers but does have a
> half sister still counts as an event. Is there any way to tell stset
> that I only want to compare males, and to include an event only if the
> male who experiences it has one or more brothers?
> 
> I hope this is clear but since I am not an experienced user of
> Stata, please let me know if you need more details.


I propose putting -preserve- outside the loop and using -restore,
preserve- at the end of each iteration. This way, the file is only saved
once: -restore, preserve- restores the data while keeping them preserved.


preserve
forv x = 1/8 {
   ...
   restore, preserve
}
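
As a minimal sketch of that pattern, here is a toy version using the
auto dataset that ships with Stata (the dataset and the -keep- step are
only stand-ins for the real loop body, not part of the original
question):

```stata
* Toy illustration of -preserve- outside the loop plus -restore, preserve-
sysuse auto, clear
preserve
forvalues x = 1/3 {
    * destructive step: keep only one subset of the data
    keep if rep78 == `x'
    count
    * bring the full data back, but leave them preserved
    restore, preserve
}
* final restore: full data in memory, preserve cancelled
restore
count
```

The last -count- should report the full dataset again, since every
iteration starts from the preserved copy.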

In terms of speed it should be faster to save each of the "small" files
and do the append afterwards:


forv x = 1/8 {
  ...
  save f`x'
}

use f1
forv x = 2/8 {
   append using f`x'
}
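
Putting the two suggestions together, a sketch on the eight example
rows from the question might look as follows. The variable names
father_id and mother_id are assumed replacements for father-id and
mother-id (a hyphen is not allowed in a Stata variable name), and the
temporary files part1 to part8 are hypothetical names standing in for
f1 to f8:

```stata
* Example rows from the question (assumed variable names)
clear
input id father_id mother_id death
1 1 10 0
2 1 10 1
3 1 20 1
4 1 20 0
5 2 10 1
6 2 10 0
7 3 30 0
8 3 30 1
end

egen gruppe = group(father_id mother_id)
egen mgr = group(mother_id)
egen fgr = group(father_id)

preserve
forvalues x = 1/8 {
    * maternal and paternal half siblings of observation `x'
    gen strata_mother = `x' if (mgr == mgr[`x'] & gruppe != gruppe[`x']) | _n == `x'
    gen strata_father = `x' if (fgr == fgr[`x'] & gruppe != gruppe[`x']) | _n == `x'
    drop if missing(strata_mother) & missing(strata_father)

    * save this small piece to its own temporary file
    tempfile part`x'
    save "`part`x''"

    * full data back in memory, still preserved
    restore, preserve
}
* cancel the preserve so the appended result is not overwritten later
restore, not

* stack the small files once, after the loop
use "`part1'", clear
forvalues x = 2/8 {
    append using "`part`x''"
}
```

With one temporary file per iteration, each piece is written once and
read once, instead of the growing combined file being rewritten in
every pass. For a loop over millions of observations the number of
temporary files may itself become a concern, so this is only a starting
point.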
