


st: preserve-restore and stset

From   Grethe Søndergaard
Subject   st: preserve-restore and stset
Date   Mon, 11 Oct 2010 13:45:12 +0200

Dear Statalist

I have a couple of questions about the preserve-restore procedure and stset.

My dataset (variable names are written with underscores here, since
Stata does not allow hyphens in variable names):

id  father_id  mother_id  death  var  ...
 1      1         10        0     1
 2      1         10        1     1
 3      1         20        1     1
 4      1         20        0     1
 5      2         10        1     1
 6      2         10        0     1
 7      3         30        0     1
 8      3         30        1     1

save "\Temp\hs.dta", replace

I want to compare all maternal half siblings within a family, as well
as all paternal half siblings within a family. To do this, I start
out by creating id variables for full siblings, maternal half
siblings and paternal half siblings, and afterwards I run preserve-restore:

   * full siblings
egen gruppe = group(father_id mother_id)

   * maternal half siblings
egen mgr = group(mother_id)

   * paternal half siblings
egen fgr = group(father_id)

forvalues x = 1/8 {
    preserve

    gen strata_mother = `x' if ((mgr==mgr[`x']) & gruppe != gruppe[`x']) | _n==`x'
    * strata in which no events occur
    by strata_mother, sort: egen n_dead_mother = total(death)
    replace strata_mother=. if n_dead_mother==0
    * strata with only one person
    by strata_mother, sort: egen n_var_mother = total(var)
    replace strata_mother=. if n_var_mother==1

    gen strata_father = `x' if ((fgr==fgr[`x']) & gruppe != gruppe[`x']) | _n==`x'
    * strata in which no events occur
    by strata_father, sort: egen n_dead_father = total(death)
    replace strata_father=. if n_dead_father==0
    * strata with only one person
    by strata_father, sort: egen n_var_father = total(var)
    replace strata_father=. if n_var_father==1

    drop if strata_mother==. & strata_father==.

    if `x' == 1 {
        save "\Temp\hs.dta", replace
    }
    else {
        append using "\Temp\hs.dta"
        save "\Temp\hs.dta", replace
    }

    restore
}

I have the following questions:
1. Is there any way to make preserve-restore run faster? My dataset
contains more than 2 million observations, so the loop takes about
two days to run.
2. I am worried that creating “strata_father” after creating
“strata_mother” is problematic. Is it okay to do both within the same
preserve-restore block?
3. I want to use “strata_father” and “strata_mother” as strata
variables in a Cox regression analysis, and I want to perform the
analyses separately for females and males. Since preserve-restore runs
slowly, I want to specify the sex restriction after having run it.
However, it does not seem to work to state in stset that I only want
to include e.g. males (if sex==M). As far as I can see, a male who
experiences an event and has no brothers, only a half sister, still
counts as an event. Is there any way to state in stset that I only
want to compare males, and that I only want to include an event if the
male who experiences it has one or more brothers?

I hope this is clear – but since I am not an experienced user of
Stata, please let me know if you need more details.

Thank you

