# Re: st: -stset- with stock-sampling, frailty and TVC

 From Antoine Terracol To statalist@hsphsun2.harvard.edu Subject Re: st: -stset- with stock-sampling, frailty and TVC Date Tue, 07 Jul 2009 16:29:54 +0200

Thanks Bobby,

I must confess that I asked the question because I have been running some exploratory Monte-Carlo simulations, and wanted to be sure my -stset- command was the correct one...

For anyone interrested, the code below defines a d0 evaluator (admitedly not very efficient, but it (seems to) work...) for a weibull model with gamma frailty, stock sampling and time varying covariates. The DGP for the simulations is of a heterogenous weibull with at most one change in X, possibly before "stock" sampling (I did not consider censoring in the DGP). I then estimate the "official" -streg- command and my evaluator, and compare the results.

The parameters of the DGP are:
- p=0.9
- theta=1
- _cons=1
- _b[x]=1

the results of 50 replications are shown below (the first panel is from the "official" command, the second from my evaluator). As you can see, the bias can be quite strong, at least in the case i'm considering.

Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
xo |        50    .9024034    .1302959   .6543105   1.243127
conso |        50     .663616    .2160223   .3573504   1.648682
thetao |        50    .3738764    .2111808   .0428884   1.226914
po |        50     .776937    .1916066   .3290025   1.571802
-------------+--------------------------------------------------------
x2 |        50    .9975985    .0979764   .7751212   1.232066
cons2 |        50    1.027832    .2697684   .6704028   2.282072
theta2 |        50    .9958155    .2684908   .5904011   2.047184
p2 |        50     .909505    .1374526    .678086    1.47397

Antoine

______________________________________________________

cap prog drop mwhet2
program define mwhet2
args todo b lnf
tempvar xb lntheta lnp
mleval `xb'=`b', eq(1)
mleval `lntheta'=`b', eq(2) scalar
mleval `lnp'=`b', eq(3) scalar
tempvar theta p
qui gen double `theta'=exp(`lntheta')
qui gen double `p'=exp(`lnp')
tempvar haz  surv  surve inthaz vinthaz inthaze vinthaze last
tempvar xbp
qui bysort id (seq) : gen double `xbp'=`xb'[_n+1]
qui gen double `inthaz'=(\$ML_y1^`p')*(exp(`xb')-exp(`xbp'))
qui bysort id (seq) : replace `inthaz'=(\$ML_y1^`p')*(exp(`xb')) if _n==_N
qui bysort id : egen double  `vinthaz'=total(`inthaz')
qui gen double `inthaze'=(\$ML_y3^`p')*(exp(`xb')-exp(`xbp'))
qui bysort id (seq) : replace `inthaze'=(\$ML_y3^`p')*(exp(`xb')) if \$ML_y3==\$ML_y3[_n+1]
qui  replace `inthaze'=0 if \$ML_y3<\$ML_y1
qui bysort id : egen double `vinthaze'=total(`inthaze')
qui gen double `haz'=(exp(`xb')*`p'*\$ML_y1^(`p'-1))/(1+`theta'*`vinthaz')
qui bysort id (seq) : g `last'=_n==_N
qui gen double `surv'=(1+`theta'*`vinthaz')^(-1/`theta')
qui gen double `surve'=(1+`theta'*`vinthaze')^(-1/`theta')
tempvar l
qui gen double `l' = (\$ML_y2*ln(`haz')+ln(`surv')-ln(`surve'))*`last'
mlsum `lnf'=`l'
end

cap prog drop mcweib
program define mcweib, rclass
drop _all
set obs 500
g id=_n
g m=ln(rgamma(1,1))

g tbar=runiform()*2
drawnorm x
g u=runiform()
gen t=(-ln(1-u)/exp(1+x+m))^(1/0.9)
g expand=1+(t>tbar)
expand expand
drop expand
bysort id : g seq=_n
bysort id (seq) : replace x=x[_n-1]+2 if _n==2
bysort id (seq) : g double time=tbar if _n==1 & _N==2
bysort id (seq) : replace time=t if _N==1
bysort id (seq) : g double tau=exp(-(exp(1+x[1]+m)-exp(1+x[2]+m))*(tbar^0.9))
replace time=(-ln((1-u)/tau)/exp(1+x+m))^(1/0.9) if time==.
bysort id (time) : g fail=_n==_N

/*stock sampling*/
su time if fail, meanonly
drop fail
g tstock=0.25*runiform()*r(mean)
bysort id (seq) : replace tstock=tstock[1]
bysort id (seq) : drop if tstock>time[_N]
bysort id (seq) : g expand=tstock<time & _n==1
bysort id (seq) : replace expand=tstock<time  & expand[_n-1]==0 if _n==2
replace expand=expand+1
expand expand
bysort id seq : replace time=tstock if _n==1 & _N==2
bysort id (time) : g fail=_n==_N
g avant=time<=tstock
drop seq
bysort id (time) : g seq=_n

stset time, id(id) f(fail) enter(tstock)
streg x, dist(weib) fr(gamma)
return scalar betaxo=[_t][x]
return scalar betaconso=[_t][_cons]
return scalar thetao=exp([ln_the][_cons])
return scalar po=exp([ln_p][_cons])

replace tstock=time if tstock>time

ml model d0 mwhet2 (time fail tstock = x ) (lntheta:) (lnp:)
ml max , diff
return scalar betax2=[eq1][x]
return scalar betacons2=[eq1][_cons]
return scalar theta2=exp([lntheta][_cons])
return scalar p2=exp([lnp][_cons])
end

simulate xo=r(betaxo) conso=r(betaconso) thetao=r(thetao) po=r(po) ///
x2=r(betax2) cons2=r(betacons2) theta2=r(theta2) p2=r(p2) ///
, reps(50) : mcweib

su, sep(4)

____________________________________________________________________

Roberto G. Gutierrez, StataCorp wrote:

Suppose I have a stock-sample of duration data where I happen to know the
values of the covariates before the first observation of the individuals
(because of a retrospective questionnaire, for example).

A typical individual might look like:

id   x    time  fail  tstock
1    1    1.5    0     2
1    2    2.5    0     2
1    3    4      1     2

I am interested in estimating a parametric model with frailty.

What would be the correct way to -stset- my data so that Stata uses the path
of the covariates before the sampling date?

if I specify

. stset time, id(id) fail(fail) enter(tstock)

Stata will discard the first row of the individual in the above example (_st
will be 0).

In a model without frailty, this would not matter since the sub-survival
functions between 0 and enter() disappear from the likelihood.

With frailty, however, this is not true since we deal with the ratio of
unconditional survival functions (integrated w.r.t the distribution of the
frailty). Using the above -stset- command, we in fact assume that the
covariate is constant from time 0 to the first observed value with
date>=enter()

Unfortunately, there is no current -stset- syntax that will incorporate
covariate patterns at times when subjects are not considered "under
observation" into the analysis.  However, there is merit to the idea of being
able to incorporate such retrospecitive information into the analysis.  As
such, it is something we'll look into.

--Bobby
rgutierrez@stata.com
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

--
Ce message a ete verifie par MailScanner
pour des virus ou des polluriels et rien de
suspect n'a ete trouve.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/