# st: AW: Two part model - How to combine models?

 From "Baumeister Sebastian" To Subject st: AW: Two part model - How to combine models? Date Wed, 22 Aug 2007 20:30:17 +0200

```
You can combine the estimates to calculate the difference between two groups using a bootstrapping technique, as illustrated in Afifi, Ettner et al., Annu Rev Public Health 2007;28:95-111.

The steps involved are:

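global indv male nonwite age income educyrs ins1 ins2
global pre income
* Note: $depv (the outcome variable, e.g. outpatient costs) is used below but is not
* defined in this post; set it first with "global depv" followed by your outcome variable
gen orig=$pre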

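program define tpm

* Drop variables left over from a previous call so the program can be rerun (e.g. inside bootstrap)
foreach v in p sdinc meaninc ave high p0 p1 rr xb xb0 xb1 cdiff pred pred0 pred1 udiff {
    capture drop `v'
}
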
* Run logit for any problem (e.g., physician visit)
logit any $indv

* Calculate predicted probability at original regressor values
predict p, pr

egen sdinc=sd(income)
egen meaninc=mean(income)
gen ave=meaninc
gen high=meaninc+sdinc

* Calculate the predicted probability when the continuous predictor (income)
* is set to its mean (or a binary predictor, e.g. gender, to 0)

replace $pre=ave
predict p0, pr

* Calculate the predicted probability when the continuous predictor (income)
* is set to the mean plus one SD

replace $pre=high
predict p1, pr

* Calculate the relative risk associated with an increase in income from the mean
* to the mean + 1 SD

gen rr = p1/p0

* Reset income back to original value
replace $pre = orig
* Conditional linear regression of outpatient costs among those with
* any visits
reg $depv $indv if any==1

* You will also have to consider log-transformation and smearing here
* (see Afifi, Ettner et al., Annu Rev Public Health 2007;28:95-111, or Manning, J Health Econ 17:283-95)

* Calculate conditional expectation at original regressor values
predict xb, xb
* Calculate conditional expectation when income is set to the mean
replace $pre=ave
predict xb0, xb
* Calculate conditional expectation when income is set to the mean
* plus one standard deviation
replace $pre=high
predict xb1, xb

* Calculate the difference in the conditional expectations (income set to mean vs. mean + SD)
gen cdiff = xb1-xb0

* Reset income back to original value
replace $pre = orig

* Calculate unconditional expectation at original regressor values
gen pred = p*xb
* Calculate unconditional expectation when income is set to the mean
gen pred0 = p0*xb0
* Calculate unconditional expectation when income is set to the mean + 1 SD
gen pred1 = p1*xb1
* Compare means of actual visits and unconditional predicted visits at original regressor values
sum $depv pred
* Calculate difference in unconditional expectation for income at mean vs. mean + 1 SD
gen udiff = pred1 - pred0
* Do unconditional regression to reset sample to full sample for bootstrapping
quietly reg $depv $indv
end

* Call the program and make sure it works
tpm
* Look at the key results
su rr cdiff udiff

* CREATE PROGRAM TO CALL TPM AND DO THE BOOTSTRAPPING
program tpmboot, rclass

* Call the program to run the two-part model and get the point estimates of interest

tpm
tempname y1
sum rr, meanonly
scalar `y1' = r(mean)
return scalar rrboot=`y1'
tempname y2
sum cdiff, meanonly
scalar `y2' = r(mean)
return scalar cdiffboot=`y2'
tempname y3
sum udiff, meanonly
scalar `y3' = r(mean)
return scalar udiffboot=`y3'
end
use newdatasetname, clear

* Set the seed first so that you can replicate the results if you rerun it later
set seed 8

* Call the bootstrap program, specifying the number of repetitions
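* In a real study, you should use >=1000 repetitions if you want empirical confidence intervals
* Specify a small number of repetitions first to test the program
* (From Stata 9 on, the same call is written with the prefix syntax:
*  bootstrap r1=r(rrboot) r2=r(cdiffboot) r3=r(udiffboot), reps(10): tpmboot)
bootstrap "tpmboot" r1=r(rrboot) r2=r(cdiffboot) r3=r(udiffboot), reps(10)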

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Tamara Pejovic
Sent: Wednesday, 22 August 2007 15:40
To: statalist@hsphsun2.harvard.edu
Subject: st: Two part model - How to combine models?

Hi,

I have a quick question. I have a response variable that is positively
skewed and contains a substantial proportion of zeroes. Since a common
method for analyzing this type of data is a two-part model, I have split
the analysis into three stages:
The first involved creating two sets of data from the original: one
showing whether or not the problem is present and the other indicating
the "level of the problem" when the problem is present.
The second stage involved modelling occurrence of problem, using
logistic regression, and separately modeling the level data using
ordinary regression.

Finally, the third stage should be combining the two models in order to
estimate the expected "level of the problem" for a specific set of
values of possible predictors.

My question is how to do this? Is it enough to just multiply the
probabilities using the conditional probability rule? Does Stata have a
module for solving two-part models?

Thanks,
Tash

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
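
As a hedged summary of what the posted code computes (the notation below is an assumption for illustration, not taken from the original post): the two parts are combined by multiplying the predicted probability of a positive outcome by the conditional mean among the positives, which is what `gen pred = p*xb` does. If the second part is instead fitted on the log scale, as the comment about smearing suggests, one common retransformation is Duan's smearing factor, which is valid only if the log-scale errors are homoskedastic.

```latex
% Unconditional expectation of a two-part model
E[y \mid x] = \Pr(y > 0 \mid x)\, E[y \mid y > 0, x]

% Part 1: logit; Part 2: OLS on the raw scale among positives (as in the posted code)
\Pr(y > 0 \mid x) = \frac{1}{1 + \exp(-x\alpha)}, \qquad E[y \mid y > 0, x] = x\beta

% If Part 2 is OLS on \ln y instead (assumption: homoskedastic errors),
% Duan's smearing estimate of the conditional mean is
\hat{E}[y \mid y > 0, x] = \exp(x\hat\beta)\, \frac{1}{n_{+}} \sum_{i:\, y_i > 0} \exp(\hat\varepsilon_i)
```

Here $n_{+}$ is the number of observations with a positive outcome and $\hat\varepsilon_i$ are the log-scale residuals. With heteroskedastic errors the single smearing factor is not appropriate, which is the issue the Manning reference cited in the code comments addresses.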