Statalist



st: AW: Two part model - How to combine models?


From   "Baumeister Sebastian" <Baumeister@ift.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: AW: Two part model - How to combine models?
Date   Wed, 22 Aug 2007 20:30:17 +0200

You can combine the estimates to calculate the difference between two groups using a bootstrapping technique, as illustrated in Afifi, Ettner et al., Annu Rev Public Health. 2007;28:95-111.
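
In terms of the original question, the combination is the product of the two parts. As a sketch, in notation that is assumed here rather than taken from the paper (y for the outcome, x for the covariates, x_0 for income set to its mean and x_1 for income set to the mean plus one SD, other covariates at their observed values), the unconditional expectation and the three quantities summarised in the code that follows are:

```latex
\begin{align*}
E[y \mid x] &= \Pr(y > 0 \mid x)\; E[y \mid y > 0,\, x] \\
\text{rr} &= \frac{\Pr(y > 0 \mid x_1)}{\Pr(y > 0 \mid x_0)} \\
\text{cdiff} &= E[y \mid y > 0,\, x_1] - E[y \mid y > 0,\, x_0] \\
\text{udiff} &= E[y \mid x_1] - E[y \mid x_0]
\end{align*}
```

The first factor comes from the logit (p in the code) and the second from the regression on the positive observations (xb in the code).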

The steps involved are:

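global indv male nonwite age income educyrs ins1 ins2
global pre income
* $depv (the outcome, e.g. outpatient costs) is used below and must also be
* defined as a global; its variable name is not given in this post
* Keep a copy of the predictor so it can be restored after each prediction
gen orig=$pre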

program define tpm 

	* Run logit for any problem (e.g., physician visit)
	logit any $indv

	* Calculate predicted probability at original regressor values
	predict p, p

	* Mean and standard deviation of the continuous predictor (income)
	egen sdinc=sd(income)
	egen meaninc=mean(income)
	gen ave=meaninc
	gen high=meaninc+sdinc

	* Calculate the predicted probability when the continuous predictor (income)
	* is set to its mean (or a binary predictor, e.g. gender, to 0)

	replace $pre=ave
	predict p0, p

	* Calculate the predicted probability when the continuous predictor
	* (income) is set to the mean plus one SD

	replace $pre=high
	predict p1, p

	* Calculate the relative risk associated with an increase in income from
	* the mean to the mean + 1 SD

	gen rr = p1/p0

	* Reset income back to original value
	replace $pre = orig

	* Conditional linear regression of outpatient costs among those with
	* any visits
	reg $depv $indv if any==1

	* You will also have to consider log-transformation and smearing here
	* (see Afifi, Ettner et al., Annu Rev Public Health. 2007;28:95-111 or
	* Manning, J Health Econ; 17: 283-95)

	* Calculate conditional expectation at original regressor values
	predict xb, xb
	* Calculate conditional expectation when income is set to the mean
	replace $pre=ave
	predict xb0, xb
	* Calculate conditional expectation when income is set to the mean
	* plus one standard deviation
	replace $pre=high
	predict xb1, xb

* Calculate the difference in the conditional expectations (income set to mean vs. mean + SD)
gen cdiff = xb1-xb0

* Reset income back to original value
replace $pre = orig


* Calculate unconditional expectation at original regressor values
gen pred = p*xb
* Calculate unconditional expectation when income is set to the mean
gen pred0 = p0*xb0
* Calculate unconditional expectation when income is set to the mean + 1 SD
gen pred1 = p1*xb1
* Compare means of actual visits and unconditional predicted visits at original regressor values
sum $depv pred
* Calculate difference in unconditional expectation for income at mean vs. mean + 1 SD
gen udiff = pred1 - pred0
* Do unconditional regression to reset sample to full sample for bootstrapping
quietly reg $depv $indv
end


* Call the program and make sure it works
tpm
* Look at the key results
su rr cdiff udiff

* CREATE PROGRAM TO CALL TPM AND DO THE BOOTSTRAPPING
program tpmboot, rclass

* Call the program to run the two-part model and get the point estimates of interest

tpm
tempname y1
sum rr, meanonly
scalar `y1' = r(mean)
return scalar rrboot=`y1'
tempname y2
sum cdiff, meanonly
scalar `y2' = r(mean)
return scalar cdiffboot=`y2'
tempname y3
sum udiff, meanonly
scalar `y3' = r(mean)
return scalar udiffboot=`y3'
end
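
* Reload the data before bootstrapping. If newdatasetname does not contain
* the variable orig, recreate it after loading with: gen orig=$pre
use newdatasetname, clear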

* Set the seed first so that you can replicate the results if you rerun it later
set seed 8

* Call the bootstrap program, specifying the number of repetitions
* In a real study, you should use at least 1000 repetitions if you want empirical confidence intervals
* Specify a small number of repetitions first to test the program
bootstrap "tpmboot" r1=r(rrboot) r2=r(cdiffboot) r3=r(udiffboot), reps(10)
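
On the log-transformation and smearing mentioned in the program comments: a sketch of how the second part would change, assuming the regression is fit on ln(y) among the positive observations and that the errors are homoskedastic (if they are not, a single smearing factor may be inadequate). The notation (S for the smearing factor, n_+ for the number of positive observations) is assumed here, not taken from the post.

```latex
\begin{align*}
\ln y &= x\beta + \varepsilon \quad \text{for } y > 0 \\
\hat{S} &= \frac{1}{n_{+}} \sum_{i:\, y_i > 0} \exp(\hat{\varepsilon}_i) \\
\hat{E}[y \mid y > 0,\, x] &= \exp(x\hat{\beta})\, \hat{S} \\
\hat{E}[y \mid x] &= \hat{p}(x)\, \exp(x\hat{\beta})\, \hat{S}
\end{align*}
```

Under these assumptions, xb, xb0 and xb1 in the program would be replaced by their retransformed counterparts before forming cdiff, pred and udiff.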



-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Tamara Pejovic
Sent: Wednesday, 22 August 2007 15:40
To: statalist@hsphsun2.harvard.edu
Subject: st: Two part model - How to combine models?

Hi,

I have a quick question. I have a response variable that is positively
skewed and contains a substantial proportion of zeroes. Since a common
method for analyzing this type of data is a two-part model, I have split
the analysis into three stages:
The first involved creating two sets of data from the original: one 
showing whether or not the problem is present and the other indicating 
the "level of the problem" when problem is present.
The second stage involved modelling occurrence of problem, using 
logistic regression, and separately modeling the level data using 
ordinary regression.

Finally, the third stage should be combining the two models in order to 
estimate the expected "level of the problem" for a specific set of 
values of possible predictors.

My question is how to do this. Is it enough to multiply the
probabilities using the conditional probability rule? Does Stata have a
module for fitting two-part models?

Thanks,
Tash



*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



