Statalist



st: RE: simulated data for logistic regression... remedial algebra help?


From   "Daniel Waxman" <dan@amplecat.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: simulated data for logistic regression... remedial algebra help?
Date   Sun, 18 Nov 2007 23:56:34 -0500

Not sure if the lack of response reflects a lack of clarity in stating the
problem, or just that the solution isn't obvious to anyone.  

In case it is the former, I'll restate the problem:

The goal is to create a simulated data set.  To do so, I would like to
determine the probabilities of an outcome (death) given a positive or
negative test result, when the overall mortality rate, the odds ratio for
mortality associated with that test, and the proportion of positive test
results in the population are all known.

The problem reduces (I believe) to solving:

p1/(1-p1) = 2*(p2/(1-p2))
p1 = k1 - k2*p2

where:

p1 = mortality rate for a positive test
p2 = mortality rate for a negative test

k1 = constant = (overall mortality rate)/(proportion of population with a
positive test)
k2 = constant = (proportion negative test)/(proportion positive test)

Does this problem look familiar to anyone? 

If my poor math skills are not failing me, I believe that I end up with a
p1^3 term.  Does this sound right?  Would it mean that there is no exact
solution?
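
For what it is worth, a sketch of the algebra suggests the system is quadratic rather than cubic, so a closed-form solution should exist. The symbols below are introduced only for this illustration and are not from the original post: q is the proportion with a positive test, m the overall mortality rate, \theta the odds ratio, and x the odds of death given a negative test (so p2 = x/(1+x) and p1 = \theta x/(1+\theta x)).

```latex
\begin{align*}
m &= q\,\frac{\theta x}{1+\theta x} + (1-q)\,\frac{x}{1+x} \\
m\,(1+\theta x)(1+x) &= q\,\theta x\,(1+x) + (1-q)\,x\,(1+\theta x) \\
0 &= \theta(1-m)\,x^2 + \bigl[\,q\theta + (1-q) - m(1+\theta)\,\bigr]\,x - m \\
x &= \frac{-b + \sqrt{b^2 + 4\,\theta\,m\,(1-m)}}{2\,\theta\,(1-m)},
\qquad b = q\theta + (1-q) - m(1+\theta)
\end{align*}
```

Because the leading coefficient is positive and the constant term is negative (for 0 < m < 1), the two roots have opposite signs, so only the root shown is a valid odds. Under these assumptions the logistic intercept would be ln(x).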

Any other suggestions for creating simulated data with these properties?

Dan


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Daniel Waxman
Sent: Saturday, November 17, 2007 6:44 PM
To: statalist@hsphsun2.harvard.edu
Subject: st: simulated data for logistic regression... remedial algebra
help? 

I am trying to create a series of simulated data sets for use in logistic
regression with the following properties:

Mortality (outcome) remains constant.   There is a single dichotomous
independent variable whose odds ratio (coefficient) and proportion of
positives can vary between the sets.  It all comes down to solving for the
intercept (`b0'), given the following relationships:

probability_negative=invlogit(`b0')
probability_positive=invlogit(log(`odds')+`b0')
`proportion_positive'*probability_positive+(1-`proportion_positive')*probability_negative=`mortality'

Sad to admit, but I am bumping up against the limitations of my algebra
skills.   
I'd imagine this is trivial for many of you...


i.e.:

************** 

clear
set obs 1000
local odds=2
local proportion_positive= .10
local mortality = .05

gen test=uniform()<`proportion_positive'

/*

************solve for `b0' here************

*/

gen probability_negative=invlogit(`b0')
gen probability_positive=invlogit(log(`odds')+`b0')

gen died=uniform() < cond(test==0,probability_negative,probability_positive)

logistic died test

************************
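
One possible way to fill in the "solve for `b0'" step is sketched below. It is not a confirmed answer from the list. It assumes that writing x = exp(b0) for the odds of death given a negative test turns the mortality constraint into a quadratic in x. The locals a, b, c and x, the seed, and the larger sample size are illustrative choices, and runiform() is used as the current name for the older uniform().

```stata
clear
set seed 12345                  // arbitrary seed, for reproducibility only
set obs 100000                  // large n so realized rates sit near the targets
local odds = 2
local proportion_positive = .10
local mortality = .05

* Let x = exp(b0). Substituting p_neg = x/(1+x) and
* p_pos = odds*x/(1+odds*x) into the mortality constraint and clearing
* denominators gives a*x^2 + b*x + c = 0 with these coefficients:
local a = `odds'*(1 - `mortality')
local b = `proportion_positive'*`odds' + (1 - `proportion_positive') - `mortality'*(1 + `odds')
local c = -`mortality'

* a > 0 and c < 0, so exactly one root is positive; take that root
local x = (-(`b') + sqrt((`b')^2 - 4*(`a')*(`c')))/(2*(`a'))
local b0 = ln(`x')

gen test = runiform() < `proportion_positive'
gen probability_negative = invlogit(`b0')
gen probability_positive = invlogit(log(`odds') + `b0')
gen died = runiform() < cond(test==0, probability_negative, probability_positive)

* check: the implied overall mortality should reproduce the target
display `proportion_positive'*invlogit(log(`odds') + `b0') + (1 - `proportion_positive')*invlogit(`b0')
summarize died
logistic died test
```

Working the example values by hand (odds ratio 2, 10% positive, 5% mortality) gives an intercept of roughly -3.04, with mortality near 4.6% for a negative test and 8.8% for a positive test. The simulated odds ratio from logistic will vary around 2 from sample to sample.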

Thanks.

Dan

 
