
From: moleps islon <moleps2@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Gene-incidence question/simulation
Date: Mon, 23 Mar 2009 11:15:41 +0100

Thanks for the statistical input; I truly appreciate it. However, what I've done instead to get an estimate is to run a simulation in which I select g random patients in my sample, "give them" the mutation, and then do the usual calculations. From the regression line I can then get a (probably statistically dubious) estimate of the cancer incidence among those with the mutation in the sample, given that the people in my sample without the mutation have an incidence of 6/100000 of also developing cancer.

The simulation runs like this, in case you are curious (just generating the observation time here):

* build a sample of 217 patients with random follow-up times and 11 cancers
set obs 217
gen id=_n
gen time=54*runiform()
gen p=runiform()
sort p
gen cancer=id<12
stset time, id(id) f(cancer==1)

* -ins- must be defined before -simulate- calls it
capture program drop ins
program ins
    * drop everything created by the previous replication
    capture drop l
    capture drop mutation
    capture drop y
    capture drop o
    capture drop v
    * assign the mutation to a random number of randomly chosen patients
    local b=runiform()*100
    gen l=runiform()
    sort l
    gen mutation=_n<`b'
    count if mutation==1
    gen v=r(N)
    * incidence rates per 100000 with and without the mutation
    stptime if mutation==1, per(100000)
    gen y=r(rate)
    stptime if mutation==0, per(100000)
    gen o=r(rate)
end

simulate ratemutpos=y ratemutneg=o mutation=v, reps(100000): ins
scatter ratemutpos ratemutneg || lfitci ratemutpos ratemutneg
tab mutation if ratemutneg<7

On Sun, Mar 22, 2009 at 5:02 PM, Austin Nichols <austinnichols@gmail.com> wrote:
> moleps islon <moleps2@gmail.com> :
> Just to be clear: B causes Z and B causes A, but you don't observe B,
> right? Let's ignore the survival model you are no doubt estimating,
> and suppose you have gotten an estimate of P(Z|A)=.05 with a SE near
> zero (a confidence interval of width zero). Now you want to estimate
> P(Z|B) and P(A|B), and you think P(Z|B) is near .65 and
> P(Z|~B)=6/100000 (I assume "background incidence" is the probability
> of Z given not B here; that may reflect my "background ignorance").
> You will need much more information to make any progress!
>
> Let p=P(B) in the population, y=P(Z|B), x=P(A|B), and w=P(A|~B). Note
> that ~B means "not B" or B==0.
> Then
>
> P(Z|A)=P(Z|B)P(B|A)+P(Z|~B)P(~B|A)=[ypx+.00006(1-p)w]/[(1-p)w+px]
>
> so even if you assume P(Z|A)=.05 and y=.65, you have 3 unknowns and 1
> equation; even if you know p, you have two unknowns w and x, so the
> best you can hope for is to express P(A|B) as a linear function of
> P(A|~B). For example, if p=.5 and y=.65 and P(Z|A)=.05 then w is 12
> times as big as x (i.e. if Z is so rare in a sample of A, when B so
> likely causes Z, it must be because A is much more likely when not B
> than when B). If p is 8% then w and x are roughly the same. I
> suggest you draw out a couple of trees with probabilities and check my
> math.
>
> If you want to estimate y and x, you are out of luck. If you know w
> and p with certainty, you can express y as a function of x and the
> estimate of P(Z|A), so if you have estimates of P(Z|A) in memory, you
> can use -lincom- to get estimates of y conditional on x, but how
> plausible is it you would know w with certainty when you are trying to
> estimate x and y?
>
> I suppose you could use known p, estimates of P(Z|A) in memory, and
> -lincom-, to get estimates of y conditional on x and w, then present a
> table of point estimates and confidence intervals for various values
> of x and w. Or get estimates of x conditional on y and w, or what
> have you. But you still have to assume you know p with certainty, or
> the dimension of that table gets out of control...
>
> I have been assuming that P(Z|A) is what you are estimating, but you
> really have a competing risk model, I am guessing, modeling the hazard
> of getting Z before death or censoring by some other process. So you
> need to redefine Z to be not "gets condition Z" but "gets condition Z
> in my observation period" to use any of the above, which is probably
> unpalatable.
> Plus, I don't know if I've translated your description
> into probabilities correctly--the jargon of genetics is unfamiliar to
> me (and many other list members--you should translate to the common
> language of statistics).
>
> On Sun, Mar 22, 2009 at 10:37 AM, moleps islon <moleps2@gmail.com> wrote:
>> Dear statalisters,
>> I'm studying a tumor A that has a probability (x) of a being linked to
>> a genetic mutation (B) that also predisposes (penetrance approx 65%(y)
>> by 70 years) to condition Z. Now I've got 217 cases of A that resulted
>> in 11 cases of Z over 8534 years of followup years (among the 217
>> cases). I need to determine the number of patients with B given that
>> there is also a background incidence of 6/100000 for Z.We know that
>> x<<y. Besides running a simulation is there a more analytical way of
>> estimating x and y given my data???
>>
>> Best wishes,
>> Moleps
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
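The identity in the quoted reply, P(Z|A)=[ypx+.00006(1-p)w]/[(1-p)w+px], can be rearranged to w/x = p(y-q)/[(1-p)(q-b)], writing q for P(Z|A) and b for the background probability 6/100000. The following is a minimal sketch for checking the two numerical claims in the reply (w about 12 times x when p=.5, and w roughly equal to x when p=.08). It is written in Python rather than Stata purely as an arithmetic check; the function names `w_over_x` and `p_z_given_a` and the symbols `q` and `b` are labels assumed here for illustration and do not appear in the original posts.

```python
def w_over_x(p, y, q, b=6 / 100000):
    """Ratio w/x = P(A|~B)/P(A|B) implied by P(Z|A) = q.

    Obtained by rearranging
        q = [y*p*x + b*(1-p)*w] / [(1-p)*w + p*x]
    where p = P(B), y = P(Z|B) and b = P(Z|~B) (assumed background risk).
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if q <= b:
        raise ValueError("q must exceed the background probability b")
    return p * (y - q) / ((1 - p) * (q - b))


def p_z_given_a(p, y, x, w, b=6 / 100000):
    """P(Z|A) from the quoted identity (total probability over B and ~B)."""
    return (y * p * x + b * (1 - p) * w) / ((1 - p) * w + p * x)


if __name__ == "__main__":
    # The two scenarios mentioned in the reply: p = .5 and p = 8%.
    for p in (0.5, 0.08):
        ratio = w_over_x(p, y=0.65, q=0.05)
        print(f"p={p}: w/x = {ratio:.3f}")
```

Plugging the resulting ratio back into `p_z_given_a` with any small x (for example x=0.01) should return q=.05, which is one way to "check my math" without drawing the probability trees by hand.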

**Follow-Ups**:
- **Re: st: Gene-incidence question/simulation** *From:* Neil Shephard <nshephard@gmail.com>
- **AW: st: Gene-incidence question/simulation** *From:* "Martin Weiss" <martin.weiss1@gmx.de>

**References**:
- **st: Gene-incidence question/simulation** *From:* moleps islon <moleps2@gmail.com>
- **Re: st: Gene-incidence question/simulation** *From:* Austin Nichols <austinnichols@gmail.com>
