Statalist



R: st: Re: Re: Estimating a model where the dependent variale is a ratio - Folllow up


From   "Carlo Lazzaro" <carlo.lazzaro@tin.it>
To   <statalist@hsphsun2.harvard.edu>
Subject   R: st: Re: Re: Estimating a model where the dependent variale is a ratio - Folllow up
Date   Sat, 5 Sep 2009 09:13:56 +0200

Dear Johannes,
thanks a lot for your follow-up message.
Below, I try to reply point by point to your comments.

<You think I should rather estimate a proportional hazard model (e.g. 
stcox or streg)?>

A Cox proportional hazards model may be the right choice, assuming that you
are not willing to parameterize the baseline hazard. However, the Cox model
requires that the proportional-hazards assumption holds throughout your
observation time. You can check this assumption via the Schoenfeld residuals
(please see -help stcox- from within Stata. By the way, my release is 9.2/SE).
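
A minimal sketch of such a check, under stated assumptions: it uses the
drugtr example dataset from the Stata manuals (already -stset-), so the
covariates drug and age merely stand in for your own variables. The syntax
assumes Stata 11 or later; in release 9.2 the corresponding command is
-stphtest-, which needs the residuals saved through the schoenfeld() and
scaledsch() options of -stcox-.

```stata
* Example data from the Stata manuals; already stset (studytime, died)
webuse drugtr, clear

* Cox model with two illustrative covariates
stcox drug age

* Global and per-covariate tests based on (scaled) Schoenfeld residuals
estat phtest, detail

* Graphical check for a single covariate
estat phtest, plot(age)
```

A small p-value in the test output argues against proportional hazards for
the covariate concerned.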

Parametric regression models assume that the baseline hazard has a specific
form. However, this may mean that different types of cancer are best fit by
different parametric regression models. Although I am a health economist and
not an epidemiologist, acute myeloid leukaemia probably has a baseline
hazard which differs from that of post-menopausal breast cancer.
Hence, you may have to fit parametric regression models separately for
different types of cancer.
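
One way to compare candidate baseline forms is sketched below. It again uses
the drugtr example dataset rather than cancer registry data, and the list of
distributions is only an illustrative selection. The name cancertype in the
closing comment is a hypothetical variable, not one from the original thread.

```stata
* Example data from the Stata manuals; already stset
webuse drugtr, clear

* Fit the same covariates under several distributions and report AIC/BIC
foreach dist in exponential weibull lognormal loglogistic {
    display as text _newline "Distribution: `dist'"
    quietly streg drug age, distribution(`dist')
    estat ic
}

* With several cancer types in one file, the fit could be repeated per type,
* e.g. (cancertype is a hypothetical variable name):
*   streg drug age if cancertype == 1, distribution(weibull)
```

Lower AIC values indicate a better trade-off between fit and number of
parameters, but the choice should also make clinical sense.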

<I mean you are right, I have data for each individual: survival
time, county of residence, county of diagnosis, age at
diagnosis, etc...>

All these data are suitable for survival analysis.

<However I would like to see if in high volume counties (that is,
counties where many cases of the same cancer type i were detected) people
(suffering from this cancer type i) survive longer.>

You can do this using the - basesurv - option in Cox regression, - predict,
csurv - or -stcurve- (these commands have different meanings; for more
details, please see Cleves MA, Gould WW, Gutierrez R. An Introduction To
Survival Analysis Using Stata. Revised edition. College Station: Stata Press,
2004 or the 2008 edition of this textbook).
You should add to your dataset a categorical variable identifying counties
with different "cancer volume" (please see Cleves MA,
Gould WW, Gutierrez R. An Introduction To Survival Analysis Using Stata.
Revised edition. College Station: Stata Press, 2004: 158-160).
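
The idea can be sketched on simulated data as follows. Everything here is an
assumption made for illustration: the variable names (county, age, time,
died, ncases, highvol), the simulated values, and the choice of a simple
above-median indicator as the "volume" category. runiform() requires Stata
10.1 or later (earlier releases use uniform()), and the at1()/at2() options
of -stcurve- are the older documented form; please check -help stcurve- in
your release.

```stata
clear
set seed 12345
set obs 500

* All values below are simulated placeholders, not real registry data
generate county = ceil(20 * runiform()^2)     // county of diagnosis, uneven sizes
generate age    = 50 + floor(30 * runiform()) // age at diagnosis
generate time   = -24 * ln(runiform())        // survival time in months
generate died   = runiform() < 0.7            // 1 = death observed, 0 = censored

* Volume = number of cases of this cancer type diagnosed in the county
bysort county: generate ncases = _N
quietly summarize ncases, detail
generate highvol = ncases > r(p50)            // 1 = above-median volume

* Declare the survival data, then fit the Cox model on individual records
stset time, failure(died)
stcox highvol age

* Adjusted survival curves for low- versus high-volume counties
stcurve, survival at1(highvol=0) at2(highvol=1)
```

Because the simulated survival times do not depend on volume, the hazard
ratio for highvol should be close to 1 here; with real data it estimates the
volume effect directly from the individual records.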

<So I think I have to group the data in the following way:
I have to calculate for each county the average survival time of
individuals diagnosed with cancer type i.
Or is this not necessary and I am losing important information?>

I do not think this is necessary, provided that you have the time to event
(which is the core point of Survival Analysis) for each subject. Simply
-stset- your data before performing survival analysis.
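
A minimal sketch of the -stset- step, assuming one record per patient with a
diagnosis date, a date of death or last follow-up, and a death indicator.
The four patients and all variable names are made up for illustration. The
"YMD" mask is the Stata 10 and later form; release 9 uses lowercase "ymd".

```stata
clear
* Four made-up patients; variable names are illustrative only
input id str10 diag str10 last died
1 "2000-01-15" "2003-06-30" 1
2 "2000-03-02" "2005-12-31" 0
3 "2001-07-20" "2002-01-10" 1
4 "2002-11-05" "2005-12-31" 0
end

* Convert the date strings to Stata daily dates
generate diagdate = date(diag, "YMD")
generate lastdate = date(last, "YMD")
format diagdate lastdate %td

* Time starts at diagnosis; analysis time is measured in years
stset lastdate, id(id) failure(died==1) origin(time diagdate) scale(365.25)
stdescribe
```

After this, the st commands (-stcox-, -streg-, -sts-) pick up the time and
failure definitions automatically.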

HTH and Kind Regards,
Carlo

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Johannes Schoder
Sent: Friday, 4 September 2009 20:29
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Re: Re: Estimating a model where the dependent variale is a
ratio - Folllow up

Thanks Martin,
Thanks Joseph,
Thanks Carlo,  your comments helped a lot!!


Just two things:
A.
Concerning Joseph's reply:
I tried the following for one year (e.g. year 2000):

glm A C, i (count) family (binomial B)
and I get the result my supervisor wanted me to get.
However, why are you suggesting xtgee?
I assume that you meant that I perform my estimation for several years.
I might try that later, too.


B.
Concerning Carlo's comment:
You think I should rather estimate a proportional hazard model (e.g. 
stcox or streg)?
I mean you are right, I have data for each individual: survival
time, county of residence, county of diagnosis, age at
diagnosis, etc...
However I would like to see if in high volume counties (that is,
counties where many cases of the same cancer type i were detected) people
(suffering from this cancer type i) survive longer.
So I think I have to group the data in the following way:
I have to calculate for each county the average survival time of
individuals diagnosed with cancer type i.
Or is this not necessary and I am losing important information?

Again thanks a lot for your comments!
Best,
Johannes


Joseph Coveney wrote:
> I wrote:
>
> From the SAS code that you've shown, your major professor seems to be
> fitting a linear model.
>
> --------------------------------------------------------------------------------
>
> I mistook, if I remember correctly now.  The PROC GENMOD syntax (MODEL A /
> B = C;) that you showed *is* for a logistic model for which the
> corresponding Stata command is (as was shown):
>
> xtgee A C, i(country) family(binomial B)
>
> Joseph Coveney
>
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>   
