Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Steven Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: # of Obs. in -stcox- result
Date: Wed, 12 Oct 2011 17:24:13 -0400

Muyang--

The Statalist FAQ, which you were asked to read when you joined the list, states: "Say exactly what you typed and exactly what Stata typed (or did) in response. N.B. exactly! If you can, reproduce the error with one of Stata's provided datasets or a simple concocted dataset that you include in your posting."

Here, what you typed would include the -stset-, -stdes-, -stcox-, and -reg- commands. What Stata typed would be the results of those commands. When we see these, perhaps we can give more specific answers.

Steve

On Oct 12, 2011, at 3:11 AM, Maarten Buis wrote:

On Wed, Oct 12, 2011 at 5:04 AM, Muyang Zhang wrote:
> This is the regular case in all kinds of analysis. My problem is that
> the number of observations reported by -stset- is smaller than that
> using a linear model with the same set of covariates.

With survival analysis, the number of rows in your dataset does not have to be the same as the number of observations. To be precise, the same person/firm/cow/whatever can appear multiple times. You tell Stata which rows together form one observation with the -id()- option. The logic is that survival analysis allows the values of explanatory variables to change over time, but in order to allow that you need a data structure in which you can store those changing values. The multiple rows represent the same observation at different time points, thus enabling one to record those changing values. Linear regression, on the other hand, assumes that every row in your dataset is one observation.

Moreover, to make things more complex, if any covariate on any row belonging to one observation is missing, the entire observation, i.e. all rows belonging to that observation, will be ignored by survival analysis commands, while commands like linear regression will only ignore the rows with missing values.
Another reason for the difference in the number of observations could be that survival analysis will ignore negative times or times of zero, while linear regression has no problem with those. So you really cannot use linear regression in this context to find out where the dropped observations in survival analysis come from.

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
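The counting rules described in the message above can be illustrated with a small sketch. It is written in Python rather than Stata, it is not how -stset- or -stcox- are implemented, and it only encodes the three rules as stated in the message: rows sharing an id form one observation, a missing covariate on any row drops the whole observation, and non-positive times are ignored. The data, the tuple layout, and the function names `regression_rows` and `survival_subjects` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (subject_id, t0, t1, covariate); None marks a missing value.
ROWS = [
    (1, 0, 5, 1.2),
    (1, 5, 9, 1.5),    # subject 1 has two rows: its covariate changes over time
    (2, 0, 4, 0.7),
    (2, 4, 7, None),   # missing covariate on one of subject 2's rows
    (3, 0, 6, 2.0),
    (4, 0, 0, 0.9),    # subject 4 ends at time zero
]


def regression_rows(rows):
    """Row-wise rule: every row is one observation; drop only rows
    whose covariate is missing."""
    return [r for r in rows if r[3] is not None]


def survival_subjects(rows):
    """Subject-wise rule, as described in the message (a simplification):
    group rows by id, then drop a subject entirely if any of its rows has
    a missing covariate or an end time that is zero or negative."""
    by_id = defaultdict(list)
    for r in rows:
        by_id[r[0]].append(r)

    kept = {}
    for sid, recs in by_id.items():
        if any(r[3] is None for r in recs):
            continue  # one missing value removes all rows of this subject
        if any(r[2] <= 0 for r in recs):
            continue  # zero or negative time is ignored
        kept[sid] = recs
    return kept


if __name__ == "__main__":
    kept = survival_subjects(ROWS)
    print("rows in data:               ", len(ROWS))
    print("rows used row-wise:         ", len(regression_rows(ROWS)))
    print("subjects used subject-wise: ", len(kept))
    print("rows behind those subjects: ", sum(len(v) for v in kept.values()))
```

With these made-up rows the two rules give different counts from the same data, which is why comparing a row-wise count with a subject-wise count does not show which observations were dropped.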

References:
- st: # of Obs. in -stcox- result. From: Muyang Zhang <zmyfudan@gmail.com>
- Re: st: # of Obs. in -stcox- result. From: Maarten Buis <maartenlbuis@gmail.com>
- Re: st: # of Obs. in -stcox- result. From: Muyang Zhang <zmyfudan@gmail.com>
- Re: st: # of Obs. in -stcox- result. From: Maarten Buis <maartenlbuis@gmail.com>
