
Re: st: How low can the percentage of uncensored cases be in heckprob?

From	Steven Samuels <[email protected]>
To	[email protected]
Subject	Re: st: How low can the percentage of uncensored cases be in heckprob?
Date	Tue, 11 Nov 2008 16:17:48 -0500

I am not familiar with -heckprob-, but I doubt that the *percent* of uncensored observations matters much.

-heckprob- fits two probit models. I know of results related to Margaret's question only for logit models. For a single logistic regression model, the relevant sample size is the smaller of the number of events or non-events. Peduzzi et al. (1996) showed that the ratio of this number to the number of predictors should be at least 15:1 to avoid bias from over-fitting.
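
As a rough illustration of the rule of thumb above, here is a minimal Python sketch. The function names are hypothetical, the 15:1 threshold is simply the figure quoted above, and carrying a logit result over to the selection (probit) equation is an assumption on my part. The counts are the ones from the manual example quoted below (95 observations, 59 uncensored).

```python
def limiting_sample_size(events, non_events):
    """Smaller of the number of events and non-events."""
    return min(events, non_events)


def events_per_variable(events, non_events, n_predictors):
    """Ratio of the limiting sample size to the number of predictors."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return limiting_sample_size(events, non_events) / n_predictors


def max_predictors(events, non_events, min_ratio=15):
    """Largest number of predictors that keeps the ratio at or above min_ratio.

    min_ratio=15 is the threshold quoted in the text, not a universal constant.
    """
    return limiting_sample_size(events, non_events) // min_ratio


if __name__ == "__main__":
    total, uncensored = 95, 59
    censored = total - uncensored  # 36 censored observations

    # Treat "uncensored vs. censored" as the binary outcome of the
    # selection equation (an assumption made for illustration).
    print(limiting_sample_size(uncensored, censored))    # 36
    print(events_per_variable(uncensored, censored, 3))  # 12.0
    print(max_predictors(uncensored, censored))          # 2
```

The same check could be run on the outcome equation by plugging in the number of events and non-events among the uncensored observations only.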


-Steve

Refs:

Peduzzi PN, Concato J, Holford TR, Feinstein AR. (1995) The importance of events per independent variable in multivariable analysis, II: accuracy and precision of regression estimates. J Clin Epidemiol; 48: 1503–10.

Peduzzi PN, Concato J, Kemper E, Holford TR, Feinstein AR. (1996) A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol; 49: 1373–9.

Babyak M. (2004) What You See May Not Be What You Get: A Brief, Nontechnical Introduction to Overfitting in Regression-Type Models. Psychosomatic Medicine; 66: 411–21. Full text:

http://www.psychosomaticmedicine.org/cgi/content-nw/full/66/3/411/




On Nov 11, 2008, at 12:15 PM, Maarten buis wrote:

--- "Tyler, Margaret C D" <[email protected]> wrote:

In the example in the Stata reference -H heckprob, there are 95 total
and 59 uncensored observations, so 62% are uncensored. In my own
situation I have only about 19% uncensored. Is it still appropriate
to use heckprob for my analysis? I have run the equations and gotten
what seem to be valid results. rho is non-significant.


You are obviously pushing your luck with that many censored cases. It
is no longer very popular to make statements like "you need at least N
observations or p% uncensored cases for technique t to be appropriate"
(whatever appropriate may mean). So I don't think you will get the
answer you are looking for. However, what you can do is run some
simulations and see how well (or badly) your estimator behaves with a
small number of uncensored cases. At the last Summer North American
Stata Users' Group meeting I gave a talk on using Stata for this
type of simulation; you can get the materials from:
http://ideas.repec.org/p/boc/nsug08/14.html
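
To give a feel for the first step of such a simulation, here is a minimal Python sketch of the data-generating part only. It is not taken from the talk materials: the function names, the intercept-only setup, and the default values are all assumptions made for illustration. It draws correlated errors for a selection and an outcome equation, censors the outcome where the selection rule fails, and counts how many uncensored cases, events, and non-events a sample of a given size yields. Fitting -heckprob- to each simulated sample would still have to be done in Stata.

```python
import random
from statistics import NormalDist


def simulate_selection_data(n, share_uncensored, rho, seed=None):
    """Draw n observations from an intercept-only probit selection model.

    share_uncensored: target probability of being observed (e.g. 0.19).
    rho: correlation between the selection and outcome errors.
    Returns a list of (selected, y) pairs; y is None when censored.
    """
    rng = random.Random(seed)
    # Cutoff chosen so that P(u_sel > cutoff) equals share_uncensored.
    cutoff = NormalDist().inv_cdf(1.0 - share_uncensored)
    rows = []
    for _ in range(n):
        u_sel = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # Outcome error with correlation rho to the selection error.
        u_out = rho * u_sel + (1.0 - rho ** 2) ** 0.5 * noise
        selected = u_sel > cutoff
        y = (u_out > 0.0) if selected else None
        rows.append((selected, y))
    return rows


def summarize(rows):
    """Count uncensored cases and events/non-events among them."""
    observed = [y for selected, y in rows if selected]
    events = sum(1 for y in observed if y)
    return {
        "n": len(rows),
        "uncensored": len(observed),
        "events": events,
        "non_events": len(observed) - events,
    }


if __name__ == "__main__":
    # Hypothetical setup: 1000 observations, about 19% uncensored, rho = 0.5.
    print(summarize(simulate_selection_data(1000, 0.19, 0.5, seed=1)))
```

Repeating this over many seeds shows how much the number of uncensored events varies from sample to sample before any model is fitted.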

Hope this helps,
Maarten


-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room N515

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------



*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



