
From: Richard Williams <[email protected]>

To: [email protected], [email protected]

Subject: Re: st: logistic ---- assessment of model fit via external validation

Date: Wed, 09 Jun 2004 13:33:37 -0500

At 12:06 PM 6/9/2004 -0600, [email protected] wrote:

> Hello All --- I'm using Intercooled Stata, 8.2. In Hosmer & Lemeshow's "Applied Logistic Regression", the authors indicate that it may be possible to exclude a subsample of observations, develop a model, then test the model on the excluded observations (p.171). I'm interested in doing just that, although I'm at a loss as to if -- and how -- something like this can be implemented in Stata. I've searched the Statalist archives, UCLA's statistics portal, and a few different textbooks for hints on how this 'external validation' can be implemented, but to no avail. Note that I do not want to generate new coefficients for the second sub-sample, rather, I want to use the coefficients generated from the first sub-sample in estimating a classification table for the second sub-sample.

I'm not sure if this is what they had in mind, but this might work. First, you need to install Nick Cox's -swor- routine.

. sysuse auto

(1978 Automobile Data)

. set seed 123

. swor 37, gen(mysample) keep

. quietly logit foreign price if mysample

. * fit using selected cases

. lstat

Logistic model for foreign

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |         0             1  |          1
     -     |        13            23  |         36
-----------+--------------------------+-----------
   Total   |        13            24  |         37

Classified + if predicted Pr(D) >= .5
True D defined as foreign != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)    0.00%
Specificity                     Pr( -|~D)   95.83%
Positive predictive value       Pr( D| +)    0.00%
Negative predictive value       Pr(~D| -)   63.89%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)    4.17%
False - rate for true D         Pr( -| D)  100.00%
False + rate for classified +   Pr(~D| +)  100.00%
False - rate for classified -   Pr( D| -)   36.11%
--------------------------------------------------
Correctly classified                        62.16%
--------------------------------------------------

. drop if mysample

(37 observations deleted)

. * fit using non-selected cases

. lstat, all

Logistic model for foreign

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |         1             3  |          4
     -     |         8            25  |         33
-----------+--------------------------+-----------
   Total   |         9            28  |         37

Classified + if predicted Pr(D) >= .5
True D defined as foreign != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)   11.11%
Specificity                     Pr( -|~D)   89.29%
Positive predictive value       Pr( D| +)   25.00%
Negative predictive value       Pr(~D| -)   75.76%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)   10.71%
False - rate for true D         Pr( -| D)   88.89%
False + rate for classified +   Pr(~D| +)   75.00%
False - rate for classified -   Pr( D| -)   24.24%
--------------------------------------------------
Correctly classified                        70.27%
--------------------------------------------------

.

I think this does what you have said you want, but whether it is the best way to proceed (or what H & L really had in mind), I don't know.
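If installing -swor- is not an option, the same split-sample idea can probably be sketched with built-in commands only. What follows is a rough illustration, not the method used in the session above: the variable names u, phat and classified are made up for the example, and runiform() assumes a newer Stata (older releases such as 8.2 use uniform() instead). Because the random split is drawn differently, the counts will not match the tables shown earlier.

```stata
sysuse auto, clear
set seed 123

* Assumption: a random sort stands in for -swor-; u is a hypothetical helper
generate double u = runiform()
sort u

* Flag the first 37 randomly ordered cars as the development sample
generate byte mysample = (_n <= 37)

* Estimate the coefficients on the development sample only
quietly logit foreign price if mysample

* Apply those same coefficients to the held-out cars (no re-estimation)
predict double phat if !mysample

* Classify as positive when predicted Pr(D) >= .5, the same cutoff as lstat
generate byte classified = (phat >= .5) if !missing(phat)

* Cross-tabulate predicted class against the true outcome in the holdout
tabulate classified foreign if !mysample
```

The point of doing it this way is that nothing is dropped from memory, so both halves stay available for further checks; -predict- after -logit- returns the predicted probability by default and works for observations outside the estimation sample.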

