Statalist


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: Knot optimized logistic regression


From   Maarten buis <[email protected]>
To   [email protected]
Subject   Re: st: Knot optimized logistic regression
Date   Fri, 29 Jan 2010 08:51:37 +0000 (GMT)

--- On Fri, 29/1/10, Roger Harbord wrote:
> Surely that depends on the size of the dataset too? If
> Dan's outcome variable has thousands of successes and
> thousands of failures then i'd have thought there might
> be enough information to estimate the position of a
> single knot in a linear spline, if not in a cubic
> spline. Agree there's no built-in command to fit this (nor
> user-written one AFAIK) so it would take a bit of
> programming in -ml- (though you could call -mkspline- and
> -logit- to do most of the work).

Some models are identified in theory but demand so much of 
the data that in practice they will virtually always fail to 
converge. I would classify the logistic with an unknown knot 
location in this category. 

Anyhow, the programming is not such a big deal. Below is the
the likelihood if you want to give it a try:

program define splogit_lf
	version 8.2
	args lnf xb b1 b2 k
	tempvar eta
	qui gen double `eta' = `xb' + `b1'*$sp + `b2'*max($sp-`k', 0)
	qui replace `lnf' = ln(invlogit(`eta')) if $ML_y1 == 1
	qui replace `lnf' = ln(invlogit(-`eta')) if $ML_y1 == 0
end

Here `xb' captures the control variables, `b1' the effect of 
the first spline term and `b2' the effect of the second spline 
term, `k' is the location of the knot. The variable that you 
want to turn into a spline has to be stored before the program 
in the global sp.

Alternatively, you could do a grid search, like in the example
below. You can see why this is such a hard problem, the 
likelihood function appears to be bi-model and there seems
to be a considerable platteau near the global maximum.

*-------------- begin example ----------------------
sysuse nlsw88, clear

sum wage, detail
local begin = r(p5)
local end = r(p95)
local step = (`end' - `begin')/101
tempvar ll k
gen double `ll' = .
gen double `k' = .
local j = 1
forvalues i = `begin'(`step')`end' {
	capture drop sp*
	mkspline sp1 `i' sp2 = wage
	logit union south sp*
	replace `ll' = e(ll) in `j'
	replace `k' = `i' in `j++'
}

twoway line `ll' `k'
*-------------------- end example ------------------

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------


      

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2024 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index