[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: Knot optimized logistic regression

From	Dan MacNulty <[email protected]>
To	[email protected]
Subject	Re: st: Knot optimized logistic regression
Date	Fri, 29 Jan 2010 08:55:32 -0800

Maarten,

Thanks very much for sharing your code. There is one added twist: mydata are clustered within subject, so I'm estimating a random-interceptlogistic spline regression. I assume the addition of a random effectwill make knot estimation even more difficult, correct?


Dan

Maarten buis wrote:

--- On Fri, 29/1/10, Roger Harbord wrote:

Surely that depends on the size of the dataset too? If
Dan's outcome variable has thousands of successes and
thousands of failures then i'd have thought there might
be enough information to estimate the position of a
single knot in a linear spline, if not in a cubic
spline. Agree there's no built-in command to fit this (nor
user-written one AFAIK) so it would take a bit of
programming in -ml- (though you could call -mkspline- and
-logit- to do most of the work).

Some models are identified in theory but demand so much ofthe data that in practice they will virtually always fail toconverge. I would classify the logistic with an unknown knotlocation in this category.

Anyhow, the programming is not such a big deal. Below is the
the likelihood if you want to give it a try:

program define splogit_lf
	version 8.2
	args lnf xb b1 b2 k
	tempvar eta
	qui gen double `eta' = `xb' + `b1'*$sp + `b2'*max($sp-`k', 0)
	qui replace `lnf' = ln(invlogit(`eta')) if $ML_y1 == 1
	qui replace `lnf' = ln(invlogit(-`eta')) if $ML_y1 == 0
end

Here `xb' captures the control variables, `b1' the effect ofthe first spline term and `b2' the effect of the second splineterm, `k' is the location of the knot. The variable that youwant to turn into a spline has to be stored before the programin the global sp.


Alternatively, you could do a grid search, like in the example

below. You can see why this is such a hard problem, thelikelihood function appears to be bi-model and there seems

to be a considerable platteau near the global maximum.

*-------------- begin example ----------------------
sysuse nlsw88, clear

sum wage, detail
local begin = r(p5)
local end = r(p95)
local step = (`end' - `begin')/101
tempvar ll k
gen double `ll' = .
gen double `k' = .
local j = 1
forvalues i = `begin'(`step')`end' {
	capture drop sp*
	mkspline sp1 `i' sp2 = wage
	logit union south sp*
	replace `ll' = e(ll) in `j'
	replace `k' = `i' in `j++'
}

twoway line `ll' `k'
*-------------------- end example ------------------

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

Follow-Ups:
- Re: st: Knot optimized logistic regression
  - From: Maarten buis <[email protected]>

References:
- Re: st: Knot optimized logistic regression
  - From: Maarten buis <[email protected]>

Prev by Date: RE: st: Crossed wires? Table set wrong? (table command, rel 10)
Next by Date: Re: st: Knot optimized logistic regression
Previous by thread: Re: st: Knot optimized logistic regression
Next by thread: Re: st: Knot optimized logistic regression
Index(es):
- Date
- Thread