Re: st: nl hockey estimation
Nick Cox <firstname.lastname@example.org>
Fri, 10 Aug 2012 18:28:52 +0100
You are quite right in terms of understanding the definition from scratch.
My point was that Jordan's example had the right-hand part translated
up by 100 relative to the left-hand part, and my code was a fix of
On Fri, Aug 10, 2012 at 6:18 PM, Steve Samuels <email@example.com> wrote:
> Quite right, Nick. I was confused by the "y" on the RHS
> of your y2 equation.
> I meant to write
> gen y = cond(x<50, x, 100 -x)
> which _is_ a segmented line.
> I'd just note that your y2 is perhaps easier to
> understand as:
> gen y2 = cond(x<50, x, 3*x-100)
> Al, I gave the earliest reference I knew to the "hockey stick" terminology
> problem in:
> nlhockey.ado is part of the loghockey package. One can see this and others that
> Mark Lunt has written by typing:
> "net from http://personalpages.manchester.ac.uk/staff/mark.lunt/"
> On Aug 10, 2012, at 10:49 AM, Nick Cox wrote:
> No; on this occasion I meant what I wrote. What you suggest shows
> another discontinuity.
> On Fri, Aug 10, 2012 at 3:42 PM, Steve Samuels <firstname.lastname@example.org> wrote:
>> Nick, I think you meant:
>> gen y2 = cond(x <50, y, 100 - y)
>> On Aug 9, 2012, at 9:09 PM, Nick Cox wrote:
>> The word "clearly" here is questionable. Your test data show a big
>> discontinuity; they aren't a segmented line which is what the model is
>> looking for. The least squares criterion is being used and -nl- does
>> the best it can to minimise the sum of _squared_ errors. The built-in
>> aversion to very large errors is what is biting here.
>> If you work instead with
>> gen y2 = cond(x < 50, y, y - 100)
>> nl hockey y2 x
>> you will get what you expect.
>> On this evidence the program is fine, but your test example won't work
>> as you expect under LS. At a wild guess, L1-norm might give something
>> nearer splitting the difference.
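The discontinuity described above can be checked directly. The following is a minimal sketch that rebuilds the test data from the original post in one -cond()- call and lists the values on either side of x = 50, before and after the right-hand segment is shifted down by 100. The variable name y2 follows the thread; nothing here depends on the user-written -nl hockey- program.

```stata
* Rebuild the simulated data from the original post
clear
set obs 100
gen x = _n
gen y = cond(x < 50, x, 3*x)

* y jumps from 49 at x = 49 to 150 at x = 50, so the data are not a segmented line
list x y if inrange(x, 48, 51), noobs

* Shift the right-hand segment down by 100 so the two pieces meet at x = 50
gen y2 = cond(x < 50, y, y - 100)

* y2 now runs 48, 49, 50, 53: continuous, with the slope changing from 1 to 3
list x y2 if inrange(x, 48, 51), noobs
```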
>> On Fri, Aug 10, 2012 at 12:34 AM, Jordan Silberman
>> <email@example.com> wrote:
>>> I'm attempting to identify a breakpoint in a regression using the -nl
>>> hockey- command (described here:
>>> When I test this command using simple simulated data, I find that the
>>> command doesn't identify the correct breakpoint. Here's an example:
>>> set obs 100
>>> gen x = _n
>>> gen y = x if x < 50
>>> replace y = x*3 if x > 49
>>> nl hockey y x
>>> The breakpoint should clearly be at 50; however, the command output
>>> puts the breakpoint at 32.7.
>>> So, 2 questions:
>>> 1. Why might the -nl hockey- command be computing the wrong breakpoint?
>>> 2. Can anyone recommend an alternate approach to identifying the
>>> breakpoint in a 2-piece regression? Best would be something that's
>>> been implemented in Stata in a straightforward way.
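On the second question, one possible approach (a sketch, not the method used by -nl hockey-) is to fit a continuous two-segment line with Stata's built-in -nl- and a substitutable expression, treating the breakpoint as a parameter. The parameter names b0, b1, b2 and c, the starting values, the seed and the added noise are all illustrative assumptions. Because the fitted function has a kink at c, the sum of squares is not smooth in c, so convergence can depend on the starting values; it is worth trying several.

```stata
* Simulated continuous two-segment data (slope 1, then slope 3, break at 50)
* The noise term and the seed are illustrative assumptions
clear
set obs 100
set seed 12345
gen x = _n
gen y2 = cond(x < 50, x, 3*x - 100) + rnormal(0, 2)

* Slope is b1 to the left of c and b1 + b2 to the right of c.
* b2 is started away from zero: at b2 = 0 the breakpoint c is not identified.
nl (y2 = {b0=0} + {b1=1}*x + {b2=1}*max(x - {c=40}, 0))
```

The estimate of c in the -nl- output is the fitted breakpoint, reported with a standard error like any other parameter.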