
# Re: st: nl hockey estimation

From: Nick Cox

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: nl hockey estimation

Date: Fri, 10 Aug 2012 02:09:01 +0100

```
The word "clearly" here is questionable. Your test data show a big
discontinuity; they aren't a segmented line which is what the model is
looking for. The least squares criterion is being used and -nl- does
the best it can to minimise the sum of _squared_ errors. The built-in
aversion to very large errors is what is biting here.

If you instead try

gen y2 = cond(x < 50, y, y - 100)
nl hockey y2 x

you will get what you expect.

On this evidence the program is fine, but your test example won't work
as you expect under least squares (LS). At a wild guess, an L1-norm
(least absolute deviations) fit might give something nearer to
splitting the difference.

Nick

On Fri, Aug 10, 2012 at 12:34 AM, Jordan Silberman
<silberman.stata@gmail.com> wrote:

> I'm attempting to identify a breakpoint in a regression using the -nl
> hockey- command (described here:
> http://personalpages.manchester.ac.uk/staff/mark.lunt/nlhockey.hlp).
>
> When I test this command using simple simulated data, I find that the
> command doesn't identify the correct breakpoint. Here's an example:
>
> set obs 100
> gen x = _n
> gen y = x if x < 50
> replace y = x*3 if x > 49
> nl hockey y x
>
> The breakpoint should clearly be at 50; however, command output
> identifies the breakpoint at 32.7.
>
> So, 2 questions:
>
> 1. Why might the -nl hockey- command be computing the wrong breakpoint?
>
> 2. Can anyone recommend an alternate approach to identifying the
> breakpoint in a 2-piece regression? Best would be something that's
> been implemented in Stata in a straightforward way.
```
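The point in the reply can be checked outside Stata with a small sketch. The Python below is not what -nl hockey- does internally. It swaps in a plain grid search: for each candidate breakpoint `k` it fits the continuous two-segment model `y = b0 + b1*x + b2*max(x - k, 0)` by ordinary least squares and keeps the `k` with the smallest sum of squared errors. That parameterization, the grid from 2.0 to 99.0 in steps of 0.1, and the helper names `fit_sse` and `best_breakpoint` are all assumptions made for illustration.

```python
def fit_sse(xs, ys, k):
    """SSE of the least-squares fit of y = b0 + b1*x + b2*max(x - k, 0) for a fixed k."""
    cols = [[1.0] * len(xs),
            [float(x) for x in xs],
            [max(x - k, 0.0) for x in xs]]
    # Normal equations (X'X) b = X'y, stored as an augmented 3x4 matrix.
    a = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols]
         + [sum(p * y for p, y in zip(ci, ys))] for ci in cols]
    # Gaussian elimination with partial pivoting.
    for i in range(3):
        piv = max(range(i, 3), key=lambda r: abs(a[r][i]))
        a[i], a[piv] = a[piv], a[i]
        for r in range(i + 1, 3):
            f = a[r][i] / a[i][i]
            a[r] = [u - f * v for u, v in zip(a[r], a[i])]
    # Back substitution for the three coefficients.
    b = [0.0] * 3
    for i in (2, 1, 0):
        b[i] = (a[i][3] - sum(a[i][j] * b[j] for j in range(i + 1, 3))) / a[i][i]
    # Sum of squared residuals at the fitted coefficients.
    return sum((y - sum(bj * c[n] for bj, c in zip(b, cols))) ** 2
               for n, y in enumerate(ys))


def best_breakpoint(xs, ys):
    """Candidate breakpoint on the (assumed) grid 2.0, 2.1, ..., 99.0 with the smallest SSE."""
    grid = [i / 10 for i in range(20, 991)]
    return min(grid, key=lambda k: fit_sse(xs, ys, k))


xs = list(range(1, 101))
y = [x if x < 50 else 3 * x for x in xs]                 # original test data: jump at x = 50
y2 = [v if x < 50 else v - 100 for x, v in zip(xs, y)]   # shifted data from the reply: no jump

k_jump = best_breakpoint(xs, y)
k_cont = best_breakpoint(xs, y2)
print("least-squares breakpoint, data with jump:", k_jump)
print("least-squares breakpoint, shifted data:  ", k_cont)
```

On the shifted data the two segments meet exactly at x = 50, so the grid minimum should land there. On the original data the least-squares criterion should prefer a breakpoint well away from 50, which is the behaviour the reply describes.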