Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: nl hockey estimation


From   Steve Samuels <[email protected]>
To   [email protected]
Subject   Re: st: nl hockey estimation
Date   Fri, 10 Aug 2012 13:18:22 -0400

Quite right, Nick. I was confused by the "y" on the RHS
of your y2 equation.

I meant to write
gen y = cond(x < 50, x, 100 - x)

which _is_ a segmented line.

I'd just note that your y2 is perhaps easier to 
understand as:

gen y2 = cond(x < 50, x, 3*x - 100)
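
As a quick check, here is a minimal sketch that rebuilds the simulated data from the original post and confirms that the two ways of writing y2 agree. The variable names y2_nick and y2_steve are made up for this illustration; only built-in commands are used.

```stata
* Rebuild the test data from the original post (y jumps from 49 to 150)
clear
set obs 100
gen x = _n
gen y = cond(x < 50, x, 3*x)

* Nick's construction, which subtracts 100 from the upper segment
gen y2_nick = cond(x < 50, y, y - 100)

* The same thing written directly in terms of x
gen y2_steve = cond(x < 50, x, 3*x - 100)

* Stops with an error if the two versions differ in any observation
assert y2_nick == y2_steve

* Inspect the neighbourhood of the intended breakpoint
list x y y2_steve if inrange(x, 48, 51), noobs
```

At x = 49 and x = 50 the original y is 49 and 150, whereas y2 is 49 and 50, so y2 changes slope at 50 without jumping.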


Al, I gave the earliest reference I knew to the "hockey stick" terminology
problem in:

http://www.stata.com/statalist/archive/2010-04/msg01712.html


nlhockey.ado is part of the loghockey package. One can see this and others that
Mark Lunt has written by typing:

net from http://personalpages.manchester.ac.uk/staff/mark.lunt/

Steve
[email protected]

On Aug 10, 2012, at 10:49 AM, Nick Cox wrote:

No; on this occasion I meant what I wrote. What you suggest shows
another discontinuity.
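
To see the discontinuity concretely, a small sketch applying the suggested cond(x < 50, y, 100 - y) to the simulated data from the original post. The name y2_alt is hypothetical and used only here.

```stata
* Test data as in the original post
clear
set obs 100
gen x = _n
gen y = cond(x < 50, x, 3*x)

* The suggested alternative: reflect the upper segment about 100
gen y2_alt = cond(x < 50, y, 100 - y)

* Step between the last point of the lower segment and the first of the upper
display y2_alt[50] - y2_alt[49]
```

The series falls from 49 at x = 49 to -50 at x = 50, so the two segments still do not meet.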

Nick

On Fri, Aug 10, 2012 at 3:42 PM, Steve Samuels <[email protected]> wrote:
> 
> Nick, I think you meant:
> 
> gen y2 = cond(x <50, y, 100 - y)
> 
> Steve
> [email protected]
> 
> On Aug 9, 2012, at 9:09 PM, Nick Cox wrote:
> 
> The word "clearly" here is questionable. Your test data show a big
> discontinuity; they aren't a segmented line which is what the model is
> looking for. The least squares criterion is being used and -nl- does
> the best it can to minimise the sum of _squared_ errors. The built-in
> aversion to very large errors is what is biting here.
> 
> If you work instead with
> 
> gen y2 = cond(x < 50, y, y - 100)
> nl hockey y2 x
> 
> you will get what you expect.
> 
> On this evidence the program is fine, but your test example won't work
> as you expect under LS. At a wild guess, L1-norm might give something
> nearer splitting the difference.
> 
> Nick
> 
> On Fri, Aug 10, 2012 at 12:34 AM, Jordan Silberman
> <[email protected]> wrote:
> 
>> I'm attempting to identify a breakpoint in a regression using the -nl
>> hockey- command (described here:
>> http://personalpages.manchester.ac.uk/staff/mark.lunt/nlhockey.hlp).
>> 
>> When I test this command using simple simulated data, I find that the
>> command doesn't identify the correct breakpoint. Here's an example:
>> 
>> set obs 100
>> gen x = _n
>> gen y = x if x < 50
>> replace y = x*3 if x > 49
>> nl hockey y x
>> 
>> The breakpoint should clearly be at 50; however, command output
>> identifies the breakpoint at 32.7.
>> 
>> So, 2 questions:
>> 
>> 1. Why might the -nl hockey- command be computing the wrong breakpoint?
>> 
>> 2. Can anyone recommend an alternate approach to identifying the
>> breakpoint in a 2-piece regression? Best would be something that's
>> been implemented in Stata in a straightforward way.
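
On question 2, one simple possibility, offered only as a sketch and not as what -nl hockey- does internally, is a grid search: for each candidate breakpoint c, fit a continuous two-segment line by regressing the outcome on x and max(x - c, 0), and keep the c with the smallest residual sum of squares. The sketch below uses only built-in commands. The names hinge, best_c and best_rss are made up for the illustration, the grid is restricted to integer values of x, and the data are the continuous y2 series, since the least-squares issue with the discontinuous y applies to any segmented-line fit.

```stata
* Continuous two-segment test data with a kink at x = 50
clear
set obs 100
gen x = _n
gen y2 = cond(x < 50, x, 3*x - 100)

* hinge holds max(x - c, 0) for the candidate breakpoint c
gen double hinge = .

* Missing (.) compares as larger than any number, so the first fit always wins
local best_rss = .
local best_c = .

forvalues c = 2/99 {
    quietly replace hinge = max(x - `c', 0)
    quietly regress y2 x hinge
    if e(rss) < `best_rss' {
        local best_rss = e(rss)
        local best_c = `c'
    }
}

display "breakpoint with smallest RSS: `best_c'"
```

With these noise-free data only c = 50 reproduces y2 exactly, so that is the value the search selects. With real data a finer grid, or the grid result used as a starting value for -nl-, may be preferable.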
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 


© Copyright 1996–2018 StataCorp LLC