RE: st: plotting a regression function with time-dummies indicating structural breaks


From   "Allan Reese (Cefas)" <allan.reese@cefas.co.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: plotting a regression function with time-dummies indicating structural breaks
Date   Tue, 10 May 2011 14:25:58 +0100

Back in March I had a similar problem to Oliver Jones's and discussed it
off-list with Nick Cox (see below, reading from the end). We let it drop,
but it clearly confuses more users than just me that "graph twoway
function" operates without the data matrix. Using names of variables in
the function to be plotted produces unexpected results.

Maybe the developers could look at the parsing and issue warning
messages, and at the documentation to clarify that italic "x" is a
literal and not (like "exp" and "range") a token to be substituted. The
better approach, when the X axis is a data variable, is Maarten's
suggestion of generating points on the line and -connect-ing them.
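
A minimal sketch of that approach, not taken from the thread: the data
are simulated and the names year, d1, d2, d1_year and yhat are
hypothetical stand-ins for TIME_t, Dummy_1 and Dummy_2 in Oliver's model.
It fits the regression, gets the fitted values with -predict-, and draws
them with -line- against the time variable.

```stata
* Simulated annual series with breaks in 1990 and 2000 (illustrative only)
clear
set obs 40
set seed 12345
generate year = 1970 + _n
generate d1 = year >= 1990
generate d2 = year >= 2000
generate y = 1 + 0.5*(year - 1970) + d1*(2 - 0.3*(year - 1970)) + 3*d2 + rnormal()
tsset year

* y = b_0 + b_1*year + d1*(b_2 + b_3*year) + b_4*d2,
* with the interaction entered as an explicit variable
generate d1_year = d1*year
regress y year d1 d1_year d2

* Fitted values are computed at the observed years, so the dummies
* take their actual values at every plotted point
predict yhat, xb

* Observed series plus the fitted line, joined in time order
twoway (tsline y) (line yhat year, sort)
```

Because yhat is an ordinary variable, it can be listed and checked
against the data before plotting, which -twoway function- does not allow.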

Allan

R Allan Reese
Senior statistician, Cefas
The Nothe, Weymouth DT4 8UB



--- Oliver Jones asked (statalist 3 May)
>> I wonder if there is an easy way to plot a regression of the form:
>>
>> y_t = b_0 + b_1*TIME_t + Dummy_1*(b_2 + b_3*TIME_t) + b_4*Dummy_2
[where b_0 etc are scalars and Dummy_1 etc are variables]
>>
>> So far I tried:
>> gen x = TIME_t
>> twoway ///
>>  (tsline y) ///
>>  (function y = b_0 + b_1*x + Dummy_1*(b_2 + b_3*x) + b_4*Dummy_2, ///
>>  range(`first_year' `last_year'))

and Maarten Buis replied:
> I tend to plot such graphs by first using -predict- or -adjust- to
> get (adjusted) predictions and then use -twoway line-. It is just
> too easy to make a typo in this type of -twoway function- calls.
-------------------------------------------------------------------




-----
From: Allan Reese (Cefas) [mailto:allan.reese@cefas.co.uk]
Sent: 18 March 2011 13:50
To: Nick Cox
Subject: RE: Bug in "twoway function" (v 11.1 used)

It's a nice exercise for the little grey cells - years of training that
f(x) is just a function led to the assumption that I should substitute
the "x" variable in the function call. Expression parsing should allow
the program to determine that f(x) with no x term should be constant.
As it's not necessarily evaluating f() where the data has values and
-price- is not a function, I would agree with your last point.

Allan 

----
From: Nick Cox [mailto:n.j.cox@durham.ac.uk]
Sent: 18 March 2011 13:36
To: Allan Reese (Cefas); 'Stata Technical Support'
Subject: RE: Bug in "twoway function" (v 11.1 used)

A better answer, perhaps, is that Stata takes your RHS in

[LHS =] RHS

and evaluates it, after substituting for any x with a grid of points on
[0,1].

If you rescale the range, it does so too.

The surprise is that it doesn't actually throw you out if there is no
reference to x.
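
A small sketch of that behaviour, added for illustration. The first two
commands use only the documented range() and n() options. The by-hand
version assumes the grid is evenly spaced, which is an assumption and
not something stated in this thread.

```stata
* Default: the expression is evaluated with x on a grid over [0,1]
twoway function y=sqrt(x)

* With range(): the same expression, now with x on a grid over [0,100]
twoway function y=sqrt(x), range(0 100) n(50)

* Rough emulation by hand (assumes 50 evenly spaced grid points)
clear
set obs 50
range x 0 100
generate y = sqrt(x)
twoway line y x
```

In all three cases x is a placeholder for the grid, not a variable read
from the dataset.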

Nick
n.j.cox@durham.ac.uk

----
From: Nick Cox
Sent: 18 March 2011 13:16
To: 'Allan Reese (Cefas)'; Stata Technical Support
Subject: RE: Bug in "twoway function" (v 11.1 used)

I think the question is whether this is Stata's misfeature or yours.

Perhaps -twoway function- should reject your syntax whenever it doesn't
use -x-, which is explicitly expected, but it does the best it can to
make sense of what you said. The effect appears to be equivalent to

gen obs = _n
scatter price obs

but scaled to the range of price.

Evidently your syntax is not illegal, otherwise it would have been
rejected, but it does not produce anything useful.

I think this is on the level of if you ask a strange question, you may
get a puzzled reply!

Nick
n.j.cox@durham.ac.uk

----
From: Allan Reese (Cefas) [mailto:allan.reese@cefas.co.uk]
Sent: 18 March 2011 11:11
To: Stata Technical Support
Cc: Nick Cox
Subject: Bug in "twoway function" (v 11.1 used)

Documentation for -twoway function- states:
You type -twoway function y=sqrt(x)-
It makes no difference whether y and x are variables in your data.

Unfortunately it does. Using the auto data, compare

. scatter weight price || function y=x, range(price) n(20)

. scatter weight price || function y=price, range(price) n(20)

. scatter weight price || function y=price, range(price)

Copy price as "x" and plot y=x and it works, so what was meant appears
to be that you -must- define the f(x) in terms of the dummy "x" and a
variable called "x" in the dataset will not be used. But it is not
obvious to me how price is used in the second and third examples. As
there is no x, the function ought to be constant.

It's also confusing to write that the "... in range play[s] no part
unless option range(varname) is specified." The -in range- selects
cases.

Of course, only an idiot would not understand f(x)...
Allan

