Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: Why does this scalar calculation return the wrong value when using time series operators?

From	Aaron Kirkman <[email protected]>
To	[email protected]
Subject	Re: st: Why does this scalar calculation return the wrong value when using time series operators?
Date	Wed, 7 Nov 2012 13:16:52 -0600

Hi Steve/Nick,

Thank you to both of you for linking to this Stata Journal article. It
clears up a few of the misconceptions I had, and changing the scalar
name to something unambiguous, e.g. --sc_tstat-- solves the problem.
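
For anyone finding this thread later, here is a minimal sketch of the two
ways out, not a definitive recipe. The seed, the use of sum(rnormal()) to
build the random walk, and the name sc_tstat are illustrative choices only;
scalar() is the pseudofunction that forces the scalar interpretation when a
name could also be a variable.

```stata
clear all
set seed 12345                      // arbitrary seed, for this sketch only
set obs 200
gen t = _n                          // time variable named t, as in my original code
tsset t
gen double y = sum(rnormal())       // simple random walk (assumption: stands in for my y)
quietly regress D.y L.y

* The problematic name: a scalar t alongside a variable t
scalar t = _b[L.y] / _se[L.y]
display t                           // variable wins: shows t[1], which is 1
display scalar(t)                   // scalar() forces the scalar: shows the t-statistic

* The unambiguous name: no variable is called, or abbreviates to, sc_tstat
scalar sc_tstat = _b[L.y] / _se[L.y]
display sc_tstat
```

Either approach works; I went with the distinct name because it also guards
against a scalar name matching an abbreviation of some variable added later.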

Aaron

On Tue, Nov 6, 2012 at 4:07 PM, Nick Cox <[email protected]> wrote:
> A U.S.-based Stata friend privately queried "plumps for" as "British slang?"
>
> I wouldn't want to be obscure, so should spell out that "plumps for"
> means "chooses" in this context.
>
> Last I heard the language was called "English"....
>
> On Tue, Nov 6, 2012 at 9:45 PM, Nick Cox <[email protected]> wrote:
>> The short answer is that your "scalar calculation" is no such thing.
>> You are asking to
>>
>> . di t
>>
>> -- thinking that scalar t will be displayed --
>>
>> but Stata has three rules that together cause this to do something different.
>>
>> First off, variables and scalars share the same namespace.
>>
>> Second, if there's ambiguity Stata plumps for the variable name interpretation.
>>
>> Third, if asked to display a variable, -display- tries its best and
>> its best is varname[1], here t[1], here 1.
>>
>> See also
>>
>> SJ-6-2  dm0021  . Stata tip 31: Scalar or variable? Problem of ambiguous names
>>         . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  G. I. Kolev
>>         Q2/06   SJ 6(2):279--280                                 (no commands)
>>         tips for avoiding abbreviation conflicts with variables
>>         when naming scalars
>>
>> You're in very good company in being puzzled by this, as only a
>> concatenation of circumstances explains it all.
>>
>> Nick
>>
>> On Tue, Nov 6, 2012 at 9:25 PM, Aaron Kirkman <[email protected]> wrote:
>>
>>> I'm performing a simple linear regression on time series data and
>>> calculating the t-statistic for coefficients afterwards. However, I
>>> noticed that when using time series operators, the t-statistic always
>>> calculates to be one, even though the values from the regression are
>>> correct. For example, this code:
>>>
>>> ----------
>>> clear all
>>>
>>> set seed Xc0114d4971eea6310add269363d61a6d00042c5f
>>> local y0 0
>>> set obs 200
>>> quietly {
>>>      gen y = .
>>>      gen t = _n
>>>      tsset t
>>>
>>>      replace y = cond(t == 1, `y0', L.y + rnormal())
>>>      regress D.y L.y
>>> }
>>>
>>> di _b[L.y]
>>> di _se[L.y]
>>> di _b[L.y] / _se[L.y]
>>>
>>> scalar t = _b[L.y] / _se[L.y] // t = Beta / SE
>>>
>>> di t
>>> ----------
>>>
>>> outputs the following:
>>>
>>> -.02092465     // _b[L.y]
>>> .01391362      // _se[L.y]
>>> -1.5038971     // _b[L.y] / _se[L.y]
>>> 1              // scalar "t"
>>>
>>> The first three numbers are the correct values from the regression,
>>> but the calculation for the t-statistic is incorrect. If I remove the
>>> time series operators from the code and instead refer to observation
>>> numbers (I would prefer to use time series operators, but just as an
>>> example), the resulting t-statistic is correct:
>>>
>>> ----------
>>> clear all
>>> set seed Xc0114d4971eea6310add269363d61a6d00042c5f
>>> local y0 0
>>> set obs 200
>>> quietly {
>>>     gen y = .
>>>     gen ly = .
>>>     gen dy = .
>>>
>>>     replace y = cond(_n == 1, `y0', y[_n - 1] + rnormal())
>>>     replace ly = y[_n - 1]
>>>     replace dy = y - ly
>>>
>>>     regress dy ly
>>> }
>>>
>>> di _b[ly]
>>> di _se[ly]
>>> di _b[ly] / _se[ly]
>>>
>>> scalar t = _b[ly] / _se[ly] // t = Beta / SE
>>> di t
>>> ----------
>>>
>>> This code outputs the correct t-statistic of -1.5038971:
>>>
>>> -.02092465     // _b[ly]
>>> .01391362      // _se[ly]
>>> -1.5038971     // _b[ly] / _se[ly]
>>> -1.5038971     // scalar "t"
>>>
>>>
>>> I read through "[U] 13.5 Accessing coefficients and standard errors"
>>> and --help scalar--, but I don't see anything in either of them
>>> that would explain the problem. Any ideas?
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
