st: RE: AW: correlate lag variables

From   "Nick Cox" <>
To   <>
Subject   st: RE: AW: correlate lag variables
Date   Mon, 10 May 2010 11:38:30 +0100

The reason for the differences is that -correlate- uses only those observations for which _all_ the variables specified are non-missing. As Martin is implying, -pwcorr- is more indulgent: for each pair of variables it uses every observation that is non-missing on that pair. That is not necessarily a feature.

The output for -correlate- made it clear that different numbers of observations were being used. 
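A minimal sketch of the point, using simulated data rather than Julia's (the seed, variable names and sample size are illustrative assumptions). It compares the number of observations -correlate- uses with and without the second lag, and then restricts the two-variable call to the observations that the three-variable call uses.

```stata
clear
set seed 12345
set obs 100
gen time = _n
tsset time
gen y = rnormal()

* three variables: rows 1 and 2 drop out because L2.y is missing there
quietly correlate y L.y L2.y
display "y L.y L2.y: N = " r(N)
scalar N_three = r(N)
scalar rho_three = r(rho)      // r(rho) is the correlation of the first two variables

* two variables: only row 1 drops out
quietly correlate y L.y
display "y L.y:      N = " r(N)
scalar N_two = r(N)

* two variables, restricted to rows where L2.y is also non-missing
quietly correlate y L.y if L2.y < .
scalar N_restricted = r(N)
scalar rho_restricted = r(rho)
display "difference in rho: " rho_three - rho_restricted
```

With the restriction in place, the correlation between y and L.y should agree with the three-variable result up to rounding, because the same observations are used.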

At a guess, Julia's data are panel data, so every extra lag bites hard: each increase in lag by 1 necessarily loses one more observation at the start of each panel, where no earlier value exists. So the first observation in each panel cannot be used with lag one, nor the first two with lag two, and so forth.
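To see how many observations each extra lag costs in panel data, here is a small sketch with a made-up balanced panel (5 panels of 10 periods; the names id, time and y are assumptions, not Julia's variables). A lagged value is missing wherever the same panel has no observation that many periods earlier.

```stata
clear
set seed 2010
set obs 50
gen id = ceil(_n/10)           // 5 panels
bysort id: gen time = _n       // 10 periods per panel
xtset id time
gen y = rnormal()

* one missing lag per panel
count if missing(L.y)
scalar miss_L1 = r(N)

* two missing second lags per panel
count if missing(L2.y)
scalar miss_L2 = r(N)

* which rows are affected
list id time if missing(L2.y), sepby(id)
```

In this balanced example the loss is one observation per panel per lag; with gaps in the time variable the loss would be larger.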


Martin Weiss

Try -pwcorr- instead:

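clear
* arbitrary seed, added here so that the run is reproducible
set seed 2010
set obs 100
gen y=1
replace y =.6*y[_n-1]+rnormal() in 2/l
gen byte time=_n
tsset time
* casewise deletion: uses only observations with y, L.y and L2.y all non-missing
corr y L.y L2.y
* pairwise deletion: the entry for y and L.y is the same in both commands
pwcorr y L.y
pwcorr y L.y L2.y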
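As a hedged follow-up to the example above, the -obs- option of -pwcorr- prints the number of observations behind each entry, which makes the pairwise samples visible. The data here are again simulated and the names are illustrative; the sketch relies on r(rho) referring to the first two variables after both commands.

```stata
clear
set seed 2010
set obs 100
gen time = _n
tsset time
gen y = rnormal()

* -obs- adds the number of observations used for each entry
pwcorr y L.y L2.y, obs
scalar rho_pw = r(rho)

* the pairwise entry for y and L.y matches -correlate- on just those two
quietly correlate y L.y
scalar rho_corr = r(rho)
display "relative difference: " reldif(rho_pw, rho_corr)
```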


I would like to calculate the correlation between a variable and its
past values. Thus, I use the following command:

. correlate BI L1.BI L2.BI

             |                 L.      L2.
             |       BI       BI       BI
         --. |   1.0000
         L1. |   0.0111   1.0000
         L2. |   0.0647   0.0161   1.0000

However, if I ask only for the correlation with the first lag, my result changes:

. correlate BI L1.BI

             |                 L.
             |       BI       BI
         --. |   1.0000
         L1. |   0.0174   1.0000

 Why does excluding the second lag affect the correlation between the
variable and its first lag?
