Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: Panel data: large number of linear time trends

From: ron alfieri
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Panel data: large number of linear time trends
Date: Thu, 10 May 2012 23:53:47 -0400

Thanks Austin!

On Thu, May 10, 2012 at 10:05 AM, Austin Nichols
<austinnichols@gmail.com> wrote:
> ron alfieri <ron.alfieri18@gmail.com>:
> You are using different samples in different detrending regressions.
> It is easy to constrain samples, though:
>
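> clear all
> * Regress the first variable on the rest within each by-group and
> * store the residuals in the existing variable named in detrend().
> prog mydetrend, rclass byable(recall)
> version 10.1
> syntax varlist [if] [in], DETrend(varname)
> tempvar eps
> * marksample restricts to the current by-group and drops missing values
> marksample touse
> regress `varlist' if `touse'
> predict double `eps' if e(sample), res
> replace `detrend' = `eps' if e(sample)
> end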
>
> webuse grunfeld
> replace invest = . in 4
> replace invest = . in 6
> replace mvalue = . in 8
> replace mvalue = . in 13
> replace invest = . in 6
> replace invest = . in 7
> replace invest = . in 11
> replace invest = . in 15
> replace invest = . in 21
>
> g i_dtr = .
> g mv_dtr = .
> g m=mvalue if !mi(invest)
> g i=invest if !mi(mvalue)
> by company: mydetrend i year, det(i_dtr)
> by company: mydetrend m year, det(mv_dtr)
> areg mv_dtr i_dtr, abs(company)
> reg mvalue c.invest c.year##i.company
>
>
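
The same common-sample comparison can be sketched without the mydetrend
program. This is only a minimal sketch, not the method from the linked post:
the names common, resid_tmp, invest_dtr and mvalue_dtr are made up for
illustration, and only two of the missing values above are reproduced. It
assumes Stata 11 or later for the factor-variable notation.

```stata
clear all
webuse grunfeld
* make the panel unbalanced (two illustrative missing values)
replace invest = . in 4
replace mvalue = . in 8

* flag rows where both variables are observed, i.e. the rows
* that the interacted regression would use
generate byte common = !missing(invest, mvalue)

levelsof company, local(ids)
foreach v in invest mvalue {
    generate double `v'_dtr = .
    foreach c of local ids {
        * company-specific linear trend, common sample only
        quietly regress `v' year if company == `c' & common
        quietly predict double resid_tmp if e(sample), residuals
        quietly replace `v'_dtr = resid_tmp if e(sample)
        drop resid_tmp
    }
}

* slope from the detrended variables
quietly areg mvalue_dtr invest_dtr, absorb(company)
scalar b_detr = _b[invest_dtr]

* slope from the fully interacted model
quietly regress mvalue c.invest c.year##i.company
scalar b_full = _b[invest]

display "detrended: " b_detr "   interacted: " b_full
```

On a common sample the two displayed slopes should agree up to rounding. The
standard errors need not, because the detrended regression does not account
for the degrees of freedom used up by the company-specific trends.
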
> On Wed, May 9, 2012 at 8:15 PM, ron alfieri <ron.alfieri18@gmail.com> wrote:
>> Thank you Austin! It seems that the differences are due to my panel
>> being unbalanced. Using the prior example, you can see that the two
>> methods produce different results once some observations are dropped
>> to make the panel unbalanced.
>>
>> clear all
>> prog mydetrend, rclass byable(recall)
>> version 10.1
>> syntax varlist [if] [in], DETrend(varname)
>> tempvar eps
>> marksample touse
>> regress `varlist' if `touse'
>> predict double `eps' if e(sample), res
>> replace `detrend' = `eps' if e(sample)
>> end
>>
>> webuse grunfeld
>> replace invest = . in 4
>> replace invest = . in 6
>> replace mvalue = . in 8
>> replace mvalue = . in 13
>> replace invest = . in 6
>> replace invest = . in 7
>> replace invest = . in 11
>> replace invest = . in 15
>> replace invest = . in 21
>>
>> g i_dtr = .
>> g mv_dtr = .
>> by company: mydetrend invest year, det(i_dtr)
>> by company: mydetrend mvalue year, det(mv_dtr)
>> areg mv_dtr invest, abs(company)
>> areg mv_dtr i_dtr, abs(company)
>> reg mvalue c.invest c.year##i.company
>>
>>
>>> If you can run the interacted version, e.g.
>>>  reg mvalue c.invest c.year##i.company
>>> in the link cited, why wouldn't you?
>>
>> Because I have too many zip codes to include them all as covariates.
>>
>> Thanks again.
>>
>> On Wed, May 9, 2012 at 4:43 PM, Austin Nichols <austinnichols@gmail.com> wrote:
>>> ron alfieri <ron.alfieri18@gmail.com>:
>>> You don't show what you typed, and it is not clear what you mean by:
>>> "an interaction between the fixed effect for each zip code and a
>>> linear time trend"
>>> --if you mean you interacted a full set of dummies with time, then I
>>> would expect the same point estimates in both.
>>>
>>> Are you neglecting to mention other covariates perhaps?
>>>
>>> If you can run the interacted version, e.g.
>>>  reg mvalue c.invest c.year##i.company
>>> in the link cited, why wouldn't you?
>>>
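
Why the point estimates should coincide can be written out with the
Frisch-Waugh-Lovell result. The notation below (y, x, Z, M) is introduced
here for illustration only and assumes a single regressor x plus a
unit-specific intercept and linear trend.

```latex
% Model with unit-specific intercepts and linear trends
\[
  y_{it} = \beta x_{it} + \alpha_i + \gamma_i t + \varepsilon_{it}
\]
% Z spans the unit dummies and their interactions with t
\[
  M_Z = I - Z (Z'Z)^{-1} Z', \qquad
  \hat{\beta} = (x' M_Z x)^{-1} x' M_Z y
\]
% With rows sorted by unit, Z can be taken block diagonal,
% with Z_i = [1, t] on the rows of unit i
\[
  M_Z = \operatorname{diag}(M_1, \dots, M_N), \qquad
  M_i = I - Z_i (Z_i' Z_i)^{-1} Z_i'
\]
```

So M_Z y and M_Z x are just the residuals from unit-by-unit regressions on a
constant and t, provided both use exactly the rows that enter the full
regression. If y and x are detrended on different rows, the result is no
longer M_Z applied to both, and the estimates can differ.
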
>>> On Wed, May 9, 2012 at 3:26 PM, ron alfieri <ron.alfieri18@gmail.com> wrote:
>>>> I am trying to estimate a panel data model with a large number of
>>>> unit-specific linear time trends (one for each zip code).
>>>>
>>>> I am using the method proposed here:
>>>>
>>>> http://www.stata.com/statalist/archive/2012-02/msg01108.html
>>>>
>>>> Using a subset of my data, I tried using your method and then compared
>>>> the results to the results from a model where I include zip-code
>>>> specific time trends by adding as covariates an interaction between
>>>> the fixed effect for each zip code and a linear time trend.
>>>>
>>>> The results are very similar, but not identical.
>>>>
>>>> This is how I am interpreting the differences. When de-trending the
>>>> data one zip code at a time, your code uses only the data points
>>>> from that zip code. However, all data points are used when estimating
>>>> zip-code-specific trends by adding as covariates the interactions
>>>> between the fixed effect for each zip code and a linear trend (by
>>>> “all data points” I mean even the data points where these interactions
>>>> take the value zero, which are not used when doing it one zip code at
>>>> a time).
>>>>
>>>> I would appreciate any comments on whether I am interpreting the
>>>> differences between these two methods correctly. If anyone has an
>>>> insight on whether one of the methods is more “appropriate” than the
>>>> other that would be great.
>>>>
>>>> Aaron
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
