Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: How to get mean coefficients and t-statistics from several regressions
From: Richard Herron <[email protected]>
To: [email protected]
Subject: Re: st: How to get mean coefficients and t-statistics from several regressions
Date: Tue, 9 Jul 2013 07:00:25 -0400

Correction, the correct spelling is Petersen (not Peterson).
On Tue, Jul 9, 2013 at 6:57 AM, Richard Herron
<[email protected]> wrote:
> Peterson addresses some programming aspects on his [Website][1], which
> is a companion to his 2009 RFS paper. I think -ivreg2- from SSC also
> does two-way clustering (-ssc install ivreg2-).
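
A minimal sketch of the two-way clustering mentioned above. It assumes the SSC versions of -ivreg2- and -ranktest- (which -ivreg2- depends on) and uses the Grunfeld panel from Stata's web datasets as a stand-in; the dataset and its variable names (invest, mvalue, kstock, company, year) are illustrative and not from this thread.

```stata
* one-time installs; ranktest is a dependency of ivreg2
ssc install ranktest, replace
ssc install ivreg2, replace

* stand-in panel: 10 firms observed over 20 years
webuse grunfeld, clear

* OLS with standard errors clustered by both firm and year
ivreg2 invest mvalue kstock, cluster(company year)
```

With only 10 firms and 20 years, this stand-in has few clusters in both dimensions, so the output illustrates the syntax rather than trustworthy inference.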
>
> Angrist and Pischke (2008) recommend at least 42 clusters, so with 19
> years the cure may be worse than the disease. Fama tends to adjust his
> rejection threshold rather than correct standard errors and usually
> provides a clear, concise discussion of his logic. I recall Fama and
> French (1998) set the rejection threshold at t=3 and Fama and French
> (2002) set the rejection threshold at t=5.
>
> The identification strategy is also important. Using -reg- gives you
> the pooled-panel estimator, while -xtreg, fe- gives you the within
> estimator (i.e., identification using within firm variation). The
> Fama-MacBeth regression identifies using cross-sectional variation,
> then takes the time-series average.
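
The three estimators in the paragraph above can be put side by side. This is only a sketch under assumptions: it uses the Grunfeld panel from Stata's web datasets as a stand-in, and the new variable names (b_mvalue, se_mvalue, T, t_mvalue) are made up for illustration.

```stata
webuse grunfeld, clear
xtset company year

* pooled-panel estimator
regress invest mvalue kstock, vce(cluster company)

* within estimator: identification from within-firm variation
xtreg invest mvalue kstock, fe vce(cluster company)

* Fama-MacBeth: one cross-sectional regression per year ...
statsby _b, by(year) clear: regress invest mvalue kstock

* ... then the time-series mean of the yearly coefficients and its SE
collapse (mean) b_mvalue = _b_mvalue ///
    (semean) se_mvalue = _b_mvalue ///
    (count) T = _b_mvalue
generate t_mvalue = b_mvalue / se_mvalue
list
```

The last step treats the yearly coefficients as independent draws, which is the assumption the Fama-MacBeth standard error rests on.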
>
> The approach in your paper (Fama-MacBeth by industry) is akin to
> industry fixed effects, although not exactly because it allows all
> coefficients to vary by industry, not just the intercept. I'm not
> familiar with your literature and can't say which is the correct
> specification. There may be a key paper in this literature that
> justifies this approach over firm fixed effects and how they correct
> standard errors for between industry correlation (or if it's
> necessary).
>
> Angrist, J.D., Pischke, J.-S., 2008. Mostly Harmless Econometrics: An
> Empiricist's Companion. Princeton University Press.
>
> Fama, E.F., French, K.R., 1998. Taxes, financing decisions, and firm
> value. The Journal of Finance 53, 819–843.
>
> Fama, E.F., French, K.R., 2002. Testing trade-off and pecking order
> predictions about dividends and debt. Review of Financial Studies 15,
> 1–33.
>
> [1]:http://www.kellogg.northwestern.edu/faculty/petersen/htm/papers/se/se_programming.htm
>
> On Mon, Jul 8, 2013 at 10:49 AM, Nahla Betelmal <[email protected]> wrote:
>> Hi Richard, I have a few questions and I would be grateful if you could
>> help me. I read the two references and also
>>
>> Samuel B. Thompson, 2010, Simple formulas for standard errors that
>> cluster by both firm and time, Journal of Financial Economics.
>>
>> 1- What is the difference (in terms of Stata commands) between time
>> clustering and the Fama-MacBeth time effect? Petersen (2005) reports both
>> in Table 6. Unfortunately, he did not state the commands he used to
>> derive the results.
>>
>> Let's assume that there is a "year" variable in the database. Then:
>>
>> statsby _b e(r2), by(year): regress price weight
>>
>> Does this represent the Fama-MacBeth time effect?
>>
>> xtreg  price weight year1 year2... yeark, fe cluster (year)
>> reg  price weight year1 year2... yeark, cluster (year)
>> Which of these two, if any, represents what Petersen reported in
>> Table 6, column III (cluster by time)? Please note that Petersen
>> included time dummies in columns I-IV.
>>
>> regress price weight, cluster (year)
>>
>> According to a footnote on page 4 of Thompson (2010), this is the
>> cluster-by-time command.
>>
>> 2- According to Thompson (2010), we can account for both time and firm
>> effects; however, we need a minimum of 25 observations in both
>> dimensions. In my case I have 57 sectors but only 19 years, so I
>> cannot follow Thompson's double clustering.
>>
>> Again, Petersen was not clear about the double clustering he
>> performed. In the text on page 23, he said to account for one dimension
>> (time) with dummies while clustering by the other dimension (firm).
>> However, the results in Table 6 are confusing.
>>
>> Column II should represent firm clustering; however, it includes time
>> dummies. Column IV represents firm and time clustering, which also
>> includes time dummies! What is the difference between column II and
>> column IV?
>>
>> What is the Stata command I can use to account for both time and firm effects?
>>
>>
>> I would highly appreciate it if you could help me clear things up. Thank
>> you for your time and help.
>>
>> Regards
>>
>> Nahla
>>
>>
>>
>> On 5 July 2013 17:27, Nahla Betelmal <[email protected]> wrote:
>>> Yes, this is exactly what I meant. Thank you Richard. Especially for
>>> the note about time correlation and the great references. Thank you so
>>> much.
>>>
>>> Best Regards
>>>
>>> Nahla
>>>
>>>
>>> On 5 July 2013 15:27, Richard Herron <[email protected]> wrote:
>>>> I think you want the mean beta across industries and the t-stat based
>>>> on the associated SE.
>>>>
>>>> * begin code
>>>> sysuse auto, clear
>>>> statsby _b e(r2), by(rep78): regress price weight
>>>>
>>>> * get mean betas and R2
>>>> collapse (mean) _b_cons _b_weight _eq2_stat_1 ///
>>>> (semean) _se_cons = _b_cons _se_weight = _b_weight
>>>>
>>>> * get t-stat for mean betas
>>>> foreach v in cons weight {
>>>> generate _t_`v' = _b_`v' / _se_`v'
>>>> }
>>>> list
>>>> * end code
>>>>
>>>> This is a different take on Fama and MacBeth (1973), who do
>>>> cross-sectional regressions each month/year then take the time series
>>>> mean and SE of the regression coefficients.
>>>>
>>>> This works because in asset pricing the time series correlation is low
>>>> (i.e., random walk). Here there may be correlation between the
>>>> industries, which this technique doesn't correct and could bias down
>>>> the SEs (they could address this in the paper - I didn't read).
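
The arithmetic behind the -collapse- step in the code above can be written out. The notation is introduced here for illustration and is not from the thread: b_i is the coefficient from industry i's regression and N is the number of industries.

```latex
\bar{b} = \frac{1}{N}\sum_{i=1}^{N} b_i, \qquad
s_b^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(b_i - \bar{b}\right)^2, \qquad
\mathrm{SE}(\bar{b}) = \frac{s_b}{\sqrt{N}}, \qquad
t = \frac{\bar{b}}{\mathrm{SE}(\bar{b})}

% In general the variance of the mean also carries the covariances:
\mathrm{Var}(\bar{b}) = \frac{1}{N^2}\left[\sum_{i=1}^{N}\mathrm{Var}(b_i)
  + 2\sum_{i<j}\mathrm{Cov}(b_i, b_j)\right]
```

The s_b / sqrt(N) formula is appropriate when the covariance terms are zero. If the industry coefficients are positively correlated, those terms are positive and the formula understates the SE, which is the downward bias noted above.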
>>>>
>>>> Mitchell Peterson (2009) provides a great summary of ways to address
>>>> panel data in finance research.
>>>>
>>>> Fama, E.F., MacBeth, J.D., 1973. Risk, return, and equilibrium:
>>>> Empirical tests. The Journal of Political Economy 607–636.
>>>>
>>>> Petersen, M.A., 2009. Estimating standard errors in finance panel data
>>>> sets: Comparing approaches. Review of Financial Studies 22, 435–480.
>>>>
>>>> On Fri, Jul 5, 2013 at 9:56 AM, Nahla Betelmal <[email protected]> wrote:
>>>>> Thank you, I will keep looking and searching and will let you know if
>>>>> I find out how to do it (both statistically and command-wise).
>>>>> Many thanks again, I highly appreciate it.
>>>>>
>>>>> Nahla
>>>>>
>>>>> On 5 July 2013 14:48, Maarten Buis <[email protected]> wrote:
>>>>>> I agree that the mean t-statistic is not very useful. I just
>>>>>> interpreted your initial question as that you wanted to know that, so
>>>>>> I gave it to you. Also, look at the dataset that -statsby- created. If
>>>>>> you find the formula the author used, you in all likelihood want to
>>>>>> use that dataset to do the manipulations.
>>>>>>
>>>>>> -- Maarten
>>>>>>
>>>>>> On Fri, Jul 5, 2013 at 3:38 PM, Nahla Betelmal <[email protected]> wrote:
>>>>>>> Thanks again. This is one of the pioneering papers in the field, if not
>>>>>>> the first. Again, thanks for the mathematics you gave me. But I do
>>>>>>> believe that it is not the right way "statistically" to get the
>>>>>>> matched t-statistics (it cannot be the arithmetic mean of the
>>>>>>> t-statistics). I will keep looking in other statistical references
>>>>>>> for how to do it, and I will search other Stata sources for the Stata
>>>>>>> command; there must be one! The paper mentions that the authors used
>>>>>>> SAS.
>>>>>>>
>>>>>>> Thank you again, I am very grateful for your time and for trying to help.
>>>>>>> Very kind of you.
>>>>>>>
>>>>>>> Nahla
>>>>>>>
>>>>>>> On 5 July 2013 14:26, Maarten Buis <[email protected]> wrote:
>>>>>>>> I would start with understanding the statistics before worrying about
>>>>>>>> how to program it. I have only briefly looked at the paper, but I am
>>>>>>>> suspicious about its value. I might be wrong. Anyhow, what I have
>>>>>>>> given you is a way to create a dataset that contains the different
>>>>>>>> pieces of information from each regression. It is now up to you to
>>>>>>>> find a meaningful way to use those bits.
>>>>>>>>
>>>>>>>> -- Maarten
>>>>>>>>
>>>>>>>> On Fri, Jul 5, 2013 at 3:00 PM, Nahla Betelmal <[email protected]> wrote:
>>>>>>>>> Dear Maarten,
>>>>>>>>> Thanks for the reply, but I do not think that I misunderstood the
>>>>>>>>> articles. Kindly have a look at Table 3 and its notes, page 44 in the
>>>>>>>>> following link.
>>>>>>>>>
>>>>>>>>> http://econ.au.dk/fileadmin/Economics_Business/Education/Summer_University_2012/6308_Advanced_Financial_Accounting/Advanced_Financial_Accounting/7/Dechow_Dichev_TAR_2002.pdf
>>>>>>>>>
>>>>>>>>> Also, I have only modest knowledge of statistics. As far as I know,
>>>>>>>>> we can report mean coefficients and a mean R2, but it is wrong to pair
>>>>>>>>> the mean coefficient with the mean t-statistic (and hence standard
>>>>>>>>> error). (We can do it mathematically, but it is wrong conceptually.)
>>>>>>>>>
>>>>>>>>> For example, we cannot say that the t-statistic for B1+B2 is
>>>>>>>>> t-statistic(B1) + t-statistic(B2).
>>>>>>>>>
>>>>>>>>> It needs to be derived from the distribution of the coefficients.
>>>>>>>>> Unfortunately I do not know how to do it.
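
The point about B1+B2 can be checked within a single regression. The sketch below uses the auto dataset purely for illustration; -lincom- computes the standard error of the sum from the full covariance matrix of the estimates, so its t-statistic generally differs from the sum of the two separate t-statistics.

```stata
sysuse auto, clear
regress mpg gear_ratio turn

* t-statistic of the sum, using Var(b1) + Var(b2) + 2*Cov(b1, b2)
lincom gear_ratio + turn
display "t for the sum:      " r(estimate) / r(se)

* naive sum of the two separate t-statistics, for comparison
display "sum of separate t's: " ///
    _b[gear_ratio] / _se[gear_ratio] + _b[turn] / _se[turn]
```

This covers a sum of coefficients from one regression. Averaging coefficients across separate regressions is a different calculation, based on the spread of the coefficients themselves.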
>>>>>>>>>
>>>>>>>>> I would highly appreciate any help in that
>>>>>>>>>
>>>>>>>>> Thank you again
>>>>>>>>>
>>>>>>>>> Nahla
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 5 July 2013 13:39, Maarten Buis <[email protected]> wrote:
>>>>>>>>>> On Fri, Jul 5, 2013 at 2:24 PM, Nahla Betelmal wrote:
>>>>>>>>>>> My data represent 100 industries across a certain time horizon. It
>>>>>>>>>>> seems from the literature that a regression is run for each industry
>>>>>>>>>>> (i.e. 100 regressions are run); however, only the mean coefficients,
>>>>>>>>>>> mean R-square, and t-statistic based on the distribution of the 100
>>>>>>>>>>> coefficients for each variable obtained from the 100 regressions are
>>>>>>>>>>> reported.
>>>>>>>>>>>
>>>>>>>>>>> I can run the 100 regressions in a loop; however, I do not know how
>>>>>>>>>>> I can get the mean coefficients, the mean R-square, and the t-statistic
>>>>>>>>>>> based on the distribution of several coefficients for each variable
>>>>>>>>>>> obtained from several regressions.
>>>>>>>>>>
>>>>>>>>>> I strongly suspect that you misunderstood what was done in those
>>>>>>>>>> articles, but you can do what you ask:
>>>>>>>>>>
>>>>>>>>>> *------------------ begin example ------------------
>>>>>>>>>> sysuse auto, clear
>>>>>>>>>> statsby _b _se e(r2), by(foreign): regress mpg gear turn
>>>>>>>>>>
>>>>>>>>>> // average coefficient for turn
>>>>>>>>>> sum _b_turn
>>>>>>>>>>
>>>>>>>>>> // average t-value for turn
>>>>>>>>>> gen t_turn = _b_turn / _se_turn
>>>>>>>>>> sum t_turn
>>>>>>>>>>
>>>>>>>>>> // average R2
>>>>>>>>>> sum _eq2_stat_1
>>>>>>>>>> *------------------- end example -------------------
>>>>>>>>>> * (For more on examples I sent to the Statalist see:
>>>>>>>>>> * http://www.maartenbuis.nl/example_faq )
>>>>>>>>>>
>>>>>>>>>> ---------------------------------
>>>>>>>>>> Maarten L. Buis
>>>>>>>>>> WZB
>>>>>>>>>> Reichpietschufer 50
>>>>>>>>>> 10785 Berlin
>>>>>>>>>> Germany
>>>>>>>>>>
>>>>>>>>>> http://www.maartenbuis.nl
>>>>>>>>>> ---------------------------------
>>>>>>>>>> *
>>>>>>>>>> *   For searches and help try:
>>>>>>>>>> *   http://www.stata.com/help.cgi?search
>>>>>>>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>>>>>>>> *   http://www.ats.ucla.edu/stat/stata/