# Re: st: Demeaning TFP

From: Nick Cox
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Demeaning TFP
Date: Fri, 12 Oct 2012 08:11:35 +0100

```
This seems to be pitched at the perhaps 5 (very wild guess) out of
5000 or so Statalist members who know exactly what this is, but they
are keeping quiet. So I will try a couple of wild guesses.

-levpet- is a user-written command published in the Stata Journal (SJ).

I gather that there is a quantity TFP which is exp(ln TFP) and that
for good or at least conventional reasons you are working with that on
a log scale.

Different things seem to be going on here in different tails.

The very large negative values are the analysis's only way to tell you
that TFP is approximately zero in some cases, except that there is
noise too (otherwise TFP would be impossible to estimate).

The larger positive values are to me more worrying. Whatever the
measurement units of TFP, ln TFP of 20 means that TFP is exp(20) or
about 0.5 billion. Is that a possible value? It seems more likely to me
that you are picking up artefacts of some kind, e.g. that there is a
very small data subset, something is nearly constant so that you are
dividing by a very small number, or something more complicated but
leading to inflated values. How many parameters are you estimating
overall and are there firms in which the number of data values is only
just larger?

Are there yet further firms in which there is no estimate at all
(missing values)?

So, the strategic question is definitely not how to move forwards. You
need diagnostic work to go back and find out what is going on. More
exploratory data analysis might reveal that it is futile to expect
this model to fit well to every individual firm and that you need
criteria to filter out firms not suitable.

Nick

On Wed, Oct 10, 2012 at 11:01 PM, SAM MCCAW <sam2stata@gmail.com> wrote:
> Also, would it be better just to center lnTFP by using:
>
>      bys country_no industry_no: center lnTFP, gen (lnTFP_pooled_centered) ?
>
> Thanks again.
>
>
> On Wed, Oct 10, 2012 at 5:53 PM, SAM MCCAW <sam2stata@gmail.com> wrote:
>> Dear All,
>>
>> I am trying to assess productivity spillovers from foreign to domestic
>> firms and am using levpet as my productivity estimation technique.
>>
>> After I run levpet (both on pooled sample and then industry by
>> industry) I get some gigantic values for lnTFP.
>>
>> My sample for running levpet includes both domestic and foreign firms,
>> i.e. I am not running levpet for domestic firms only.
>>
>> The summary stats for lnTFP are as follows:
>>
>>       Variable |    Obs        Mean   Std. Dev.        Min        Max
>> ---------------+-----------------------------------------------------
>>   lnTFP_pooled |  21923   -1247.003    106482.5  -1.43e+07   17.18935
>> lnTFP_industry |  18710    -5587274    3.56e+08  -4.57e+10   24.17001
>>
>> Now I would like to demean TFP but since I have so many really high
>> negative values, would demeaning lnTFP by country and industry just
>> make it worse?
>>
>> Also, I am afraid to restrict the sample to a certain level of lnTFP,
>> for example as far as -100.
>>
>> Preferably I would just restrict the sample to exclude outliers but I
>> have about 3000 firms which have lnTFP less than -50 in a total sample
>> of 21923.
>>
>> Also, if I go ahead and use these lnTFP values in my regression, my
>> elasticity magnitudes turn out to be 4 digits long. Yikes.
>>
>> I have tried data transformations and so on but am at a loss. Any
>> thoughts would be greatly appreciated.
```
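
The diagnostic questions raised in the reply (is exp(20) plausible, are there firms with no estimate, are some estimates based on very few observations) could be approached with something like the following. This is only a rough sketch, not the poster's or Nick's actual code: `lnTFP_industry`, `country_no` and `industry_no` are variable names taken from the thread, while `firm_id` is a hypothetical firm identifier and the cutoffs (-50, 20, -1e6) are arbitrary illustrations, not recommended thresholds.

```stata
* Sketch only: firm_id is a hypothetical identifier; cutoffs are illustrative.

* Scale check: what level of TFP does ln TFP = 20 imply?
display exp(20)

* How many observations have no estimate at all?
count if missing(lnTFP_industry)

* How many sit in each extreme tail?
* (Stata treats missing as larger than any number, hence the explicit
*  exclusion of missing values in the upper-tail count.)
count if lnTFP_industry < -50
count if lnTFP_industry > 20 & !missing(lnTFP_industry)

* Nonmissing count and spread by industry: small or wildly
* dispersed cells are candidates for closer inspection.
tabstat lnTFP_industry, by(industry_no) statistics(n min p50 max)

* Look at the observations behind the most extreme values.
list firm_id country_no industry_no lnTFP_industry ///
    if lnTFP_industry < -1e6, noobs
```

If the extreme values turn out to be concentrated in a few industries or in firms with very few observations, that would point toward the "small subset" explanation suggested above rather than toward genuine productivity differences.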