Re: st: Inefficiency measures greater than one for frontier commands

From	Federico Belotti <[email protected]>
To	[email protected]
Subject	Re: st: Inefficiency measures greater than one for frontier commands
Date	Fri, 3 May 2013 00:35:24 +0200

Dear Kolawole,

my thoughts on your comments are below.

On Apr 26, 2013, at 5:57 PM, ogundarikolawole wrote:

> Dear All,
> 
> I absolutely agree with the comments made on the above subject matter, but I have reservations about one or two of them. I do not think Kumbhakar and Lovell's definition follows basic intuition. I believe these authors reframed the definition of cost efficiency in their book to make it consistent with the definition of technical efficiency. Theoretically speaking, Kumbhakar and Lovell's definition is not correct.

> 
> Let me explain. Cost efficiency is the ratio of observed cost to the optimum cost. How? Well, you expect observed cost to be higher than the optimum cost, while you expect observed output to be lower than the optimum output. In this case, cost efficiency will always range from 1 to infinity, while technical efficiency will always be bounded between 0 and 1. This is basic logic, and this is what was implemented in Stata and FRONTIER 4.1.
> 
> But to keep the discussion in line with technical efficiency from the production function, it is necessary to take the inverse of CE, which in most cases is equivalent to economic efficiency. This is why Kumbhakar and Lovell revised this definition to fit with technical efficiency. If you look at the cost function in Kumbhakar and Lovell's book, it is obvious that the correct definition should be observed cost to optimum cost, not the other way round.

> 
> For the sake of uniformity of these parameters, it is permissible to take the inverse, depending on the software you are using. While some user-written software has been configured to estimate CE between 0 and 1, that does not mean that this is true theoretically.

In my view, it is Kumbhakar and Lovell (2000) who reframed the definition of Output Oriented (OO) technical efficiency so that it is bounded in the unit interval, and not the other way around. Elsewhere in the literature, for instance in the books by Fare and Grosskopf (1994) and Fried, Schmidt and Lovell (2004), OO technical efficiency is defined, relative to the output set P(x), as the ratio of the maximum feasible output to observed output

OO_TE = max {\phi : \phi * y \in P(x)}.

I believe that this is the natural way to define (radially) OO technical efficiency (following the Koopmans, 1951 definition and adapting the Farrell, 1957 definition to the OO case). Under this definition, \phi represents the maximum radial expansion of the output y that is feasible given inputs x, and it is bounded below by unity. Kumbhakar and Lovell (2000) instead consider OO_TE = [max {\phi : \phi * y \in P(x)}]^{-1}, thus imposing that OO technical efficiency lies in the unit interval.

On the other hand, a radial Input Oriented (IO) technical efficiency is naturally defined, relative to the input set L(y), as the ratio of the minimum feasible input bundle to the observed input bundle

IO_TE = min {\lambda : \lambda * x \in L(y)},

where 0 < \lambda <= 1 is the largest radial contraction of the inputs x that still allows y to be produced. This definition is coherent with Farrell (1957).

As for cost efficiency (Input Oriented (IO) by construction), the Kumbhakar and Lovell definition is already (theoretically) coherent with the seminal Farrell (1957) definition which, I believe, one must keep in mind when coding an SF command. Under this definition, IO technical efficiency is necessary but not sufficient for cost (overall) efficiency: the latter cannot be greater than the former, and the wedge between the two is the so-called IO allocative (or price) efficiency, so that CE = TE * AE. In this framework, cost efficiency can only be defined as the ratio of the minimum cost to the observed cost. In my view, this is basic logic.
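
As a purely numerical illustration of the Farrell-type relationship between IO technical, allocative and cost efficiency, here is a minimal Stata sketch. The three cost figures and all scalar names are hypothetical assumptions chosen for the example; they do not come from any data set discussed in this thread.

```stata
* Hypothetical cost figures for one producer (illustrative assumptions only)
* observed cost, w'x
scalar c_obs = 120
* cost of the radially contracted input bundle, w'(lambda*x)
scalar c_te  = 108
* minimum cost of producing y at prices w
scalar c_min = 100

* IO technical efficiency (equal to lambda, since prices are held fixed)
scalar te_io = c_te / c_obs
* IO allocative (price) efficiency
scalar ae_io = c_min / c_te
* cost (overall) efficiency: minimum cost over observed cost
scalar ce    = c_min / c_obs

display "TE = " %6.4f te_io "   AE = " %6.4f ae_io "   CE = " %6.4f ce
```

With all three measures expressed as "minimum over observed", CE is the product of TE and AE and can never exceed TE, which is the coherence argument made above.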

> 
> The use of CE = E(exp{u}|e) is not a mistake, and one should not confuse it with CE = E(exp{-u}|e). While the latter fits perfectly for a typical production function with (v-u), the former fits perfectly for a cost function because of the v+u in the error term. The v+u is introduced because we do not expect to have negative observed cost.

On the first point, I fully agree with you that it is a matter of perspective. Thus, provided the usual regularity conditions still hold, you can definitely define cost efficiency as the ratio of observed cost to the minimum cost. But then you have to define IO technical efficiency accordingly. Hence, if one defines (naturally) IO technical efficiency as described above, the use of CE = E(exp{u}|e) is not a mistake, but it is not natural and, most importantly, it is not coherent with the related definition of IO technical efficiency.
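
To make the two conventions concrete, the following is a minimal Stata sketch that evaluates both conditional expectations for a single observation of a normal-half normal cost frontier. The values of sigma_u, sigma_v and eps are arbitrary assumptions (not estimates from any model), and the scalar names are only illustrative. The formulas are the standard moments of a normal distribution truncated at zero.

```stata
* Illustrative parameter values (assumptions, not estimates from any model)
scalar sigma_u = 0.5
scalar sigma_v = 0.3
* composed residual of a cost frontier, eps = v + u
scalar eps     = 0.4

scalar sigma2 = sigma_u^2 + sigma_v^2
* location and scale of the normal that, truncated at zero,
* gives the distribution of u conditional on eps
scalar mu1    = eps * sigma_u^2 / sigma2
scalar sigma1 = sigma_u * sigma_v / sqrt(sigma2)
scalar z      = mu1 / sigma1

* E(exp{-u}|e): minimum cost over observed cost
scalar ce_min_obs = normal(z - sigma1) / normal(z) * exp(-mu1 + 0.5*sigma1^2)
* E(exp{u}|e): observed cost over minimum cost
scalar ce_obs_min = normal(z + sigma1) / normal(z) * exp( mu1 + 0.5*sigma1^2)

display "E(exp{-u}|e)  = " %8.6f ce_min_obs
display "E(exp{u}|e)   = " %8.6f ce_obs_min
display "1/E(exp{u}|e) = " %8.6f 1/ce_obs_min
```

The first measure cannot exceed one and the second cannot fall below one. The reciprocal of the second is also bounded by one, but it is not identical to the first, since the expectation of a reciprocal differs from the reciprocal of an expectation.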

On the skewness of the composed error: cost INefficiency enters the cost function with a positive sign simply because, when it is present, it should increase the observed cost, ceteris paribus. Symmetrically, technical INefficiency enters the production function with a negative sign because, when it is present, it should reduce the observed output, ceteris paribus.
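
The sign of the skewness can be checked with a small simulation. This is only a sketch: the seed, the sample size and the two scale parameters are arbitrary assumptions, and the variable names are illustrative.

```stata
clear
set seed 12345
set obs 100000

* symmetric noise and half-normal inefficiency (scales are arbitrary assumptions)
generate double v = rnormal(0, 1)
generate double u = abs(rnormal(0, 2))

* composed error of a cost frontier and of a production frontier
generate double eps_cost = v + u
generate double eps_prod = v - u

quietly summarize eps_cost, detail
scalar skew_cost = r(skewness)
quietly summarize eps_prod, detail
scalar skew_prod = r(skewness)

display "skewness of v + u (cost):       " %6.3f skew_cost
display "skewness of v - u (production): " %6.3f skew_prod
```

Since v is symmetric, any skewness in the composed error comes from u alone: it is positive when u is added and negative when u is subtracted.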

> 
> Theoretically, cost efficiency ranges from 1 to infinity. But for the sake of uniformity with the definition of technical efficiency, it is important to take the inverse, especially when one is also interested in calculating allocative efficiency. The best way to recover allocative efficiency is to divide CE, also known as EE (economic efficiency), by TE. So it is impossible to do this unless CE (or EE) is expressed on the same scale as TE.

Indeed. In my view, one starts by defining IO technical efficiency and then, introducing the behavioral objective of cost minimization and the fact that each producer faces input prices w, one can define cost efficiency... and this should be done coherently with the definition of IO technical efficiency that one has in mind.

> 
> 
> Finally, a CE of 1.2 means the firm incurs a cost that is 20% above the frontier cost or optimum cost. The optimum cost here is 1.00 or 100%. This also means a 20% cost inefficiency level, as mentioned by one of the contributors.
> 
> However, before I stop, it is very important that cost efficiency is estimated in a theoretically consistent manner. There is a need to impose homogeneity of degree one in the input prices to prevent negative cost. This is absolutely necessary to prevent the problems mentioned by the original author of this question.
> 
> Regards
> 
> 

My best regards,
Federico

> 
> Dr. Kolawole OGUNDARI
> JSPS Research Fellow
> Laboratory of Agricultural and Farm Management,
> Dept. of Agricultural and Resources Economics, Faculty of Agriculture,
> 
> Kyushu University, Hakozaki 6-10-1, Fukuoka, 812-8581, Japan.
> 
> ________________________________________
> From: [email protected] [[email protected]] on behalf of Federico Belotti [[email protected]]
> Sent: Friday, April 26, 2013 5:11 PM
> To: [email protected]
> Subject: Re: st: Inefficiency measures greater than one for frontier commands
> 
> Dear Aljar,
> 
> the definition used in Kumbhakar and Lovell's book is the only theoretical definition of cost efficiency. It is always bounded in the unit interval, and its empirical counterpart cannot range between 1 and \infty. You can find the same definition on page 53 of the Coelli, Rao, O'Donnell and Battese book "An Introduction to Efficiency and Productivity Analysis", and I think in any other book on this topic. So my point is that the empirical measure of cost efficiency given by -frontier- and -xtfrontier- is not theoretically coherent and should be corrected. One expects (as Reut expected) that cost efficiency cannot be greater than one.
> 
> Best,
> Federico
> 
> On Apr 26, 2013, at 3:42 PM, Aljar Meesters wrote:
> 
>> Dear Federico,
>> 
>> Thank you for pointing out that the Kumbhakar and Lovell book
>> (Stochastic Frontier Analysis) uses a different definition from the
>> one used by Stata; I didn't know that. I think this stresses the
>> importance of at least knowing the intuition behind a definition.
>> Best,
>> 
>> Aljar
>> 
>> 
>> 2013/4/24 Federico Belotti <[email protected]>:
>>> Dear Aljar and Reut,
>>> 
>>> As reported in the Kumbhakar and Lovell book "Stochastic Frontier Analysis", cost efficiency is the ratio between the minimum feasible cost and the observed expenditure. Hence, CE is by construction bounded between 0 and 1. Accordingly, a measure of CE in the SF framework is provided by
>>> 
>>> CE = exp{-E(u|e)},
>>> 
>>> where E(u|e) is the (post-)estimate of cost inefficiency obtained through the Jondrow et al. (1982) estimator. In the case of a cross-sectional normal-half normal cost frontier, this estimator corresponds to equation 4.2.12 of the Kumbhakar and Lovell book. Alternatively, another estimator (the one implemented in the post-estimation command of both -frontier- and -xtfrontier-) can be obtained using the Battese and Coelli (1988) approach
>>> 
>>> CE = E(exp{-u}|e),
>>> 
>>> which is still bounded in the unit interval (in the case of a cross-sectional normal-half normal cost frontier, this estimator is reported in equation 4.2.14 of the Kumbhakar and Lovell book).
>>> 
>>> Thanks to Reut's post, I realized that both the -frontier- and -xtfrontier- commands show a "strange" behaviour (as does the FRONTIER 4.1 Fortran routine by Tim Coelli).
>>> Indeed, if you run the following commands
>>> 
>>> webuse frontier2, clear
>>> frontier lncost lnout lnp_l lnp_k, cost d(hn)
>>> predict ce, te
>>> 
>>> you will get point estimates of cost efficiency that range from 1.53 to 1152.92. The same results can be obtained by running a cross-sectional normal-half normal cost frontier using FRONTIER 4.1 on the same data.
>>> 
>>> My guess is that the issue lies in the formula implemented behind the post-estimation command of -frontier- (and -xtfrontier-). Indeed, the Stata manual reports the following equations for the -frontier- case
>>> 
>>> CE = normal(-`scost'*sigma1+z)/normal(z) * exp(-`scost'*mu1+1/2*sigma1^2),
>>> 
>>> where
>>>       z = mu1/sigma1,
>>>       mu1 = - `scost'* eps * sigma_u^2 / sigma^2,
>>>       sigma1 = sigma_u*sigma_v / sigma,
>>>       sigma^2 = sigma_u^2 + sigma_v^2,
>>> 
>>> with  `scost' = 1 for production and `scost' = -1 for cost frontiers.
>>> 
>>> In my view (and given equation 4.2.14 in the Kumbhakar and Lovell book), the correct formula should be the following
>>> 
>>> CE = normal(-sigma1+z)/normal(z) * exp(-mu1+1/2*sigma1^2).
>>> 
>>> In other words, the only sign change needed to adapt the Battese & Coelli (1988) estimator to the case of cost efficiency is limited to mu1 (since a cost frontier is characterized by a composed error term with positive skewness, eps = v + u).
>>> 
>>> For some odd reason, both Tim Coelli and the Stata developers used CE = E(exp{u}|e) instead of CE = E(exp{-u}|e).
>>> So a way to avoid the problem is to take the reciprocal of what the -frontier- (or -xtfrontier-) command gives you, in order to get approximate Battese & Coelli (1988) point estimates of cost efficiency
>>> 
>>> predict ce, te
>>> replace ce = 1/ce
>>> 
>>> An alternative strategy is to use the Jondrow et al. (1982) approximation through
>>> 
>>> predict u, u
>>> gen ce = exp(-u)
>>> 
>>> hope that helps,
>>> Federico
>>> 
>>> 
>>> 
>>> On Apr 23, 2013, at 11:24 PM, Aljar Meesters wrote:
>>> 
>>>> Your understanding about - predict var, te - is correct. Your
>>>> conceptual question needs some elaboration. A score of one indeed
>>>> represents a fully efficient bank, you can call this 100% efficient.
>>>> If you find a score of, say, 1.2, you can say that that particular bank
>>>> incurs 20% higher costs than a fully efficient bank would. I think
>>>> you can say that the bank is 20% inefficient. Although opinions on
>>>> this may differ, it is at least clear what the 20% means. If you
>>>> prefer to have a score between zero and one (one is fully efficient),
>>>> you can calculate a new score by one over the old score, yet, in this
>>>> case there is no clear interpretation, as far as I know. So, if you
>>>> find that bank Y has a score of 0.8 after the rescaling and call this
>>>> bank 80% efficient, I don't know what this 80% exactly means. However,
>>>> you do find cost efficiencies in the literature that are scaled
>>>> between zero and one, so, it is not uncommon.  As a side note, if you
>>>> rescale the efficiency score by one over the old score, you will
>>>> ignore Jensen's inequality (E[f(x)] != f(E[x])). Whether you find this
>>>> problematic or not is up to you.
>>>> Best,
>>>> 
>>>> Aljar
>>>> 
>>>> 2013/4/23 Reut Levi <[email protected]>:
>>>>> Thank you!
>>>>> 
>>>>> To clarify and make sure I understand: the syntax predict VariableName, te would give me inefficiency scores that range from 1 to infinity (for cost functions), right?
>>>>> 
>>>>> In addition, here is a conceptual question. The frontier represents 100% efficiency. According to the inefficiency scores described above, banks that receive a score of one are 100 percent efficient. Therefore, scores above 1 would represent banks that are operating above the cost frontier and are therefore less efficient. Now, how can I interpret those inefficiency scores above one? Is there an accepted way to normalize them to range from 0 to 100%, so that I can make a statement such as "bank Y is X% efficient/inefficient"?
>>>>> 
>>>>> Thank you for your help and inputs,
>>>>> Reut
>>>>> 
>>>>> 
>>>>> 
>>>>> ________________________________________
>>>>> From: [email protected] [[email protected]] on behalf of Federico Belotti [[email protected]]
>>>>> Sent: Tuesday, April 23, 2013 12:55 PM
>>>>> To: [email protected]
>>>>> Subject: Re: st: Inefficiency measures greater than one for frontier commands
>>>>> 
>>>>> If you are using the -xtfrontier- command the syntax is
>>>>> 
>>>>> predict te, te
>>>>> 
>>>>> In this way you obtain an estimate of efficiency scores through the Battese and Coelli (1988) formula.
>>>>> 
>>>>> Federico
>>>>> 
>>>>> On Apr 23, 2013, at 5:35 PM, Reut Levi wrote:
>>>>> 
>>>>>> Thank you Federico!
>>>>>> 
>>>>>> Do you happen to know if there is a way to predict efficiency scores in Stata, instead of inefficiency scores?
>>>>>> If there is, can you please specify the command syntax?
>>>>>> If there isn't, how should I go about converting the inefficiency scores predicted to represent efficiency levels?
>>>>>> 
>>>>>> Thank you very much,
>>>>>> Reut
>>>>>> 
>>>>>> ________________________________________
>>>>>> From: [email protected] [[email protected]] on behalf of Federico Belotti [[email protected]]
>>>>>> Sent: Monday, April 22, 2013 5:54 AM
>>>>>> To: [email protected]
>>>>>> Subject: Re: st: Inefficiency measures greater than one for frontier commands
>>>>>> 
>>>>>> Dear Reut,
>>>>>> 
>>>>>> in the stochastic frontier framework, "inefficiency" scores range from 0 to infinity, while "efficiency" scores are restricted between 0 and 1 by construction since
>>>>>> 
>>>>>> TE = exp{-E[u|e]}  following  Jondrow et al., 1982,
>>>>>> or,
>>>>>> TE = E{exp(-u)|e}  following Battese and Coelli, 1988,
>>>>>> 
>>>>>> where u >= 0 is the inefficiency term and the composed error is e = v + s*u, with s = 1 (s = -1) in the cost frontier (production frontier) case.
>>>>>> 
>>>>>> Hope this helps.
>>>>>> Federico
>>>>>> 
>>>>>> On Apr 21, 2013, at 2:28 AM, Reut Levi wrote:
>>>>>> 
>>>>>>> Dear Statalist members,
>>>>>>> 
>>>>>>> I am using the xtfrontier command to estimate inefficiency levels for the U.S. banking industry. My data consist of information from the FFIEC Call Report for the year 2012. It is a large data set with over 29,000 observations. I broke it down by asset size in order to reduce the number of observations and also because the literature suggests that asset-size peer groups will produce more appropriate inefficiency measures. After breaking down the data set, the average number of banks in each peer-group data set is 650, with observations for 4 quarters, totaling 2700 data points. All of my variables are in natural logs.
>>>>>>> 
>>>>>>> I am using the xtfrontier command with the options ti and cost. I then predict the inefficiency measures using predict with the option u, but some of my inefficiency predictions are greater than one. How is that possible? The manual says that the inefficiency measures are restricted to be between 0 and 1. Am I doing something wrong? Or what could explain those measures greater than 1?
>>>>>>> 
>>>>>>> I am relatively new to Stata, so please take that into consideration in your response.
>>>>>>> Thank you very much,
>>>>>>> Reut
>>>>>>> 
>>>>>>> 
>>>>>>> *
>>>>>>> *   For searches and help try:
>>>>>>> *   http://www.stata.com/help.cgi?search
>>>>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>>>>> *   http://www.ats.ucla.edu/stat/stata/
>>>>>> 
>>>>>> --
>>>>>> Federico Belotti, PhD
>>>>>> Research Fellow
>>>>>> Centre for Economics and International Studies
>>>>>> University of Rome Tor Vergata
>>>>>> tel/fax: +39 06 7259 5627
>>>>>> e-mail: [email protected]
>>>>>> web: http://www.econometrics.it

-- 
Federico Belotti, PhD
Research Fellow
Centre for Economics and International Studies
University of Rome Tor Vergata
tel/fax: +39 06 7259 5627
e-mail: [email protected]
web: http://www.econometrics.it

