Statalist The Stata Listserver

Re: st: Pareto v. lognormal

From   Stas Kolenikov
Subject   Re: st: Pareto v. lognormal
Date   Tue, 6 Mar 2007 13:22:26 -0600

On 3/6/07, Austin Nichols wrote:
> The Pareto distribution is typically defined by the cdf F(x;a) = 1 -
> x^(-a) where a>0 for x>=1 and zero elsewhere, and the pdf f(x;a) =
> ax^(-a-1) for x>=1 and zero elsewhere.  A version with two parameters
> is given by F(x;a,k) = 1-(x/k)^(-a) and f(x;a,k) = (a/k)(x/k)^(-a-1)
> = a(k)^(a)(x)^(-a-1) for x>=k, where k>0.
> On a log-log plot, the density function for the Pareto distribution is
> a straight line:
> ln f(x) = (−a − 1) ln x + a ln k + ln a.
> This suggests a means for estimating parameters a and k by
> constructing kernel density estimates of f(x), and regressing
> ln(\hat{f(x)}) on ln(x).  Standard errors could presumably be obtained
> via bootstrap.

Well, I guess it would be easier to take ln(1-F(x)) = -a ln x + a ln k,
which can be estimated directly by ordinary linear regression of the log
empirical survival function on ln x, possibly with
heteroskedasticity-robust standard errors if one wished :)). Note that
the regularity conditions are not satisfied for k, since the support
x>=k depends on it, so its estimate is likely to be quirky. To test for
Pareto-ness of the distribution, add (ln x)^2 to that regression and
check whether its coefficient differs from zero.
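
As a rough illustration of the regression above, here is a minimal
sketch in plain Python (standard library only), not Stata code and not
anybody's official implementation. The function names, the simulated
data (a = 1.5, k = 2, n = 5000) and the plotting position
(n - i - 0.5)/n used for the empirical 1-F are all assumptions made for
the example. It reports point estimates only: the order statistics are
correlated, so the usual OLS standard errors should not be trusted here.

```python
import math
import random


def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = s / M[i][i]
    return z


def ols(columns, y):
    """Least-squares coefficients of y on the given regressor columns."""
    p = len(columns)
    xtx = [[sum(u * v for u, v in zip(columns[i], columns[j]))
            for j in range(p)] for i in range(p)]
    xty = [sum(u * v for u, v in zip(columns[i], y)) for i in range(p)]
    return solve_linear(xtx, xty)


def log_survival_points(xs):
    """Return (ln x, ln(1 - F_hat(x))) evaluated at the sorted data."""
    xs = sorted(xs)
    n = len(xs)
    lx = [math.log(x) for x in xs]
    # Assumed plotting position: keeps 1 - F_hat strictly positive.
    ls = [math.log((n - i - 0.5) / n) for i in range(n)]
    return lx, ls


def fit_pareto(xs):
    """Regress ln(1-F) on ln x: slope = -a, intercept = a ln k."""
    lx, ls = log_survival_points(xs)
    intercept, slope = ols([[1.0] * len(lx), lx], ls)
    a_hat = -slope
    k_hat = math.exp(intercept / a_hat)
    return a_hat, k_hat


def quadratic_coef(xs):
    """Coefficient on (centered ln x)^2 when it is added to the regression."""
    lx, ls = log_survival_points(xs)
    mean_lx = sum(lx) / len(lx)
    u = [v - mean_lx for v in lx]  # centering only improves conditioning
    u2 = [v * v for v in u]
    return ols([[1.0] * len(lx), u, u2], ls)[2]


random.seed(2007)
pareto_sample = [2.0 * random.paretovariate(1.5) for _ in range(5000)]
lognormal_sample = [random.lognormvariate(0.0, 1.0) for _ in range(5000)]

a_hat, k_hat = fit_pareto(pareto_sample)
q_pareto = quadratic_coef(pareto_sample)
q_lognormal = quadratic_coef(lognormal_sample)
print("a_hat:", a_hat, "k_hat:", k_hat)
print("quadratic term, Pareto data:", q_pareto)
print("quadratic term, lognormal data:", q_lognormal)
```

For Pareto data the quadratic coefficient should sit near zero, while
for lognormal data the log survival curve bends downward, so the
coefficient should come out clearly negative.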

To test for log-normality, you can construct a normal Q-Q plot of ln x
and see whether the points fall close to a straight line.
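
The Q-Q check in logs can be sketched numerically as well, again in
plain Python rather than Stata. The helper names, the plotting position
(i + 0.5)/n and the simulated samples are assumptions for illustration.
The correlation between the sorted ln x and the normal quantiles is used
here only as a crude one-number summary of how straight the Q-Q plot is;
it is not a formal test.

```python
import math
import random
from statistics import NormalDist


def log_qq_points(xs):
    """Pair standard normal quantiles with the sorted values of ln x."""
    logs = sorted(math.log(x) for x in xs)
    n = len(logs)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting position (i + 0.5) / n for the i-th order statistic.
    theo = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return theo, logs


def correlation(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


random.seed(2007)
lognormal_sample = [random.lognormvariate(0.0, 1.0) for _ in range(2000)]
pareto_sample = [2.0 * random.paretovariate(1.5) for _ in range(2000)]

r_lognormal = correlation(*log_qq_points(lognormal_sample))
r_pareto = correlation(*log_qq_points(pareto_sample))
print("Q-Q correlation, lognormal data:", r_lognormal)
print("Q-Q correlation, Pareto data:", r_pareto)
```

A correlation very close to 1 means the Q-Q plot in logs is nearly a
straight line, as expected for lognormal data. For Pareto data ln x is a
shifted exponential, so the plot curves and the correlation drops.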

Stas Kolenikov