# Re: st: Pareto v. lognormal

From: "Stas Kolenikov"
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Pareto v. lognormal
Date: Tue, 6 Mar 2007 13:22:26 -0600

On 3/6/07, Austin Nichols <austinnichols@gmail.com> wrote:
> The Pareto distribution is typically defined by the cdf F(x;a) = 1 -
> x^(-a) where a>0 for x>=1 and zero elsewhere, and the pdf f(x;a) =
> ax^(-a-1) for x>=1 and zero elsewhere.  A version with two parameters
> is given by F(x;a,k) = 1-(x/k)^(-a) and f(x; a,k) = (a/k)(x/k)^(-a-1)
> = a(k)^(a)(x)^(-a-1) for x>=k>0.
>
> On a log-log plot, the density function for the Pareto distribution is
> a straight line:
> ln f(x) = (−a − 1) ln x + a ln k + ln a.
>
> This suggests a means for estimating parameters a and k by
> constructing kernel density estimates of f(x), and regressing
> ln(\hat{f(x)}) on ln(x).  Standard errors could presumably be obtained
> via bootstrap.

Well, I guess it would be easier to use ln(1-F(x)) = -a ln x + a ln k,
which can be estimated directly by an ordinary linear regression of the
log of the empirical survival function on ln x, possibly with
heteroskedasticity-robust standard errors if one wished :)). Note
that the usual regularity conditions are not satisfied for k (the
support of the distribution depends on it), so its estimate is likely
to be quirky. To test for Pareto-ness of the distribution, add (ln x)^2
to that regression: its coefficient should be zero under the Pareto.
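
A minimal sketch of that regression, in Python with only the standard
library so that it is self-contained. Everything in it is an assumption
made for illustration: the simulated data, the helper names
`log_survival` and `ols`, and the plotting position (i + 0.5)/n used for
the empirical cdf so that ln(1-F) stays finite at the largest
observation. The fitted slope estimates -a and the intercept estimates
a ln k. The squared term is read here only as a rough diagnostic, since
the points of an empirical-cdf regression are not independent.

```python
import math
import random


def log_survival(xs):
    """Return (ln x, ln(1 - F_hat(x))) at the sorted observations."""
    xs = sorted(xs)
    n = len(xs)
    lnx = [math.log(x) for x in xs]
    # Assumed plotting position (i + 0.5)/n keeps 1 - F_hat strictly positive.
    lns = [math.log(1.0 - (i + 0.5) / n) for i in range(n)]
    return lnx, lns


def ols(columns, y):
    """Least-squares coefficients of y on the columns, intercept first."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    M = [[sum(r[j] * r[k] for r in X) for k in range(p)]
         + [sum(r[j] * yi for r, yi in zip(X, y))] for j in range(p)]
    # Gaussian elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, p):
            f = M[r][c] / M[c][c]
            for k in range(c, p + 1):
                M[r][k] -= f * M[c][k]
    # Back substitution.
    b = [0.0] * p
    for j in reversed(range(p)):
        b[j] = (M[j][p] - sum(M[j][k] * b[k] for k in range(j + 1, p))) / M[j][j]
    return b


random.seed(12345)
a_true, k_true = 2.5, 3.0  # illustrative values, not from the thread

# Pareto sample: random.paretovariate has support x >= 1, so rescale by k.
pareto = [k_true * random.paretovariate(a_true) for _ in range(5000)]
lnx, lns = log_survival(pareto)

# ln(1 - F) = a ln k - a ln x
b0, b1 = ols([lnx], lns)
a_hat = -b1
k_hat = math.exp(b0 / a_hat)

# Pareto-ness check: coefficient on (ln x)^2 should be near zero.
c_pareto = ols([lnx, [v * v for v in lnx]], lns)[2]

# Same regression on a lognormal sample, where the log survival curve bends.
lognormal = [random.lognormvariate(0.0, 1.0) for _ in range(5000)]
lx, ls = log_survival(lognormal)
c_lognormal = ols([lx, [v * v for v in lx]], ls)[2]

print("a_hat =", round(a_hat, 3), " k_hat =", round(k_hat, 3))
print("quadratic coefficient, Pareto sample:   ", round(c_pareto, 3))
print("quadratic coefficient, lognormal sample:", round(c_lognormal, 3))
```

On the simulated Pareto sample the quadratic coefficient should come out
close to zero, while on the lognormal sample it should be clearly
negative, because the log survival function of a lognormal is concave
in ln x.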

To test for log-normality, you can construct a Q-Q plot of ln x against
normal quantiles and see whether the points fall on a straight line.
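
The Q-Q check in logs can be sketched numerically as well. The snippet
below is an illustration under assumptions, not a formal test: the data
are simulated, the helper names `qq_points_in_logs` and `correlation`
are made up here, and the correlation between the sorted logs and the
standard normal quantiles is used only as a one-number summary of how
straight the plot is. In Stata the plot itself could be drawn with
`qnorm` on the logged variable.

```python
import math
import random
from statistics import NormalDist


def qq_points_in_logs(xs):
    """Pair sorted ln(x) with standard normal quantiles at (i + 0.5)/n."""
    logs = sorted(math.log(x) for x in xs)
    n = len(logs)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return theoretical, logs


def correlation(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


random.seed(2007)

# Lognormal data: ln(x) is normal, so the Q-Q points should be nearly linear.
lognormal = [random.lognormvariate(1.0, 0.8) for _ in range(2000)]
r_lognormal = correlation(*qq_points_in_logs(lognormal))

# Pareto data: ln(x) is a shifted exponential, so the Q-Q plot curves.
pareto = [3.0 * random.paretovariate(2.5) for _ in range(2000)]
r_pareto = correlation(*qq_points_in_logs(pareto))

print("Q-Q correlation in logs, lognormal sample:", round(r_lognormal, 4))
print("Q-Q correlation in logs, Pareto sample:   ", round(r_pareto, 4))
```

A correlation very close to 1 is consistent with log-normality. A
visibly lower value, as expected for the Pareto sample, signals
curvature in the plot.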

--
Stas Kolenikov
http://stas.kolenikov.name