Stas, Patrick, et al.--
The rationale for using ln(f(x)) instead of ln(1-F) is that I can
write down ln(f(x)) in closed form for both the Pareto and lognormal
families, whereas the lognormal F has no closed form. For my own
purposes, I am interested in tests of Pareto v. lognormal for income
distributions, for which I think my proposed method works. Patrick
Wöhrle Guimarães wanted an estimate of the parameters of a Pareto
distribution, for which Stas' method might be preferred. Both are
shown here, my test first and then Stas' regression:
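* kdens is on SSC (it also needs the moremata package)
cap ssc install moremata
cap ssc install kdens
use http://www2.bc.edu/~gottscha/mobility.dta, clear
g lnc2=ln(c2)
* kernel density of ln(c2), saved as fx on the grid lnx
kdens lnc2 [pw=wt], g(fx lnx) norm n(`=_N')
g lnfx=ln(fx)
g ln2x=lnx^2
* my test: regress log density on ln(x) and its square
reg lnfx lnx ln2x, r
di "significant coef on ln2x rejects Pareto"
* Stas' method: build F by cumulating the density over the grid
sort lnx
g F=sum(fx)
keep if F<.
replace F=F/F[_N]
* ln(1-F) is missing where F=1, so that point drops out
g ln1_F=ln(1-F)
reg ln1_F lnx
di "est Pareto param a is " -_b[lnx]
di "est Pareto param k is " exp(-_b[_cons]/_b[lnx])
* delta-method standard error for k
nlcom exp(-_b[_cons]/_b[lnx])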
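For what it is worth, here is a sketch of the algebra behind both
regressions. The notation is mine and only for illustration: y = ln(x),
g(y) is the density of y (which is what kdens estimates here, since it
is run on lnc2), mu and sigma are the lognormal parameters, and b_0 and
b_1 are the intercept and slope from the regression of ln(1-F) on lnx.

```latex
\begin{align*}
\text{Pareto:}\quad f(x;a,k) &= a k^{a} x^{-a-1}, \quad x \ge k \\
g(y) &= f(e^{y};a,k)\, e^{y} = a k^{a} e^{-a y}, \quad y \ge \ln k \\
\ln g(y) &= \ln a + a \ln k - a y \\
\text{Lognormal:}\quad \ln g(y) &= -\ln\!\left(\sigma\sqrt{2\pi}\right) - \frac{(y-\mu)^{2}}{2\sigma^{2}} \\
&= \text{const} + \frac{\mu}{\sigma^{2}}\, y - \frac{1}{2\sigma^{2}}\, y^{2} \\
\text{Pareto survival:}\quad \ln\bigl(1-F(x)\bigr) &= a \ln k - a \ln x = b_0 + b_1 \ln x \\
a &= -b_1, \qquad k = \exp(-b_0/b_1)
\end{align*}
```

So the coefficient on ln2x should be zero under Pareto and negative
(-1/(2 sigma^2)) under lognormal, and the di and nlcom lines recover
a and k from the second regression exactly as in the last line.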
On 3/6/07, Stas Kolenikov <skolenik@gmail.com> wrote:
> On 3/6/07, Austin Nichols <austinnichols@gmail.com> wrote:
> > The Pareto distribution is typically defined by the cdf F(x;a) = 1 -
> > x^(-a) where a>0 for x>=1 and zero elsewhere, and the pdf f(x;a) =
> > ax^(-a-1) for x>=1 and zero elsewhere. A version with two parameters
> > is given by F(x;a,k) = 1-(x/k)^(-a) and f(x; a,k) = (a/k)(x/k)^(-a-1)
> > = a(k)^(a)(x)^(-a-1) for x>=k.
>
> Well, I guess it would be easier to take ln(1-F) = -a ln x + a ln k,
> which is directly estimable by standard linear regression...
> possibly with heteroskedasticity-robust standard errors if one
> wished :)). Note that the regularity conditions are not satisfied
> for k, so its estimate is likely to be quirky. Add ln^2 x to that
> regression if you wish to test for Pareto-ness of the distribution.