.-
help for ^dagumfit^                                          (STB-48: sg106)
.-

Fitting a Dagum distribution by ML to unit record data
----------------------------------------------------------------

    ^dagumfit^ incvar [weight] [^if^ exp] [^in^ range] [^,^ ^s^tats ^cdf(^cdfname^)^
             ^pdf(^pdfname^)^ ^le^vel^(^#^)^ ^nolog^ ^tr^ace ^b0(^#^)^ ^d0(^#^)^ ^h0(^#^)^ ]

^fweight^s and ^aweight^s are allowed; see help @weights@.

To reset problem-size limits, see help @matsize@.


Description
-----------

^dagumfit^ fits the 3-parameter Dagum (1977, 1980) distribution by maximum
likelihood (ML) to unit record data on a random variable incvar, that is,
to observations on incvar for a sample of income units. Otherwise known as
the Burr Type 3 distribution, the Dagum distribution has been shown to
provide a good fit to empirical income data relative to other parametric
functional forms. It is closely related to the Singh-Maddala (Burr Type 12)
distribution (Singh and Maddala, 1976). For derivation of Lorenz orderings
of pairs of income distributions in terms of their Dagum parameters, see
Kleiber (1996). The Dagum distribution may also be suitable for describing
any skewed variable, not only income.


Options
-------

^stats^ displays selected distributional statistics implied by the Dagum
    model parameter estimates: percentiles, cumulative shares of total
    income at percentiles (i.e. the Lorenz curve ordinates), the mean,
    standard deviation, variance, half the squared coefficient of
    variation, Gini coefficient, and percentile ratios p90/p10, p75/p25.

^cdf(^cdfname^)^ creates a new variable cdfname containing the estimated
    Dagum c.d.f. value F(x) for each x.

^pdf(^pdfname^)^ creates a new variable pdfname containing the estimated
    Dagum p.d.f. value f(x) for each x.

^level(^#^)^ specifies the confidence level, in percent, for confidence
    intervals of the parameters; see help @level@.

^nolog^ suppresses the iteration log.

^trace^ reports the current value of the estimated parameters at each
    iteration. See [R] maximize.

^b0(^#^)^, ^d0(^#^)^, ^h0(^#^)^ allow the user to specify starting values for the
    Dagum parameters. Default starting values are b = exp(4), d = exp(0.1),
    and h = 1+exp(13).


Saved results
-------------

The global macros set by -ml post-, plus

    S_b, S_d, S_h: estimated parameters b, d, h, respectively.

Access to estimated coefficients (transformations of the parameters) and
their s.e.s is available in the usual way: see [U] 20.5 Accessing
coefficients, and [R] matrix get.


Formulas
--------

The Dagum distribution has distribution function (c.d.f.)

    F(x) = [1 + h*x^^(-d)]^^(-b)

where b > 0, h > 0, d > 1/b are parameters, for random variable x > 0
('income'). Parameters b and d are the key distributional 'shape'
parameters; h is a scale parameter.

The probability density function (p.d.f.) is

    f(x) = [(b*d*h)*x^^(-d-1)]/[1 + h*x^^(-d)]^^(b+1).

The likelihood function for a sample of incomes is specified as the
product of the densities for each person (weighted where relevant), and is
maximized using Stata's ^deriv0^ (numerical derivatives) method.
Transformations of the 3 parameters are estimated (to impose the necessary
restrictions) and the parameters are derived from these.

The formulae used to derive the distributional summary statistics
presented (optionally) are as follows. The r-th moment about the origin is
given by

    [b*h^^(r/d)]*B(1-r/d,b+r/d)

where B(u,v) is the Beta function = G(u)*G(v)/G(u+v) and G(.) is the gamma
function [exp(lngamma(.))]. By substitution, and since u + v = b + 1 here,
the moments can be written

    [b*h^^(r/d)]*G(1-r/d)*G(b+r/d)/G(b+1)

and hence

    mean     = [b*h^^(1/d)]*G(1-1/d)*G(b+1/d)/G(b+1)

    variance = [b*h^^(2/d)]*G(1-2/d)*G(b+2/d)/G(b+1) - (mean^^2)

from which the standard deviation and half the squared coefficient of
variation can be derived.

The percentiles are derived by inverting the distribution function:

    x_p = [h^^(1/d)]*[p^^(-1/b) - 1]^^(-1/d)   for each p = F(x_p).

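As a rough consistency check, the c.d.f., p.d.f., mean and percentile
formulas above can be evaluated directly with Stata scalars. This is only
a sketch under stated assumptions: the parameter values (b = 1, d = 2,
h = 1), the probability (p = 0.5) and all scalar names ending in _ex are
hypothetical, chosen for illustration, and are not output from ^dagumfit^.
Powers are written as exp(k*ln(.)), in the same spirit as exp(lngamma(.)).

```stata
* Hypothetical parameter values for illustration only (not dagumfit output)
scalar b_ex = 1
scalar d_ex = 2
scalar h_ex = 1
scalar p_ex = 0.5

* Percentile x_p from the inverted c.d.f.; powers written as exp(k*ln(.))
scalar t_ex = exp(-ln(p_ex)/b_ex) - 1
scalar x_ex = exp((ln(h_ex) - ln(t_ex))/d_ex)

* c.d.f. and p.d.f. evaluated at x_p; F(x_p) should reproduce p
scalar u_ex = 1 + h_ex*exp(-d_ex*ln(x_ex))
scalar F_ex = exp(-b_ex*ln(u_ex))
scalar f_ex = b_ex*d_ex*h_ex*exp(-(d_ex+1)*ln(x_ex))/exp((b_ex+1)*ln(u_ex))

* Mean from the gamma-function form of the first moment (r = 1)
scalar lnG_ex = lngamma(1-1/d_ex) + lngamma(b_ex+1/d_ex) - lngamma(b_ex+1)
scalar m_ex = b_ex*exp(ln(h_ex)/d_ex + lnG_ex)

display "x_p = " x_ex "   F(x_p) = " F_ex "   f(x_p) = " f_ex "   mean = " m_ex
```

With these illustrative values the median is x_p = 1, F(x_p) returns 0.5,
f(x_p) is 0.5, and the mean equals pi/2, approximately 1.5708.
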
The Gini coefficient of inequality is given by

    Gini = [G(b)*G(2b+1/d)] / [G(2b)*G(b+1/d)] - 1.

The Lorenz curve ordinates at each p = F(x_p) use the incomplete beta
function (the Beta c.d.f.):

    L(p) = ibeta(b+1/d, 1-1/d, p^^(1/b)).


Examples
--------

    . ^dagumfit x [w=wgt]^

    . ^dagumfit^

    . ^dagumfit x if famtype==1, s^


Author
------

    Stephen P. Jenkins
    Institute for Social and Economic Research
    University of Essex, Colchester CO4 3SQ, U.K.
    stephenj@@essex.ac.uk

Advice from StataCorp Technical Support is gratefully acknowledged.


References
----------

Dagum, C. (1977) 'A new model of personal income distribution:
    specification and estimation', Economie Appliquee, 30, 413-437.

Dagum, C. (1980) 'The generation and distribution of income, the Lorenz
    curve and the Gini ratio', Economie Appliquee, 33, 327-367.

Kleiber, C. (1996) 'Dagum vs. Singh-Maddala income distributions',
    Economics Letters, 53, 265-268.

McDonald, J.B. (1984) 'Some generalized functions for the size
    distribution of income', Econometrica, 52, 647-663.

Singh, S.K. and G.S. Maddala (1976) 'A function for the size distribution
    of income', Econometrica, 44, 963-970.


Also see
--------

    STB:  STB-48 sg106
On-line:  help for @pdagum@, @qdagum@, @smfit@ (if installed)