.-
help for ^smfit^                                                 (STB-48: sg106)
.-

Fitting a Singh-Maddala distribution by ML to unit record data
----------------------------------------------------------------

    ^smfit^ incvar [weight] [^if^ exp] [^in^ range] [^,^ ^s^tats ^cdf(^cdfname^)^
          ^pdf(^pdfname^)^ ^le^vel^(^#^)^ ^nolog^ ^tr^ace ^a0(^#^)^ ^b0(^#^)^ ^q0(^#^)^ ]

^fweight^s and ^aweight^s are allowed; see help @weights@.

To reset problem-size limits, see help @matsize@.


Description
-----------

^smfit^ fits the three-parameter Singh-Maddala (1976) distribution by
maximum likelihood (ML) to sample observations on a variable incvar,
for example the incomes of a sample of income units.

Also known as the Burr Type 12 distribution, the Singh-Maddala
distribution has been shown to fit empirical income data well relative
to other parametric functional forms: see e.g. McDonald (1984). It is
closely related to the Dagum (Burr Type 3) distribution (Dagum, 1977,
1980). For the derivation of Lorenz orderings of pairs of income
distributions in terms of their Singh-Maddala parameters, see Wilfling
and Kraemer (1993) and Kleiber (1996).

The Singh-Maddala distribution may also be suitable for describing any
skewed non-negative variable, not only income.


Options
-------

^stats^ displays selected distributional statistics implied by the
    Singh-Maddala parameter estimates: percentiles, cumulative shares of
    total income at percentiles (i.e. the Lorenz curve ordinates), the
    mean, standard deviation, variance, half the squared coefficient of
    variation, Gini coefficient, and percentile ratios p90/p10, p75/p25.

^cdf(^cdfname^)^ creates a new variable cdfname containing the estimated
    Singh-Maddala c.d.f. value F(x) for each x.

^pdf(^pdfname^)^ creates a new variable pdfname containing the estimated
    Singh-Maddala p.d.f. value f(x) for each x.

^level(^#^)^ specifies the confidence level, in percent, for confidence
    intervals of the parameters; see help @level@.

^nolog^ suppresses the iteration logs.

^trace^ reports the current value of the estimated parameters at each
    iteration. See [R] maximize.

^a0(^#^)^, ^b0(^#^)^, ^q0(^#^)^ allow the user to specify starting values for the
    Singh-Maddala parameters. Default starting values are a=2, q=2, and
    b = arithmetic mean of incvar.


Saved results
-------------

The global macros set by -ml post-, plus

    S_a, S_b, S_q: estimated parameters a, b, q, respectively.

The estimated coefficients (transformations of the parameters) and
their standard errors can be accessed in the usual way: see [U] 20.5
Accessing coefficients, and [R] matrix get.


Formulas
--------

The Singh-Maddala distribution has distribution function (c.d.f.)

    F(x) = 1 - { 1/[ 1 + (x/b)^^a ]^^q }

where a > 0, b > 0, q > 1/a are parameters, for random variable x >= 0
('income'). Parameters a and q are the key distributional 'shape'
parameters; b is a scale parameter.

Letting z = 1 + (x/b)^^a, then F(x) = 1 - [ 1/(z^^q) ], and the
probability density function (p.d.f.) is

    f(x) = (aq/b).{z^^-(q+1)}.[(x/b)^^(a-1)].

The likelihood function for a sample of incomes is specified as the
product of the densities for each person (weighted where relevant), and
is maximized using Stata's ^deriv0^ (numerical derivatives) method.
Transformations of the 3 parameters are estimated (to impose the
necessary restrictions), and the parameters are derived from these.

The formulae used to derive the distributional summary statistics
presented (optionally) are as follows. The r-th moment about the
origin is given by

    b^^r*B(1+r/a,q-r/a)/B(1,q)

where B(u,v) is the Beta function = G(u).G(v)/G(u+v) and G(.) is the
gamma function [exp(lngamma(.))]. By substitution, and using G(1) = 1
and G(1+q) = q.G(q), the moments can be written

    b^^r*G(1+r/a)*G(q-r/a)/G(q)

and hence

    mean     = b*G(1+1/a)*G(q-1/a)/G(q)
    variance = b*b*G(1+2/a)*G(q-2/a)/G(q) - (mean^^2)

from which the standard deviation and half the squared coefficient of
variation can be derived.

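As an informal check on the c.d.f., p.d.f. and moment formulas above,
the following sketch evaluates them outside Stata, in Python. It is not
part of ^smfit^: the function names and the parameter values (a=2,
b=1000, q=2) are hypothetical and chosen only for illustration.

```python
import math

def sm_cdf(x, a, b, q):
    """Singh-Maddala c.d.f.: F(x) = 1 - z^(-q), with z = 1 + (x/b)^a."""
    z = 1.0 + (x / b) ** a
    return 1.0 - z ** (-q)

def sm_pdf(x, a, b, q):
    """Singh-Maddala p.d.f.: f(x) = (aq/b) * z^-(q+1) * (x/b)^(a-1)."""
    z = 1.0 + (x / b) ** a
    return (a * q / b) * z ** (-(q + 1.0)) * (x / b) ** (a - 1.0)

def sm_moment(r, a, b, q):
    """r-th moment about the origin: b^r * G(1+r/a) * G(q-r/a) / G(q)."""
    if q <= r / a:
        raise ValueError("the r-th moment exists only if q > r/a")
    # Work with log-gamma values, mirroring exp(lngamma(.)) in the text.
    log_ratio = (math.lgamma(1.0 + r / a)
                 + math.lgamma(q - r / a)
                 - math.lgamma(q))
    return b ** r * math.exp(log_ratio)

# Hypothetical parameter values, for illustration only.
a, b, q = 2.0, 1000.0, 2.0

mean = sm_moment(1, a, b, q)
variance = sm_moment(2, a, b, q) - mean ** 2
sd = math.sqrt(variance)
half_cv2 = 0.5 * variance / mean ** 2   # half the squared coeff. of variation

print("F(b)     =", sm_cdf(b, a, b, q))
print("mean     =", mean)
print("variance =", variance)
print("s.d.     =", sd)
print("CV^2/2   =", half_cv2)
```

With these values the mean reduces to b*G(1.5)*G(1.5)/G(2) = b*pi/4,
which gives a quick hand check of the gamma-function expression.
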
The percentiles are derived by inverting the distribution function:

    x_p = b*((1-p)^^(-1/q) - 1)^^(1/a)    for each p = F(x_p).

The Gini coefficient of inequality is given by

    Gini = 1 - G(q)*G(2q-1/a) / { G(q-1/a)*G(2q) }.

The Lorenz curve ordinates at each p = F(x_p) use the Beta c.d.f.
(incomplete Beta function):

    L(p) = ibeta(1+1/a, q-1/a, 1-(1-p)^^(1/q) ).


Examples
--------

    . ^smfit x [w=wgt]^
    . ^smfit^
    . ^smfit x if x>0, s^


Author
------

    Stephen P. Jenkins
    Institute for Social and Economic Research
    University of Essex, Colchester CO4 3SQ, U.K.
    stephenj@@essex.ac.uk


Advice from StataCorp Technical Support is gratefully acknowledged.


References
----------

Dagum, C. (1977) 'A new model of personal income distribution:
    specification and estimation', Economie Appliquee, 30, 413-437.

Dagum, C. (1980) 'The generation and distribution of income, the Lorenz
    curve and the Gini ratio', Economie Appliquee, 33, 327-367.

Kleiber, C. (1996) 'Dagum vs. Singh-Maddala income distributions',
    Economics Letters, 53, 265-268.

McDonald, J.B. (1984) 'Some generalized functions for the size
    distribution of income', Econometrica, 52, 647-663.

Singh, S.K. and G.S. Maddala (1976) 'A function for the size
    distribution of income', Econometrica, 44, 963-970.

Wilfling, B. and W. Kraemer (1993) 'The Lorenz-ordering of
    Singh-Maddala income distributions', Economics Letters, 43, 53-57.


Also see
--------

    STB:  STB-48 sg106
On-line:  help for @psm@, @qsm@, @dagumfit@ (if installed)