


st: discrete distribution for unobserved heterogeneity

From   Melaku Fekadu <>
Subject   st: discrete distribution for unobserved heterogeneity
Date   Wed, 25 Jul 2012 18:19:32 +0300

Dear Statalisters,

I have a question that is not exactly Stata related. It is a modeling
question, and I would appreciate any help.

I want to model unobserved heterogeneity in a structural framework.
The model has two equations, and the unobserved heterogeneity enters
through the intercept term of each equation.

Eq1i = vi + a*Xi + e1
Eq2i = wi + b*Xi + e2

i indexes individuals, X are individual characteristics, a and b are
coefficients, and e1 and e2 are iid shocks.
I have 800 individuals in my data, so I do not want to estimate an
intercept for each individual. Instead I want to assume that there are
m types of individuals. Each type has a pair (v,w): type 1 has
(v1,w1), type 2 has (v2,w2), ... and type m has (vm,wm). A priori I
have no assumption about the size of m.

As a result of the estimation process, I want any person in my data to
belong to one of the types, so that I will be able to calculate
correlations between unobserved and observed individual
characteristics. This will also enable me to know the share of each
type in my data.
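
One way to make this goal concrete is sketched below: compute each person's posterior probability of belonging to each type by Bayes' rule and assign the person to the most likely type. It rests on assumptions that are not part of the model description above: a scalar X, one observation per individual, independent normal shocks with known standard deviations, and made-up parameter values. The names posterior_types, normal_pdf, y1 and y2 are hypothetical, with y1 and y2 standing for the left-hand sides Eq1i and Eq2i.

```python
import math

def normal_pdf(x, sd):
    """Density of a mean-zero normal with standard deviation sd (assumed shock law)."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def posterior_types(y1, y2, x, a, b, v, w, p, sd1=1.0, sd2=1.0):
    """Posterior probability of each type for one individual.

    v, w: type-specific intercepts (v1..vm), (w1..wm)
    p:    prior type probabilities p1..pm
    """
    joint = []
    for vm, wm, pm in zip(v, w, p):
        e1 = y1 - vm - a * x          # implied shock in equation 1 for this type
        e2 = y2 - wm - b * x          # implied shock in equation 2 for this type
        joint.append(pm * normal_pdf(e1, sd1) * normal_pdf(e2, sd2))
    total = sum(joint)                # individual's mixture likelihood contribution
    return [j / total for j in joint]

# Illustrative (made-up) values with m = 2 types
post = posterior_types(y1=2.1, y2=-0.9, x=1.0, a=1.0, b=0.5,
                       v=[1.0, -1.0], w=[-1.5, 1.0], p=[0.5, 0.5])
assigned = max(range(len(post)), key=post.__getitem__)  # index of most likely type
```

Averaging these posterior probabilities over all individuals would give one possible estimate of the share of each type in the sample.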

I have come across the Heckman and Singer (1984) method (which uses
mass points and probabilities), but I could not understand it well
enough to implement it.
I found several papers (see reference 2 below) that use this
approach to estimate the probability that an individual belongs to
type m using a logistic transform. For four types of individuals:

pm = exp(qm)/sum(exp(qr)), where r runs from 1 to 4 and q4 is
normalized to 0.
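
A minimal sketch of this logistic transform, assuming only the formula above: the free parameters q1, q2, q3 are mapped to four probabilities that sum to one, with q4 fixed at 0. The function name type_probabilities and the numeric values are hypothetical and chosen only for illustration.

```python
import math

def type_probabilities(q_free):
    """Map free parameters q_1..q_{m-1} to p_1..p_m, with q_m normalized to 0."""
    q = list(q_free) + [0.0]                    # append the normalized q_m = 0
    q_max = max(q)                              # subtract the max for numerical stability
    weights = [math.exp(x - q_max) for x in q]
    total = sum(weights)
    return [wt / total for wt in weights]

# Made-up values for q1, q2, q3; the result holds p1..p4
probs = type_probabilities([0.5, -1.0, 0.2])
```

Because the q's are unrestricted real numbers, an optimizer can search over them freely while the implied probabilities always stay between 0 and 1 and sum to 1.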

But I do not understand how this method assigns individuals to
types. I would appreciate any help in making this point clear.
Once I understand how it works, I also want to implement it. To do
so, I need the q's, v's and w's to be estimated along with the other
parameters of the model. During estimation (across iterations), changes
in these parameters will change the shares of the types in my sample. I
would appreciate any guidance on the procedure (in terms of algorithm)
to follow for the unobserved heterogeneity; it will
help me write the estimation code. For example, should I, as a
first step, randomly divide the sample into four equally sized types
and then shift people from one type to another based on the values of the

I appreciate any help.



1. Heckman, J.J. and B. Singer, 1984. "A Method for Minimizing the
Impact of Distributional Assumptions in Econometric Models for
Duration Data," Econometrica, 52 (2), 271–320.

2. Mussida, C. and M. Picchio, 2011. "The Trend over Time of the
Gender Wage Gap in Italy," Discussion Paper 2011-093, Tilburg
University, Center for Economic Research.

