The Stata listserver

st: RE: Murphy-Topel

From   "James Hardin"
Subject   st: RE: Murphy-Topel
Date   Thu, 19 Sep 2002 08:36:48 -0500

NOTE:  In order for you to follow the questions posed and the 
       answers given, you must have access to the journal article 
       in question.

Giacomo writes:
> I have two questions regarding James Hardin's article on the
> Murphy-Topel estimator in the latest issue of the Stata
> Journal:
> 1) with respect to the Sandwich estimator, I am unclear on
> how the matrix Cs2 is computed. In particular, I would
> appreciate if you could clarify how the relevant columns are
> identified based on the variables included in the equations
> (see page 260);

I interpret this question as computational rather than theoretical,
so I assume that the formula itself is not the problem.  If I 
am mistaken, just post a followup.

Generally speaking, I prefer to use summation notation to 
show the calculation of each element of a covariance matrix.
However, when using Stata you are much better off looking
at matrix notation.  The reason is the power of the 
-matrix accum- and -matrix vecaccum- commands.  If we focused
on summation notation, we would end up trying to code
long loops calculating each element.  This is terribly

The equation in question describes one of 4 pieces of a 
partitioned matrix.  All that is needed to get that submatrix
is the code fragment (6 lines) at the bottom of page 260.

Below, I note in parentheses which of the 6 lines of code
carries out each step.

The equation is actually the sum of two terms.  The 
first term looks like:

    W' Diag(...) X

where in the particular example, we have

    X = [age income ownrent selfemp _cons]
    W = [age income expend  zhat    _cons]

To calculate this term, we generate

    cons = 1

so that we can specify it twice in the -matrix accum- command.
We are going to end up calculating much more than we need,
but that is OK.  We specify (line 1)

  matrix accum Cs1 = age income ownrent selfemp cons /*
      */             age income expend  zhat    cons /*
      */             [iweight=...] , nocons

The first part of the varlist is X, the second part is W,
the diagonal part enters as weights, and we specify nocons
since we explicitly included the constants.  The results are

            X' Diag(...) X         X' Diag(...) W
            W' Diag(...) X         W' Diag(...) W

We just keep the lower left part of the result (line 2).
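
As a rough illustration of lines 1 and 2, the sketch below runs the
same -matrix accum- call on simulated data and then extracts the
lower-left block.  The simulated values, the seed, and the weight
variable d (a stand-in for the Diag(...) term, which is not spelled
out here) are assumptions made only for this example; they are not
taken from the article.  rnormal() and runiform() need a more recent
Stata than the one that was current in 2002.

```stata
* Simulated data; variable names follow the example, values are made up
clear
set seed 2002
set obs 200
foreach v in age income ownrent selfemp expend zhat {
    generate double `v' = rnormal()
}
generate double d = runiform()    // hypothetical stand-in for Diag(...)
generate byte cons = 1

* Line 1: accumulate [X W]' Diag(d) [X W], a 10 x 10 matrix
matrix accum Cs1 = age income ownrent selfemp cons /*
    */             age income expend  zhat    cons /*
    */             [iweight=d], nocons

* Line 2: keep the lower-left block W' Diag(d) X (rows 6-10, columns 1-5)
matrix Cs1 = Cs1[6..10, 1..5]
matrix list Cs1
```

With 5 variables in X and 5 in W, the full result is 10 x 10, so the
block W' Diag(d) X sits in rows 6 through 10 and columns 1 through 5.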

Now that we have the first term, we notice that the second
term looks like

   Diag(...) X

If we put the Diagonal(...) part into a new variable 
dd (line 3), we can then calculate the desired
vector as (line 4):

  matrix vecaccum Cs2 = dd age income ownrent selfemp cons, nocons

There is now one last detail to take care of.  The second 
term (now stored as Cs2) is really only added when we 
are taking derivatives of X after having taken derivatives
for the zhat component of W.  So, the row vector Cs2 is really 
a row of an otherwise zero matrix that is additively
conformable with Cs1.  In other words, we create a 5x5 zero
matrix and then set the fourth row to Cs2; zhat is the fourth
element of W above.  The remaining lines (lines 5 and 6) 
of code perform these manipulations.
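
The steps for the second term can be sketched in the same way.  This
is only one way to write lines 3 through 6, not necessarily the exact
code printed in the article: the data are simulated, d and dd hold
made-up values in place of the two Diag(...) terms, and the matrix
names Z and C are hypothetical.

```stata
* Simulated data; variable names follow the example, values are made up
clear
set seed 2002
set obs 200
foreach v in age income ownrent selfemp expend zhat {
    generate double `v' = rnormal()
}
generate double d = runiform()    // hypothetical weights for the first term
generate byte cons = 1

* First term (lines 1 and 2): lower-left block of [X W]' Diag(d) [X W]
matrix accum Cs1 = age income ownrent selfemp cons /*
    */             age income expend  zhat    cons /*
    */             [iweight=d], nocons
matrix Cs1 = Cs1[6..10, 1..5]

* Line 3: the diagonal part of the second term as a variable (made up here)
generate double dd = rnormal()

* Line 4: the 1 x 5 row vector dd'X
matrix vecaccum Cs2 = dd age income ownrent selfemp cons, nocons

* Lines 5 and 6: put Cs2 in row 4 (the zhat row) of a 5 x 5 zero
* matrix, then add that matrix to the first term
matrix Z = J(5, 5, 0)
matrix Z[4, 1] = Cs2
matrix C = Cs1 + Z
matrix list C
```

Assigning a matrix to Z[4,1] places Cs2 with its upper-left corner at
row 4, column 1, which fills the fourth row and leaves every other
element of Z at zero.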

So, only 6 lines instead of a double or triple nested loop with
lots of bookkeeping.  Read about the matrix commands.  Look
at the likelihood ado-files for Stata estimation commands for
other examples.  Make -accum- and -vecaccum- your new best friends.

> 2) how would the computation of the Sandwich estimator
> change if I compute robust standard errors by clustering the
> observations in the two main equations?

Probably, I should have included this extra formula.
Nevertheless, there is nothing unusual about this
modification.  Look at the definition of the partitioned B
matrix on page 256.  Introduce a sum for every parenthesized
element where the sums are over indices of the independent

Since there is no cluster-type Murphy-Topel estimator, there
is nothing to which I can compare the cluster sandwich
estimator.  So, there is no discussion of this in the paper.
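
One common reading of this modification is that each observation-level
term is replaced by its within-cluster sum before the outer products
are formed.  The sketch below shows that mechanic for a single block,
under assumptions that are not from the article: the cluster
identifier id, the variables s1 and s2 (stand-ins for per-observation
score contributions), and the simulated values are all hypothetical.

```stata
* Simulated data: 50 hypothetical clusters of 4 observations each
clear
set seed 2002
set obs 200
generate int id = ceil(_n/4)
generate double s1 = rnormal()    // hypothetical score contributions
generate double s2 = rnormal()

* Running sums within cluster; the last observation holds the cluster total
sort id
by id: generate double S1 = sum(s1)
by id: generate double S2 = sum(s2)
by id: generate byte last = (_n == _N)

* Outer products of the cluster totals, summed over clusters
matrix accum B = S1 S2 if last, nocons
matrix list B
```

Each block of the partitioned matrix would be handled the same way,
with the accumulation running over one row per cluster rather than
one row per observation.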

  -- James

James W. Hardin, Ph.D., Lecturer
Department of Statistics, Blocker 416G   979-845-3141 (phone)
Texas A&M University Mail Stop-3143      979-845-3144 (fax)
College Station, TX 77843-3143 