


Re: st: Constructing matrix regressors with thousands of dummy variables

From   Matthew Baker <>
Subject   Re: st: Constructing matrix regressors with thousands of dummy variables
Date   Thu, 11 Apr 2013 09:27:25 -0400

Dear Ivan --

There are really two parts to your question - getting a large number
of variables into mata, and then using the matrices. The first part
can be handled by creating a local macro. Consider first making 1000
fictional variables:

// Start example
set obs 1000
forvalues i=1/1000 {
    gen a`i'=runiform()
}

// Read into Mata using a local

local vars
forvalues i=1/1000 {
    local vars "`vars' a`i'"
}

mata: st_view(X=.,.,"`vars'")

// Just check and see if everything is there!

mata: rows(X),cols(X)

// To compute inverse:

mata: C=invsym(X'X)

// end example
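
A possible variation on the example above, offered only as a sketch: Mata's cross() forms X'X without building a transposed copy of X first, which may help when X is a view onto many variables. The block below uses a much smaller fictional data set (200 observations, 50 variables) so that it runs quickly on its own, and it uses unab to expand a wildcard instead of a loop. The a* pattern assumes that every regressor, and nothing else in the data, starts with "a".

```stata
// Sketch only: small fictional data so the block runs on its own
clear
set obs 200
forvalues i = 1/50 {
    gen a`i' = runiform()
}

// Expand the wildcard into a full variable list
// (assumes all regressors, and only regressors, start with "a")
unab vars : a*

// View onto the data; no copy of the variables is made
mata: st_view(X=., ., "`vars'")

// cross(X, X) returns X'X without forming the transpose explicitly
mata: XX = cross(X, X)
mata: C = invsym(XX)

// Check the dimensions of the result
mata: rows(C), cols(C)
```

Whether this is enough to make the 10,000-dummy case feasible is a separate question; it only avoids one intermediate copy.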

I have, however, found that the transpose/multiplication and inversion
steps can a) take a very, very long time for matrices larger than
5000 x 5000, and b) also create memory problems. So I don't know if the
"invsym" step will work as you might hope!
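
To get a rough sense of the memory side, here is a back-of-the-envelope calculation, assuming 8 bytes per element for a real matrix and ignoring any overhead. The value k = 10000 is taken from the number of inventor dummies in the question; the names k and mb are just for illustration.

```stata
// Rough size of one k x k real matrix, at 8 bytes per element
mata: k = 10000
mata: mb = 8 * k^2 / 1e6
mata: mb
```

That is roughly 800 megabytes for X'X alone, before counting its inverse or X itself.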

Hope that helps,

Matt Baker

On Mon, Apr 8, 2013 at 10:56 PM, Ivan Png <> wrote:
> Dear Statalist
> I am researching the effects of various factors on mobility of
> inventors. The dependent variable, M_it is an indicator = 1 if
> inventor i changed employer in year t, else = 0.  The explanatory
> variables include marital status, education level, and citizenship.  I
> also include inventor, state, and year fixed effects.  Originally, I
> simply estimated a linear probability model by areg with
> absorb(inventor) and explanatory variables comprising married,
> bachelor, citizen, i.state and i.year.
> However, the measure of mobility is subject to error.  Obviously, this
> error cannot be classical.  For instance, if the observed M_it = 1 is
> wrong, then the true M_it = 0.  I would like to apply the method of
> Meyer and Mittag, U of Chicago (2012) to characterize the bias due to
> the error.
> For this, I need to calculate the conditional expectation of the
> matrix of explanatory variables, X, conditional on error in
> measurement of M_it.  I have two data-sets, one with measurement error
> and one without, so, I can identify the observations with error.
> Question:
> How to construct the matrix, X, and the inverse matrix, (X'X)^-1?  The
> online guides teach me how to construct a matrix when there are two or
> three explanatory variables.
> . mata
> : st_view(y=., ., "mobility")
> : st_view(X=., ., ("married", "bachelor", "citizen"))
> But, in my case, I have dummies for 10,000 inventors plus the state
> and year fixed effect.  The above method doesn't seem practical.
> Can I run areg and retrieve the X and inverse matrix, (X'X)^-1?
> May I please have your help?
>  -
> Best wishes
> Ivan Png
> Skype: ipng00

Dr. Matthew J. Baker
Department of Economics
Hunter College and the Graduate Center, CUNY
