
# Re: st: Constructing matrix regressors with thousands of dummy variables

| From | To | Subject | Date |
| --- | --- | --- | --- |
| Matthew Baker | statalist@hsphsun2.harvard.edu | Re: st: Constructing matrix regressors with thousands of dummy variables | Thu, 11 Apr 2013 09:27:25 -0400 |

```Dear Ivan --

There are really two parts to your question: getting a large number
of variables into Mata, and then using the resulting matrices. The
first part can be handled by building the variable list in a local
macro. Consider first making 1000 fictional variables:

// Start example
clear*
set obs 1000
forvalues i=1/1000 {
    gen a`i'=runiform()
}

// Read into Mata using a local

// start from an empty macro, then append each variable name
local vars
forvalues i=1/1000 {
    local vars "`vars' a`i'"
}

mata: st_view(X=.,.,"`vars'")

// Just check and see if everything is there!

mata: rows(X),cols(X)

// To compute the inverse of X'X:

mata: C=invsym(X'X)

// end example

I have found, however, that the transpose/multiplication and inversion
steps can a) take a very long time for matrices larger than about
5000 x 5000, and b) run into memory problems. So I don't know whether
the "invsym" step will work as you hope!

Hope that helps,

Matt Baker

On Mon, Apr 8, 2013 at 10:56 PM, Ivan Png <iplpng@gmail.com> wrote:
> Dear Statalist
>
> I am researching the effects of various factors on mobility of
> inventors. The dependent variable, M_it is an indicator = 1 if
> inventor i changed employer in year t, else = 0.  The explanatory
> variables include marital status, education level, and citizenship.  I
> also include inventor, state, and year fixed effects.  Originally, I
> simply estimated a linear probability model by areg with
> absorb(inventor) and explanatory variables comprising married,
> bachelor, citizen, i.state and i.year.
>
> However, the measure of mobility is subject to error.  Obviously, this
> error cannot be classical.  For instance, if the observed M_it = 1 is
> wrong, then the true M_it = 0.  I would like to apply the method of
> Meyer and Mittag, U of Chicago (2012) to characterize the bias due to
> the error.
>
> For this, I need to calculate the conditional expectation of the
> matrix of explanatory variables, X, conditional on error in
> measurement of M_it.  I have two data-sets, one with measurement error
> and one without, so, I can identify the observations with error.
>
> Question:
> How to construct the matrix, X, and the inverse matrix, (X'X)^-1?  The
> online guides teach me how to construct a matrix when there are two or
> three explanatory variables.
>
> . mata
> : st_view(y= , , "mobility")
> : st_view(X= , , "married", "bachelor", "citizen")
>
> But, in my case, I have dummies for 10,000 inventors plus the state
> and year fixed effect.  The above method doesn't seem practical.
>
> Can I run areg and retrieve the X and inverse matrix, (X'X)^-1?
>
>
>  -
> Best wishes
> Ivan Png
> Skype: ipng00
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/

--
Dr. Matthew J. Baker
Department of Economics
Hunter College and the Graduate Center, CUNY
```
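The two steps in the reply (building the variable list, then forming and inverting X'X) can also be sketched a little more compactly. This is only an illustration, not the method from the message: the variable names `a1`..`a200` and the sizes are hypothetical, `unab` stands in for the macro-building loop, and Mata's `cross()` stands in for `X'X` so that the transpose of X never has to be stored separately.

```stata
// Hypothetical illustration data: 2000 observations, regressors a1..a200
clear all
set obs 2000
forvalues i = 1/200 {
    generate a`i' = runiform()
}

// unab expands the wildcard into the full variable list, so no loop is needed
unab vars : a*

mata:
// a view refers to the Stata data in place instead of copying it into Mata
st_view(X = ., ., tokens(st_local("vars")))
// cross(X, X) computes X'X without storing the transpose of X
XX = cross(X, X)
C  = invsym(XX)
// sanity check: C * XX should be close to the identity matrix
mreldif(C * XX, I(cols(X)))
end
```

Neither change removes the scale problem raised in the reply. With 10,000 inventor dummies, X'X alone is a 10,000 x 10,000 matrix of 8-byte doubles, roughly 800 MB, before `invsym()` allocates its result.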