

From: Matthew Baker <matthew.baker@hunter.cuny.edu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Constructing matrix regressors with thousands of dummy variables
Date: Thu, 11 Apr 2013 09:27:25 -0400

Dear Ivan --

There are really two parts to your question - getting a large number of variables into mata, and then using the matrices. The first part can be handled by creating a local macro. Consider first making 1000 fictional variables:

    // Start example
    clear*
    set obs 1000
    forvalues i=1/1000 {
        gen a`i'=runiform()
    }

    // Read into Mata using a local
    local vars
    forvalues i=1/1000 {
        local vars "`vars' a`i'"
    }
    mata: st_view(X=.,.,"`vars'")

    // Just check and see if everything is there!
    mata: rows(X),cols(X)

    // To compute inverse:
    mata: C=invsym(X'X)
    // end example

I have, however, found that the transpose/multiplication and inversion steps can a) take a very, very long time for matrices larger than 5000X5000, and b) also create memory problems. So I don't know if the "invsym" step will work as you might hope!

Hope that helps,

Matt Baker

On Mon, Apr 8, 2013 at 10:56 PM, Ivan Png <iplpng@gmail.com> wrote:
> Dear Statalist
>
> I am researching the effects of various factors on mobility of
> inventors. The dependent variable, M_it is an indicator = 1 if
> inventor i changed employer in year t, else = 0. The explanatory
> variables include marital status, education level, and citizenship. I
> also include inventor, state, and year fixed effects. Originally, I
> simply estimated a linear probability model by areg with
> absorb(inventor) and explanatory variables comprising married,
> bachelor, citizen, i.state and i.year.
>
> However, the measure of mobility is subject to error. Obviously, this
> error cannot be classical. For instance, if the observed M_it = 1 is
> wrong, then the true M_it = 0. I would like to apply the method of
> Meyer and Mittag, U of Chicago (2012) to characterize the bias due to
> the error.
>
> For this, I need to calculate the conditional expectation of the
> matrix of explanatory variables, X, conditional on error in
> measurement of M_it. I have two data-sets, one with measurement error
> and one without, so, I can identify the observations with error.
>
> Question:
> How to construct the matrix, X, and the inverse matrix, (X'X)^-1? The
> online guides teach me how to construct a matrix when there are two or
> three explanatory variables.
>
> . mata
> : st_view(y= , , "mobility")
> : st_view(X= , , "married", "bachelor", "citizen")
>
> But, in my case, I have dummies for 10,000 inventors plus the state
> and year fixed effect. The above method doesn't seem practical.
>
> Can I run areg and retrieve the X and inverse matrix, (X'X)^-1?
>
> May I please have your help?
>
> -
> Best wishes
> Ivan Png
> Skype: ipng00
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/

--
Dr. Matthew J. Baker
Department of Economics
Hunter College and the Graduate Center, CUNY
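As a rough illustration of the memory concern raised in the reply: a 10,000 x 10,000 matrix of doubles takes about 800 MB (10,000 x 10,000 x 8 bytes), before counting X itself or any temporary copies. The sketch below is one possible variation on the example, not the method from the original message. It assumes the same fictional variables `a1`, `a2`, ... (only 200 of them here, to keep it quick), builds the variable list with `unab` instead of a loop, and uses Mata's `cross()` so that X'X is formed without explicitly creating the transpose. Whether this is enough for a problem with 10,000 dummies is untested here.

```stata
// Sketch only: variable names and sizes are illustrative assumptions
clear all
set obs 1000
forvalues i = 1/200 {
    generate a`i' = runiform()
}

// Expand the wildcard into a full list of variable names
unab vars : a*

mata:
// View onto the data; tokens() turns the macro into a string rowvector
st_view(X = ., ., tokens(st_local("vars")))

// cross(X, X) computes X'X without materializing X' as a separate matrix
XX = cross(X, X)
C  = invsym(XX)

// Check dimensions, then check that C is (numerically) the inverse of X'X
rows(C), cols(C)
mreldif(C * XX, I(cols(X)))
end
```

If the last number is not close to zero, X'X is probably not of full rank (for example, because of collinear dummies), in which case `invsym()` returns a generalized inverse rather than a true inverse.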

**Follow-Ups**: **Re: st: Constructing matrix regressors with thousands of dummy variables** *From:* Nick Cox <njcoxstata@gmail.com>

**References**: **st: Constructing matrix regressors with thousands of dummy variables** *From:* Ivan Png <iplpng@gmail.com>
