


Re: st: Memory requirements for factor variables

From   Partha Deb
Subject   Re: st: Memory requirements for factor variables
Date   Mon, 03 May 2010 09:23:22 -0400

Federico - that is definitely a solution I hadn't thought of. But, I do worry that the "simple" formula for the OLS estimate may not be optimal given the size of the dataset and potential scaling issues. I'm still holding out for a slick answer from the Stata gurus, but I might end up using yours. Thanks.
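On the scaling worry, one possible sketch (not from the thread) is to center the regressors before forming the cross products and to accumulate them in quad precision with Mata's quadcrossdev(). The function name ols_dev is hypothetical, and it assumes X holds the regressors without a constant column; the intercept is recovered from the means afterwards.

```stata
clear
set seed 123456
set obs 1000
gen double x = rnormal()
gen double y = 1 + 2*x + rnormal()

mata:
// Hypothetical helper, not part of the original post.
// OLS slopes from deviation-from-mean cross products; intercept from the means.
real colvector ols_dev(real matrix X, real colvector y)
{
	real rowvector xbar
	real scalar    ybar
	real colvector b

	xbar = mean(X)          // 1 x k row vector of column means
	ybar = mean(y)
	// (X - xbar)'(X - xbar) and (X - xbar)'(y - ybar), summed in quad precision
	b = invsym(quadcrossdev(X, xbar, X, xbar)) * quadcrossdev(X, xbar, y, ybar)
	return(b \ (ybar - xbar*b))   // slopes first, intercept last
}
end

// Slopes then intercept; should agree with -regress y x- up to rounding
mata: ols_dev(st_data(., "x"), st_data(., "y"))
regress y x
```

Centering removes the constant column from the matrix being inverted, which tends to help when regressors have large means relative to their spread. Whether it is enough for a dataset of this size is something to check against regress on a subsample.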


Federico Belotti wrote:

I don't think there is a way to do that in Stata. An alternative could be Mata. Clearly, you would have to write the ado-file for your econometric model yourself. An example using OLS is below.



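******  do *******
clear all
* ignored by Stata 12 and later
set mem 10m
set more off

set seed 123456

set obs 100000

mata:
// OLS of y on x, cols indicator columns, and a constant.
// Column i of D is 1 where any of d1-d4 equals i.
real matrix factor_reg(rows,cols,d1,d2,d3,d4,x,y) {

	D = J(rows,cols,0)
	for(i=1;i<=cols;i++) {
		for(j=1;j<=rows;j++) {
			if (d1[j]==i | d2[j]==i | d3[j]==i | d4[j]==i) D[j,i]=1
		}
	}
	X = x,D,J(rows,1,1)
	Y = y
	beta = invsym(X'X)*(X'Y)
	return(beta)
}
end

gen x = rnormal()
gen u = rnormal()
gen int d = int(_n/1000)
gen int d1 = int(_n/1100)
gen int d2 = int(_n/1200)
gen int d3 = int(_n/1300)
gen int d4 = int(_n/1400)


gen y = x + u


regress y x i.d

sum d
* save r(max) now so later commands cannot overwrite it
local dmax = r(max)

* pass the Stata variables to Mata as column vectors
mata: factor_reg(100000,100,st_data(.,"d1"),st_data(.,"d2"),st_data(.,"d3"),st_data(.,"d4"),st_data(.,"x"),st_data(.,"y"))

forvalues i=1/`dmax' {

gen byte Id`i' = (d1==`i' | d2==`i' | d3==`i' | d4==`i')

}

regress y x Id*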


Partha Deb
Professor of Economics
Hunter College
ph:  (212) 772-5435
fax: (212) 772-5398

Emancipate yourselves from mental slavery
None but ourselves can free our minds.
	- Bob Marley
