# Re: st: Memory requirements for factor variables

From: Partha Deb
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Memory requirements for factor variables
Date: Mon, 03 May 2010 09:23:22 -0400

Federico - that is definitely a solution I hadn't thought of. But, I do worry that the "simple" formula for the OLS estimate may not be optimal given the size of the dataset and potential scaling issues. I'm still holding out for a slick answer from the Stata gurus, but I might end up using yours. Thanks.
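One way to make the scaling concern concrete, as a minimal sketch and assuming the regressors fit in a Mata matrix: Mata's `quadcross()` accumulates X'X and X'y in quad precision and can append the constant itself, instead of forming `X'X` in double precision from an explicitly augmented matrix. The helper name `ols_quadcross` and the toy data are hypothetical and not from this thread.

```stata
* Sketch only: ols_quadcross() is a hypothetical helper, not from the thread
clear all
set seed 123456
set obs 1000
gen x = rnormal()
gen y = 1 + 2*x + rnormal()

mata:
real colvector ols_quadcross(real matrix X, real colvector y)
{
    real matrix    XX
    real colvector Xy

    // xc = 1 appends a constant column to X; zc = 0 leaves y as is
    XX = quadcross(X, 1, X, 1)
    Xy = quadcross(X, 1, y, 0)
    return(invsym(XX)*Xy)
}
end

* coefficient on x comes first, the constant last; compare with -regress-
mata: ols_quadcross(st_data(., "x"), st_data(., "y"))
regress y x
```

If X'X is still badly conditioned, `qrsolve()` on the augmented regressor matrix may be worth trying, at the cost of extra memory.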
Partha

Federico Belotti wrote:

Partha,

I think there is no way to do that in Stata. An alternative could be Mata. Clearly, you would have to code your econometric model yourself. An example using OLS is below.

HTH

Federico

```
******  do *******
clear all
set mem 10m
set more off

set seed 123456

set obs 100000

mata
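real matrix factor_reg(rows,cols,d1,d2,d3,d4,x,y) {

    // D[j,i] = 1 if observation j takes level i in any of d1-d4
    D = J(rows,cols,0)
    for(i=1;i<=cols;i++) {
        for(j=1;j<=rows;j++) {
            if (d1[j]==i | d2[j]==i | d3[j]==i | d4[j]==i) D[j,i]=1
        }
    }
    // regressors: x, the dummies, and a constant in the last column
    X = x,D,J(rows,1,1)
    Y = y
    beta = invsym(X'X)*(X'Y)
    return(beta)
}
end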

gen x = rnormal()
gen u = rnormal()
gen int d = int(_n/1000)
gen int d1 = int(_n/1100)
gen int d2 = int(_n/1200)
gen int d3 = int(_n/1300)
gen int d4 = int(_n/1400)

sum

gen y = x + u

describe,s

regress y x i.d

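sum d
* save the maximum now, since later commands may overwrite r()
local maxd = r(max)

* tomata is a user-written command (available from SSC); it makes each
* variable available in Mata under the same name
tomata
mata: factor_reg(100000,100,d1,d2,d3,d4,x,y)

forvalues i=1/`maxd' {
    gen byte Id`i' = (d1==`i' | d2==`i' | d3==`i' | d4==`i')
}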

describe,s

regress y x Id*

exit

```
