Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: Partha Deb <partha.deb@hunter.cuny.edu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Memory requirements for factor variables
Date: Mon, 03 May 2010 09:40:12 -0400

Austin,

cheers.

Partha

Austin Nichols wrote:

Partha--

I think you want to model your code on -fese- (ssc desc fese) or -felsdvreg- or -felsdvregdm- (findit felsdvreg). But can you give a more germane example? Do you really mean to create dummies based on an OR condition over 4 categorical variables (testing whether any of the four is a given level)? Do you need estimates for your 500 dummies, or do you just want to partial them out of the regression? The second is much easier than the first.

    forvalues i=1/100 {
        gen byte ID`i' = (D1==`i' | D2==`i' | D3==`i' | D4==`i')
    }

On Mon, May 3, 2010 at 9:23 AM, Partha Deb <partha.deb@hunter.cuny.edu> wrote:

Federico - that is definitely a solution I hadn't thought of. But, I do worry that the "simple" formula for the OLS estimate may not be optimal given the size of the dataset and potential scaling issues. I'm still holding out for a slick answer from the Stata gurus, but I might end up using yours. Thanks.

Partha

Federico Belotti wrote:

Partha, I think there is no way to do that in stata. An alternative could be mata. Clearly, you have to write down the ado for your econometric model. An example using OLS is below.

HTH
Federico

    ****** do *******
    clear all
    set mem 10m
    set more off
    set seed 123456
    set obs 100000

    mata
    real matrix factor_reg(rows,cols,d1,d2,d3,d4,x,y)
    {
        D = J(rows,cols,0)
        for(i=1;i<=cols;i++) {
            for(j=1;j<=rows;j++) {
                if (d1[j]==i | d2[j]==i | d3[j]==i | d4[j]==i) D[j,i]=1
            }
        }
        X = x,D,J(100000,1,1)
        Y = y
        beta = invsym(X'X)*(X'Y)
        beta
    }
    end

    gen x = rnormal()
    gen u = rnormal()
    gen int d = int(_n/1000)
    gen int d1 = int(_n/1100)
    gen int d2 = int(_n/1200)
    gen int d3 = int(_n/1300)
    gen int d4 = int(_n/1400)
    sum
    gen y = x + u
    describe,s
    regress y x i.d
    sum d
    tomata
    mata: factor_reg(100000,100,d1,d2,d3,d4,x,y)

    forvalues i=1/`r(max)' {
        gen byte Id`i' = (d1==`i' | d2==`i' | d3==`i' | d4==`i')
    }
    describe,s
    regress y x Id*
    exit

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

--
Partha Deb
Professor of Economics
Hunter College
ph: (212) 772-5435
fax: (212) 772-5398
http://urban.hunter.cuny.edu/~deb/

Emancipate yourselves from mental slavery
None but ourselves can free our minds.
- Bob Marley
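For readers following the thread, here is a minimal sketch of how the Mata approach quoted above might be written without the element-by-element double loop, and with the cross products accumulated in quad precision, which may ease part of the scaling worry raised in the thread. The function name factor_reg2, the argument nlev, and the small toy dataset are illustrative assumptions, not anything posted by the participants. Note also that Mata holds real matrices as 8-byte doubles, so with the thread's numbers (100,000 observations, 500 dummies) the matrix D alone would need roughly 400 MB, against about 50 MB for the same indicators stored as Stata byte variables; this sketch does not reduce that.

```stata
* Sketch only: factor_reg2, nlev, and the toy data below are assumptions.
clear all
set seed 123456
set obs 1000

gen double x = rnormal()
gen int d1 = ceil(_n/100)    // levels 1..10
gen int d2 = ceil(_n/125)    // levels 1..8
gen int d3 = ceil(_n/200)    // levels 1..5
gen int d4 = ceil(_n/250)    // levels 1..4
gen double y = x + rnormal()

mata:
real colvector factor_reg2(string scalar yname, string scalar xname,
                           string scalar dnames, real scalar nlev)
{
    real matrix    d, D, X
    real colvector y
    real scalar    i

    y = st_data(., yname)
    d = st_data(., tokens(dnames))    // one column per categorical variable
    D = J(rows(d), nlev, 0)
    for (i=1; i<=nlev; i++) {
        // 1 if ANY of the categorical variables equals level i in that row
        D[., i] = rowmax(d :== i)
    }
    X = st_data(., xname), D, J(rows(d), 1, 1)
    // quadcross() forms X'X and X'y in quad precision; invsym() zeroes out
    // rows and columns it judges collinear instead of failing
    return(invsym(quadcross(X, X)) * quadcross(X, y))
}
end

mata: b = factor_reg2("y", "x", "d1 d2 d3 d4", 10)
mata: b[1]    // coefficient on x
```

The first element of b is the coefficient on x; it can be checked against -regress y x Id*- after building the Id* indicators with the -forvalues- loop shown in the thread. Coefficients on individual dummies may differ between the two when the indicators are collinear, because -regress- and invsym() need not drop the same column.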

**Follow-Ups**:
- **Re: st: Memory requirements for factor variables** *From:* Austin Nichols <austinnichols@gmail.com>

**References**:
- **st: Memory requirements for factor variables** *From:* Partha Deb <partha.deb@hunter.cuny.edu>
- **Re: st: Memory requirements for factor variables** *From:* Federico Belotti <f.belotti@econometrics.it>
- **Re: st: Memory requirements for factor variables** *From:* Partha Deb <partha.deb@hunter.cuny.edu>
- **Re: st: Memory requirements for factor variables** *From:* Austin Nichols <austinnichols@gmail.com>
