
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: RE: regression r(103): too many variables


From   Paul Higgins <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   RE: st: RE: regression r(103): too many variables
Date   Wed, 24 Feb 2010 10:01:20 -0600

Hi all,

Thanks for all of your suggestions: they were a big help.  My code contained an error that is probably a classic newbie misstep: misusing hyphens in variable lists.  The rhs of my regression contained thousands of interactions between sets of dummy variables.  For example, 96 dummies representing quarter-hour time increments were interacted with 22 dates of special importance to the problem I was investigating, giving 96 x 22 = 2112 interaction terms for that one pair of variable sets alone.  To construct these, I used code of the following form:

/*****************************/
/* generate separate dummies */
/* for each event date       */
/*****************************/

#delimit ;
local eventdates "mdy(1,13,2009)  mdy(2,20,2009)  mdy(3,27,2009)
                  mdy(4,10,2009)  mdy(4,17,2009)  mdy(5,18,2009)
                  mdy(5,23,2009)  mdy(5,24,2009)  mdy(6,30,2009)
                  mdy(7,1,2009)   mdy(7,9,2009)   mdy(8,14,2009)
                  mdy(8,15,2009)  mdy(9,16,2009)  mdy(9,18,2009)
                  mdy(9,19,2009)  mdy(10,3,2009)  mdy(11,2,2009)
                  mdy(11,3,2009)  mdy(12,7,2009)  mdy(12,8,2009)
                  mdy(12,9,2009)";
#delimit cr
local c = 1
foreach x of local eventdates {
      gen byte dum_`c' = (dt==`x')
      local c = `c' + 1
      }

/************************************/
/* interact each event date dummy w/*/
/* each quarter-hour interval dummy */
/************************************/

forvalues x = 1/96 {
        forvalues y = 1/22 {
                gen byte dum_`y'_int_`x' = dum_`y'*int_`x'
                }
        }

Because of the way I nested the two loops, the variables weren't created in the sequence assumed by the hyphenated lists in my regress statement.  I am a recent arrival in Stata-world (having been born in SAS-land, and having emigrated here via several intermediate stops along the way), and in most other stats programs I've worked with, a single hyphen in a list of this type (i.e., dum_1_int_1-dum_1_int_96) would be expanded in logical sequence (i.e., dum_1_int_1 dum_1_int_2 ...).  But Stata expands it in the physical order in which the variables appear in the data set (i.e., dum_1_int_1 dum_2_int_1 ...).  Since the inner loop runs over the 22 dates, dum_1_int_1-dum_1_int_96 spans everything stored between those two names: 95 x 22 + 1 = 2091 variables rather than the 96 I intended.  Thus, my regressions contained far more than 2500 rhs variables -- mostly redundant ones!  Once I replaced the hyphenated lists in the regress statement with wild-card versions (e.g., dum_1_int_*), all was well.
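
For anyone who wants to see the difference on a small scale, here is a minimal sketch, not my actual code.  It assumes a toy data set with only 3 interval dummies and 2 event-date dummies (made-up sizes and made-up dummy definitions), keeps the same loop nesting as above, and uses -unab- to show what each kind of list expands to:

```stata
clear
set obs 10

* toy stand-ins (hypothetical): 3 interval dummies, 2 event-date dummies
forvalues x = 1/3 {
        gen byte int_`x' = (mod(_n, 3) == `x' - 1)
        }
forvalues y = 1/2 {
        gen byte dum_`y' = (mod(_n, 2) == `y' - 1)
        }

* same nesting as the original: intervals outside, dates inside,
* so the storage order is dum_1_int_1 dum_2_int_1 dum_1_int_2 ...
forvalues x = 1/3 {
        forvalues y = 1/2 {
                gen byte dum_`y'_int_`x' = dum_`y'*int_`x'
                }
        }

* hyphen: every variable stored between the two endpoints (5 here)
unab hyph : dum_1_int_1-dum_1_int_3
display "hyphen:   `hyph'"

* wildcard: only the names that match the pattern (3 here)
unab wild : dum_1_int_*
display "wildcard: `wild'"
```

Swapping the nesting (dates outside, intervals inside) would also make the hyphenated lists behave as intended, but the wildcard version does not depend on storage order at all.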

Thanks again for your assistance.

Paul H.

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Martin Weiss
Sent: Wednesday, February 24, 2010 1:59 AM
To: [email protected]
Subject: AW: st: RE: regression r(103): too many variables


<> 

Andi may want to use 


*************
des, short
*************

to prevent clutter on his screen.


HTH
Martin

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of
[email protected]
Sent: Wednesday, February 24, 2010 6:13 AM
To: [email protected]
Subject: Re: st: RE: regression r(103): too many variables

Verify that you actually have 2500 variables, possibly by running
-des- on the variable list.
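
One possible way to do that check, sketched on the auto data that ships with Stata (the rhs list here is purely hypothetical, not the poster's), is to expand the list with -unab- and count the words before handing it to -regress-:

```stata
sysuse auto, clear

* hypothetical rhs list mixing a hyphen range and a single name
local rhs "mpg-trunk weight"

* expand the list exactly as Stata will read it
unab expanded : `rhs'
local k : word count `expanded'
display "rhs expands to `k' variables: `expanded'"

* compare against the number of distinct names to spot repeats
local distinct : list uniq expanded
local kd : word count `distinct'
display "`kd' distinct names"

* look at the variables themselves
describe `expanded'
```

If the count is far above what was intended, the hyphen ranges are the first thing to inspect.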

Steve
--- Paul Higgins
> I am trying to use regress to run a linear regression.  The
> specification has a lot of rhs variables (around 2500), the
> majority of which are binary (0/1) variables.  <snip> I am
> getting r(103), "Too many variables specified".


On Tue, Feb 23, 2010 at 1:08 PM, Martin Weiss <[email protected]> wrote:
>
> <>
>
>
> This runs w/o a hitch in Stata 10.1 MP. Takes something like 2 minutes:
>
> *******
> clear*
> set mem 500m
> set obs 13700
>
> foreach var of newlist var1-var2500{
>                gen byte `var'=runiform()<.3
> }
>
> gen y=rnormal()
> reg y var1-var2500
> *******
>
>
> HTH
> Martin
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Paul Higgins
> Sent: Tuesday, February 23, 2010 9:28 PM
> To: '[email protected]'
> Subject: st: regression r(103): too many variables
>
> Hi all,
>
> I am trying to use regress to run a linear regression.  The specification
> has a lot of rhs variables (around 2500), the majority of which are binary
> (0/1) variables.  The data set contains about 13700 observations.  At the
> top of the .do file I set mem to 5 gigabytes, maxvar to 10000 and matsize
> to 10000.  I'm using Stata / SE 10.1 for Windows, under Windows XP
> Professional x64 edition version 5.2, on a machine that has 8 gigabytes of
> physical memory on-board.  I am getting r(103), "Too many variables
> specified".  I've poked around the documentation, and I can see no mention
> of any internal limits to the regress command regarding number of
> variables.  Thus, I have assumed that only the general limits for Stata SE
> apply: maximum of 32767 variables, maximum matsize of 11000.  But I appear
> to be wrong.
>
> Suggestions, please?
>
> PaulH
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



-- 
Steven Samuels
[email protected]
18 Cantine's Island
Saugerties NY 12477
USA
845-246-0774

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


