st: Programming a slightly complex list of independent variables

From	"Nic" <[email protected]>
To	<[email protected]>
Subject	st: Programming a slightly complex list of independent variables
Date	Wed, 6 Apr 2011 00:51:55 -0400

Hi statalist

I am attempting to create a .do file which will run a number of OLS regressions containing a single continuous x continuous interaction term.

My ultimate question is: how can I program my regression command so that the s* and f* variables at the end of the command refer to all "s" and "f" variables EXCEPT for the two specific "s" and "f" (`x' and `z') variables referenced at the beginning of the equation?


Here is the applicable code of what I have so far:

-------------------------------------------------------------------------
foreach y of varlist d* {
    local laby : variable label `y'
    foreach x of varlist s* {
        local labx : variable label `x'
        local prex = substr("`x'",1,3)
        foreach z of varlist f* {
            local labz : variable label `z'
            local prez = substr("`z'",1,2)

            * s* and f* below also match `x' and `z', so each appears twice
            regress `y' `x' `z' i`prex'`prez' g* c* e* s* f*

            * (the post-estimation steps shown further down go here)
        }
    }
}
--------------------------------------------------------------------------

As you can see, the inclusion of s* and f* at the end of the equation will result in two variables being repeated in the equation: `x' and `z'. The consequence is that one instance of each repeated variable is omitted because of collinearity.

I would assume that the second instance (s* or f*) of the repeated variable in the equation would be the one that is omitted, but this is not always so. Sometimes it is the first instance (`x' or `z'). Apparently this is normal ("Which variable it omits is somewhat arbitrary") according to the Stata FAQ, "Why do estimation commands sometimes omit variables?" located at www.stata.com/support/faqs/stat/drop.html.

The consequence of the above is that the location of the values in the e(b) and e(V) matrices is unpredictable. This is a problem for me because the next step in my .do file is to call upon the first and second independent variables listed in the regression command as well as their interaction term (to ultimately create a graph):


----------------------
matrix b=e(b)
matrix V=e(V)

scalar b1=b[1,1]
scalar b2=b[1,2]
scalar b3=b[1,3]


scalar varb1=V[1,1]
scalar varb2=V[2,2]
scalar varb3=V[3,3]

scalar covb1b3=V[1,3]
scalar covb2b3=V[2,3]
-----------------------

As you can see, when the second instance of the repeated variables is omitted, b1/b2/b3 etc. refer to a valid cell in the matrix. But when the first instance is "somewhat arbitrarily" omitted instead, b1/b2/b3 etc. no longer refer to the intended cells in the matrix.
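
A possible way to make the lookups independent of column position is to refer to coefficients and variances by variable name rather than by index. The following is only a sketch under assumptions: it uses the auto dataset that ships with Stata, and the variables price, weight, length, mpg and the hand-made interaction wl stand in for `y', `x', `z' and the interaction term; they are not part of the original do-file.

```stata
* Illustrative stand-ins only: weight = `x', length = `z', wl = interaction
sysuse auto, clear
generate double wl = weight * length

regress price weight length wl mpg

* Coefficients by name: unaffected by where the columns end up
scalar b1 = _b[weight]
scalar b2 = _b[length]
scalar b3 = _b[wl]

* Variances and covariances by row/column name instead of position
matrix V = e(V)
scalar varb1   = V[rownumb(V, "weight"), colnumb(V, "weight")]
scalar varb2   = V[rownumb(V, "length"), colnumb(V, "length")]
scalar varb3   = V[rownumb(V, "wl"),     colnumb(V, "wl")]
scalar covb1b3 = V[rownumb(V, "weight"), colnumb(V, "wl")]
scalar covb2b3 = V[rownumb(V, "length"), colnumb(V, "wl")]
```

Inside the loops, the literal names would be replaced by `x', `z' and the interaction variable's name. Name-based lookup only tells you where a column is; it does not by itself resolve which copy of a duplicated variable was omitted, so it is best combined with a varlist that contains no duplicates.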

So my ultimate question is: how can I program my regression command so that the s* and f* variables at the end of the command refer to all "s" and "f" variables EXCEPT for the two specific "s" and "f" (`x' and `z') variables referenced at the beginning of the equation? Logic tells me that this is surely possible, but I am still so new to Stata and programming in particular that I simply have not been able to suss it out.
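
One approach that may do this, sketched here on made-up data: expand s* and f* once into local macros with unab, then remove the current `x' or `z' with the macro list subtraction operator before each regression. Everything named below (d1, s_a, s_b, s_c, f_a, f_b, g1, the locals slist, flist, sothers, fothers, and the interaction built on the fly) is an assumption for illustration; the real do-file would keep its own precomputed i`prex'`prez' variables and its g* c* e* controls.

```stata
* Hypothetical toy data mimicking the d*, s*, f*, g* naming scheme
clear
set seed 12345
set obs 200
foreach v in d1 s_a s_b s_c f_a f_b g1 {
    generate double `v' = rnormal()
}

* Expand the wildcards once, outside the loops
unab slist : s*
unab flist : f*

foreach y of varlist d* {
    foreach x of local slist {
        * every "s" variable except the current `x'
        local sothers : list slist - x
        foreach z of local flist {
            * every "f" variable except the current `z'
            local fothers : list flist - z

            * interaction generated here only to keep the sketch self-contained
            tempvar xz
            generate double `xz' = `x' * `z'

            * `x', `z' and the interaction are always columns 1 to 3
            regress `y' `x' `z' `xz' g* `sothers' `fothers'
            drop `xz'
        }
    }
}
```

Because no variable is listed twice, nothing should be dropped for that reason, and the first three columns of e(b) and e(V) correspond to `x', `z' and their interaction in every pass.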


With gratitude,

Nic

Follow-Ups:
- Re: st: Programming a slightly complex list of independent variables
  - From: Nick Cox <[email protected]>
