# Re: st: Programming a slightly complex list of independent variables

From: Nick Cox
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Programming a slightly complex list of independent variables
Date: Wed, 6 Apr 2011 09:25:04 +0100

```
This has come up several times in the last few weeks.  One basic technique is

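* Expand s* into the full list of matching variable names.
unab sall : s*
* Remove the variable named in local macro x (the current loop variable).
local sall : list sall - x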

Cox, N. J. 2010. Stata tip 91: Putting unabbreviated varlists into
local macros. Stata Journal 10(3): 503-504.

On Wed, Apr 6, 2011 at 5:51 AM, Nic <3817629@hotmail.ca> wrote:

> I am attempting to create a .do file which will run a number of OLS
> regressions containing a single continuous x continuous interaction term.
>
> My ultimate question is: how can I program my regression command so that the
> s* and f* variables at the end of the command refer to all "s" and "f"
> variables EXCEPT for the two specific "s" and "f" (`x' and `z') variables
> referenced at the beginning of the equation?
>
> Here is the applicable code of what I have so far:
>
> -------------------------------------------------------------------------
> foreach y of varlist d* {
> local laby : variable label `y'
>   foreach x of varlist s*  {
>   local labx : variable label `x'
>   local prex = substr("`x'",1,3)
>       foreach z of varlist f* {
>       local labz : variable label `z'
>       local prez = substr("`z'",1,2)
>
>           regress `y' `x' `z' i`prex'`prez' g* c* e* s* f*
> --------------------------------------------------------------------------
>
> As you can see, the inclusion of s* and f* at the end of the equation will
> result in two variables being repeated in the equation: `x' and `z'. The
> consequence is that one instance of the repeated variables is omitted
> because of collinearity.
>
> I would assume that the second instance (s* or f*) of the repeated variable
> in the equation would be the one that is omitted, but this is not always so.
> Sometimes it is the first instance (`x' or `z'). Apparently this is normal
> ("Which variable it omits is somewhat arbitrary") according to the Stata
> FAQ, "Why do estimation commands sometimes omit variables?" located at
> www.stata.com/support/faqs/stat/drop.html.
>
> The consequence of the above is that the location of the values in the e(b)
> and e(V) matrices is unpredictable. This is a problem for me because the
> next step in my .do file is to call upon the first and second independent
> variables listed in the regression command as well as their interaction term
> (to ultimately create a graph):
>
> ----------------------
> matrix b=e(b)
> matrix V=e(V)
>
> scalar b1=b[1,1]
> scalar b2=b[1,2]
> scalar b3=b[1,3]
>
>
> scalar varb1=V[1,1]
> scalar varb2=V[2,2]
> scalar varb3=V[3,3]
>
> scalar covb1b3=V[1,3]
> scalar covb2b3=V[2,3]
> -----------------------
>
> As you can see, when the second instance of the repeated variables is
> omitted, b1/b2/b3 etc refer to a valid cell in the matrix. But when the
> first instance is "somewhat arbitrarily" omitted instead, b1/b2/b3 etc no
> longer refer to the intended cells in the matrix.
>
> So my ultimate question is: how can I program my regression command so that
> the s* and f* variables at the end of the command refer to all "s" and "f"
> variables EXCEPT for the two specific "s" and "f" (`x' and `z') variables
> referenced at the beginning of the equation? Logic tells me that this is
> surely possible but I am still so new to Stata and programming in particular
> that I simply have not been able to suss it out.

```
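Applied to the nested loops in the question, the technique from the reply might look like the sketch below. It is a minimal illustration on made-up data, not the poster's actual do-file. The variable names (d1, s1-s3, f1, f2, g1), the macro names `srest` and `frest`, and the temporary interaction variable are all assumptions introduced here; the original code uses pre-existing interaction variables named i`prex'`prez' and further covariates (c*, e*). The sketch also reads coefficients by name via _b[] and rownumb()/colnumb(), which is one way to avoid depending on column positions in e(b) and e(V).

```stata
* Made-up data: hypothetical stand-ins for the poster's d*, s*, f*, g* variables.
clear
set seed 12345
set obs 200
foreach v in d1 s1 s2 s3 f1 f2 g1 {
    generate `v' = rnormal()
}

* Expand the wildcards once, before the loops.
unab sall : s*
unab fall : f*

foreach y of varlist d* {
    foreach x of local sall {
        * All s* variables except the current `x'.
        local srest : list sall - x
        foreach z of local fall {
            * All f* variables except the current `z'.
            local frest : list fall - z

            * Interaction built on the fly for illustration only
            * (assumption: the poster already has such variables).
            tempvar xz
            generate `xz' = `x' * `z'

            * No variable appears twice, so nothing is dropped as a duplicate.
            regress `y' `x' `z' `xz' g* `srest' `frest'

            * Look up results by variable name rather than by position.
            scalar b1 = _b[`x']
            scalar b2 = _b[`z']
            scalar b3 = _b[`xz']
            matrix V = e(V)
            scalar varb3   = V[rownumb(V, "`xz'"), colnumb(V, "`xz'")]
            scalar covb1b3 = V[rownumb(V, "`x'"),  colnumb(V, "`xz'")]
            scalar covb2b3 = V[rownumb(V, "`z'"),  colnumb(V, "`xz'")]

            drop `xz'
        }
    }
}
```

Because `x', `z' and the interaction are listed first and no longer reappear in the s* and f* lists, the positional indexing in the question (b[1,1], b[1,2], b[1,3]) should also line up as intended; the name-based lookups are simply a more defensive alternative.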