



Re: st: Areg, absorb


From   Maarten buis <maartenbuis@yahoo.co.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Areg, absorb
Date   Mon, 11 Apr 2011 14:03:16 +0100 (BST)

--- On Mon, 11/4/11, emanuele mazzini wrote:
> do you know a way to not omit the variables that the
> command xi i.varname generates? I tried with the option
> noomit, but it seems that it does not work, i.e. it
> still keeps on omitting the first country of my sample.

Imagine you have two countries, Aistan and Bland, and that we 
want to predict a variable y. Let's first understand what 
happens when we omit one of the dummies. In this case 
assume we use one dummy variable called bland, which is 1
when the country is Bland and 0 when it is not Bland (and 
thus Aistan). In that case we omitted the dummy aistan.

In this case we have the following equation:
y_hat = b0 + b1 * bland 

If the country is Bland, then its predicted value is
y_hat = b0 + b1 * 1 = b0 + b1

If the country is Aistan, then its predicted value is
y_hat = b0 + b1 * 0 = b0 

So the constant b0 is the predicted y for Aistan, and b1
is the difference in predicted y between the two countries
(Bland minus Aistan).
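
A small numeric check of this interpretation, as a sketch in Python
rather than Stata. The y values are made up for illustration. The
coefficients are computed from the group means, which is what least
squares gives for a model that contains only a constant and one dummy.

```python
# Hypothetical data: y values for the two countries (made up for illustration).
y_aistan = [1.0, 2.0, 3.0]
y_bland = [4.0, 6.0, 8.0]


def mean(values):
    return sum(values) / len(values)


# With only a constant and the dummy bland, least squares reproduces
# the two group means, so the coefficients follow directly from them.
b0 = mean(y_aistan)                  # predicted y for Aistan (bland = 0)
b1 = mean(y_bland) - mean(y_aistan)  # Bland minus Aistan


def y_hat(bland):
    """Predicted y from the one-dummy equation y_hat = b0 + b1 * bland."""
    return b0 + b1 * bland


print("b0 =", b0, " b1 =", b1)
print("Aistan:", y_hat(0), " Bland:", y_hat(1))
```

The prediction for Aistan equals b0 and the prediction for Bland
equals b0 + b1, matching the two equations above.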

What will happen when we also include the dummy aistan?
In this case we have the following equation:
y_hat = b0 + b1 * bland + b2 * aistan

If the country is Bland, then its predicted value is
y_hat = b0 + b1 * 1 + b2 * 0 = b0 + b1

If the country is Aistan, then its predicted value is
y_hat = b0 + b1 * 0 + b2 * 1 = b0 + b2

So now there are three parameters to represent two 
predicted values, which means that the parameters are
not all identified. For example, we could think that b0 is
2; then b1 is the predicted y - 2 for Bland and b2
is the predicted y - 2 for Aistan. Or we could think
that b0 is 3; then b1 is the predicted y - 3 for
Bland and b2 is the predicted y - 3 for Aistan. You
can see that you can get exactly the same 
predictions for different values of b0, just by 
adjusting the two remaining parameters. There is
thus no way to distinguish the fit of these 
different models. 
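
The same point as a short Python sketch (not Stata). The two
predicted values are made up; for each arbitrary choice of b0 the
other two parameters are adjusted as described above, and the
predictions do not change.

```python
# Hypothetical predicted values for the two countries (made up).
pred_bland = 6.0
pred_aistan = 2.0


def y_hat(b0, b1, b2, bland, aistan):
    """Predicted y from y_hat = b0 + b1 * bland + b2 * aistan."""
    return b0 + b1 * bland + b2 * aistan


results = []
for b0 in (2.0, 3.0, -10.0):      # any value of b0 works
    b1 = pred_bland - b0          # predicted y for Bland minus b0
    b2 = pred_aistan - b0         # predicted y for Aistan minus b0
    fitted = (y_hat(b0, b1, b2, 1, 0), y_hat(b0, b1, b2, 0, 1))
    results.append(fitted)
    print("b0 =", b0, " b1 =", b1, " b2 =", b2, " (Bland, Aistan) =", fitted)
```

Every row shows different parameters but the same pair of
predictions, so the data cannot tell these parameter sets apart.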

In order to be able to estimate the model you must
constrain one of the parameters. By default we 
constrain the parameter of one of the dummies to
be 0 (i.e. we just exclude that variable from our
model). Alternatively we could constrain the 
constant to be 0, with the -nocons- option.
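
To make the two constraints concrete, here is a Python sketch (not
Stata output) that uses made-up predicted values. It writes down the
parameters implied by each constraint and checks that both give the
same predictions.

```python
# Hypothetical predicted values for the two countries (made up).
pred_bland = 6.0
pred_aistan = 2.0

# Constraint 1 (the default): drop the dummy aistan, i.e. b2 = 0.
drop_dummy = {"b0": pred_aistan, "b1": pred_bland - pred_aistan, "b2": 0.0}

# Constraint 2: keep both dummies but force the constant to 0, i.e. b0 = 0.
no_constant = {"b0": 0.0, "b1": pred_bland, "b2": pred_aistan}


def predict(p, bland, aistan):
    """Predicted y from y_hat = b0 + b1 * bland + b2 * aistan."""
    return p["b0"] + p["b1"] * bland + p["b2"] * aistan


for name, p in (("drop aistan", drop_dummy), ("no constant", no_constant)):
    print(name, p, "Bland:", predict(p, 1, 0), "Aistan:", predict(p, 0, 1))
```

Without a constant, each dummy's coefficient is simply the predicted
y for that country; with the default, b1 is a difference relative to
the omitted country. The predictions are identical either way.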

Anyhow, from your previous question I gathered
that you are not interested in these effects; you
even want to suppress the display of these variables.
In that case I would just stick to the default, since all
these models are mathematically equivalent anyhow.
But if you are substantively interested in the 
effects of these variables, then dropping the constant
instead can sometimes be a really nice trick that helps
the interpretation of your model. Notice, however, that
this does not change your model, just the way it is displayed.

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------


