Re: st: Regression Across Two Groups


From   Maarten Buis <maartenlbuis@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Regression Across Two Groups
Date   Wed, 14 Dec 2011 11:12:02 +0100

On Wed, Dec 14, 2011 at 10:36 AM, Muhammad Anees wrote:
> Yes, the earnings (the dependent variable) is a group variable where
> groups represent different levels of earnings (0-5000, 5001-10000, and
> so on). I can treat such types of variables with confidence using GLM
> type regression, but I was concerned with what techniques are
> available to compare different regressions.

As I said before, just add the appropriate dummies and/or
interactions. Consider the example at the end of this message. In this
case we want to compare foreign and domestic cars. The expected price
of a domestic car with median mileage and repair status is 5,419
dollars. This price increases by (1.16-1)*100% = 16% if the car is
foreign. This is the difference(*) in constants between a regression
on only foreign cars and a regression on only domestic cars. For
domestic cars a unit increase in
repair status leads to a non-significant (.96-1)*100% = -4% change in
price, while a unit increase in mileage leads to a significant
(.92-1)*100% = -8% change in price. The effect of repair status is
multiplied by 1.16, i.e. it is a non-significant (1.16-1)*100% = 16%
larger, if the car is foreign, and the effect of mileage is multiplied
by 1.05, i.e. it is a just significant (1.05-1)*100% = 5% larger, if
the car is foreign. Because both ratios are above 1, the effects are
less negative for foreign cars than for domestic cars. The latter two
ratios reflect the difference(*) between the regression coefficients
you would get if you estimated separate models for foreign and
domestic cars.
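
As a rough sketch of that last point (the scalar names b_dom and b_for
are made up for illustration), one could fit the two separate models
and compare the ratio of their exponentiated mileage coefficients with
the interaction term from the pooled model. The point estimates should
agree up to convergence tolerance. The standard errors need not agree,
because the pooled model estimates a single scale parameter for both
groups.

*------------ begin example ----------------
sysuse auto, clear
gen cmpg   = mpg - 20
gen crep78 = rep78 - 3

// separate models for domestic (foreign==0) and foreign (foreign==1) cars
glm price crep78 cmpg if foreign == 0, link(log)
scalar b_dom = _b[cmpg]
glm price crep78 cmpg if foreign == 1, link(log)
scalar b_for = _b[cmpg]

// ratio of the two exponentiated mileage coefficients
display "ratio from separate models: " exp(b_for)/exp(b_dom)

// pooled model with interactions: the interaction term is that same ratio
glm price i.foreign##(c.crep78 c.cmpg), link(log)
display "interaction term, eform:    " exp(_b[1.foreign#c.cmpg])

// effect of a unit increase in mileage for foreign cars, as a ratio
lincom cmpg + 1.foreign#c.cmpg, eform
*------------ end example -------------------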

In this example I chose two continuous variables (well, I treated rep78
as continuous, which is doubtful, but I did not want to interpret too
many parameters). However, this works in the same way if you have
categorical or ordered independent variables. Just precede those
variables with -i.- instead of -c.-.
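
A sketch with a categorical predictor might look like this. The
indicator goodrep is a hypothetical variable, created here only for
illustration (rep78 of 4 or 5 versus 1 to 3). It is not part of the
main example.

*------------ begin example ----------------
sysuse auto, clear
gen cmpg = mpg - 20

// hypothetical indicator: 1 if rep78 is 4 or 5, 0 if rep78 is 1 to 3,
// missing if rep78 is missing
gen byte goodrep = rep78 >= 4 if !missing(rep78)

// same structure as before, but goodrep enters with -i.-
glm price i.foreign##(i.goodrep c.cmpg), ///
    link(log) eform
*------------ end example -------------------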

(*) Notice that "difference" refers here to a quantification of
"unequalness" rather than a difference in the mathematical sense, i.e.
it is not one parameter minus the other parameter but rather the ratio
of the two (exponentiated) parameters. This is a general property of
models using a log link: they compare groups via ratios rather than
differences.
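
To make the ratio-versus-difference point concrete, here is a small
sketch (the scalar names p_dom and p_for are made up for illustration).
It computes the expected price of a domestic and a foreign car at the
reference values and shows that the exponentiated coefficient of
foreign is their ratio, not their difference.

*------------ begin example ----------------
sysuse auto, clear
gen cmpg   = mpg - 20
gen crep78 = rep78 - 3
glm price i.foreign##(c.crep78 c.cmpg), link(log)

// expected price at the reference values (crep78 = 0, cmpg = 0)
scalar p_dom = exp(_b[_cons])
scalar p_for = exp(_b[_cons] + _b[1.foreign])

display "ratio:      " p_for/p_dom        // same as the next line
display "exp(b):     " exp(_b[1.foreign])
display "difference: " p_for - p_dom      // in dollars, not a model parameter
*------------ end example -------------------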

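*------------ begin example ----------------
sysuse auto, clear

// center mpg at its median (20), so that the constant refers to
// a car with median mileage
gen cmpg = mpg - 20

// center rep78 at its median (3)
gen crep78 = rep78 - 3

// main effects of foreign, crep78 and cmpg plus their interactions
// with foreign; eform displays exponentiated coefficients (ratios)
glm price i.foreign##(c.crep78 c.cmpg), ///
    link(log) eform
*------------ end example -------------------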
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany


http://www.maartenbuis.nl
--------------------------