
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Clarification requested about the at() option of -margins-


From   Richard Williams <[email protected]>
To   [email protected], [email protected]
Subject   Re: st: Clarification requested about the at() option of -margins-
Date   Thu, 24 Oct 2013 13:30:30 -0500

At 12:03 PM 10/24/2013, Trevor Zink wrote:
Thanks, Richard.

I had actually run across some of your materials before; they were helpful.
My actual problem is obviously more complex than the simple example I
illustrated with. At what point (of complexity) is -margins- no longer
"just plugging numbers into formulas"? In my actual problem I'm not
using interactions, but I am using multiple regressors and factor variables.

I'm not sure that it ever stops. If you have a bunch of other variables in the model, it may be plugging in means for them, or (if using asobserved) it may be doing calculations on a case by case basis and averaging them. If you give it a specific number it will do the calculation with that number.

Whenever you do these calculations, you can add a qualifier like "assuming the model is correct." The model may not be correct for all numbers, especially numbers that fall out of sample. For example, even if a model is correct for weights that fall between 2000 and 5000 pounds, there is no guarantee that it will be correct for, say, a 10,000 pound car. If you actually had 10,000 pound cars in your sample you might find that after 5,000 pounds the slope changes, or that you need an x^2 term, or whatever.

There are also other weird sorts of calculations that margins can do, like compare the predicted probability of success for a 70 year old who is retired with the predicted probability of success for an 18 year old who also happens to be retired.

So again I would say, you can plug in whatever numbers you want into a formula, but that doesn't mean the results will be right or sensible. But margins is plugging numbers into formulas. You have to think about whether the numbers you are feeding it are sensible.
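The difference between plugging in means and averaging case-by-case calculations can be checked by hand. The following is a minimal sketch, not part of the original exchange: it assumes a two-regressor logit on the auto data (the choice of weight and mpg as regressors is illustrative only), and the names phat, wbar, mbar, avgres and meanres are made up for this example.

```stata
sysuse auto, clear
logit foreign weight mpg

* Default (asobserved): predict for every car, then average the predictions
margins
matrix avgres = r(b)
predict double phat, pr
summarize phat, meanonly
display "Mean of case-by-case predictions = " r(mean)

* atmeans: plug the sample means of weight and mpg into the formula
margins, atmeans
matrix meanres = r(b)
quietly summarize weight, meanonly
scalar wbar = r(mean)
quietly summarize mpg, meanonly
scalar mbar = r(mean)
display "Prediction at the means = " invlogit(_b[_cons] + _b[weight]*wbar + _b[mpg]*mbar)

* A specific value for weight; mpg is left as observed and averaged over
margins, at(weight=3000)
```

The two displayed numbers should match the first two -margins- results. They generally differ from each other because the logistic function is nonlinear, so the prediction at the means is not the mean of the predictions.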


Thanks
Trevor

On 10/24/2013 10:24 AM, Richard Williams wrote:
At 11:01 AM 10/24/2013, Trevor Zink wrote:
Paul,

Thanks very much for your detailed answer. If I may ask a few follow-up questions to make sure I understand properly...

1) "-margins- converts log-odds (and their slopes) to probabilities (and their slopes) for us". I have spent a lot of time reading about -margins- over the past few weeks, and I can't recall ever seeing it explained like this. Is this really what -margins- does? Does it simply convert from log-odds back to probabilities? If so, that is great news: it makes the interpretation of the output much easier.

2) "Although the slope for the log-odds is fixed; that for the probability is not. As 0 and 1 are approached, the slope tends to 0, and the possible values and SE are also constrained". So what you're saying is that the slope going to 0 as the probability approaches 0 or 1 isn't due to any extrapolation, as I had assumed; it's simply a product of mapping onto the logistic function?

3) You used both -margins, at()- and -margins, dydx() at()-. My understanding of the difference after reading your answer is that -margins, at()- gives the /probability/ of Y==1 at the specified values of X, whereas -margins, dydx() at()- gives the /change in the probability/ of Y==1 from an infinitesimal change in X at the specified values of X. Correct?

(as a side note, I wouldn't have expected weight to predict foreign vs domestic as well as it does)

Thanks again for your answer.
Trevor

Trevor, particularly for a simple problem like yours, margins is just plugging numbers into formulas. So, for example, if you had a formula like

y = 2 + 3*x

you could plug in whatever value you wanted for x (including a totally nonsensical one, e.g. a negative value for weight) and you could get a value for y. Margins doesn't know or care whether the numbers are sensible or not. Sometimes it is realistic to go a bit outside the observed sample range, e.g. try a 5,000 pound car, but in this case it would be silly to go up to 100,000 pounds. But you have to figure that out, not margins.
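For the logit in Trevor's example, the formula in question is the inverse logit of the linear predictor. As a sketch (added for illustration, not from the original thread), the stored coefficients can be plugged in by hand and compared with what -margins, at()- reports. The weights 3000 and 10000 are arbitrary choices, the second one well outside the sample, and the names atres, p3000 and p10000 are hypothetical.

```stata
sysuse auto, clear
logit foreign weight

* What -margins- reports at two weights, one in sample and one far outside it
margins, at(weight=(3000 10000))
matrix atres = r(b)

* The same numbers by hand: Pr(foreign) = invlogit(b0 + b1*weight)
scalar p3000  = invlogit(_b[_cons] + _b[weight]*3000)
scalar p10000 = invlogit(_b[_cons] + _b[weight]*10000)
display "Pr(foreign | weight =  3000) = " p3000
display "Pr(foreign | weight = 10000) = " p10000
```

Nothing in the calculation refers to the observed data beyond the two estimated coefficients, which is why -margins- can return a value for a weight that no car in the sample has.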

If you want to know more about how margins works, see

http://www3.nd.edu/~rwilliam/xsoc73994/Margins01.pptx

http://www3.nd.edu/~rwilliam/xsoc73994/Margins02.pdf

http://www3.nd.edu/~rwilliam/xsoc73994/Margins03.pdf



On 10/24/2013 3:13 AM, Seed, Paul wrote:
Dear Statalist,
Trevor Zink asks why -margins- does not behave as he would expect following
logistic regression.

The answer is found only by going back to exactly what logistic regression
actually does, and how it compares to linear regression.

Linear regression is carried out under an assumption of constant slope, and therefore has no problem
in estimating the slope at any value of the predictors.
With a single predictor, the estimated slope does not change. (Point 1 of Stata output).

However, it is inappropriate for a binary outcome, as it can lead to estimated proportions below 0 or above 1.
(Point 2).

Logistic regression solves this by working with the log-odds, rather than the probability. There are no impossible values. Extreme log-odds correspond to probabilities close to 0 or 1. -margins- converts log-odds (and their slopes) to probabilities (and their slopes) for us. (Point 3)

Although the slope for the log-odds is fixed, that for the probability is not. As the probability approaches 0 or 1, its slope tends to 0, and the possible values and SE are also constrained. (Point 4)
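To make the changing slope concrete: for a logit with one predictor x and coefficient b, the probability is p = invlogit(b0 + b*x), and its derivative with respect to x is b*p*(1 - p). The sketch below (an illustration added here, not part of the original code) computes that quantity by hand at three arbitrary values of wt_tons and compares it with -margins, dydx()-. The names dres and phat are hypothetical.

```stata
sysuse auto, clear
gen wt_tons = weight/2240
logit foreign wt_tons

* Slope of the probability as reported by -margins-
margins, dydx(wt_tons) at(wt_tons=(1 2 20))
matrix dres = r(b)

* By hand: dp/dx = b * p * (1 - p), where p = invlogit(b0 + b*x)
foreach x in 1 2 20 {
    scalar phat = invlogit(_b[_cons] + _b[wt_tons]*`x')
    display "wt_tons = `x': p = " scalar(phat) ", dp/dx = " _b[wt_tons]*scalar(phat)*(1 - scalar(phat))
}
```

The factor p*(1 - p) goes to 0 as p approaches 0 or 1, which pulls the slope towards 0 at extreme weights. The delta-method standard error of the slope carries the same factor, which is consistent with the standard errors shrinking rather than growing far outside the observed range.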

Plotting the estimated values against weight reveals this quite clearly. (Point 5)

The code below uses Trevor's example (amended and expanded).

************** Begin Stata code*******************
set more off
sysuse auto, clear

gen wt_tons = weight/2240
* Change units to make results easier to understand
summarize wt_tons
* maximum weight is 2.1607 tons

regress foreign wt_tons
margins, dydx(wt_tons) at(wt_tons=(0(0.2)2 20))
* Point 1

margins, at(wt_tons=(0(0.2)2 ))
* Point 2

logit foreign wt_tons
margins, at(wt_tons=(0(0.2)2 ))
* Point 3

margins, dydx(wt_tons) at(wt_tons=(0(0.2)2 ))
* Point 4

predict Foreign if foreign
predict USA if !foreign

label var Foreign Foreign
label var USA USA
label var wt_tons "Car weight (tons)"

gr7 Foreign USA wt_tons, xlab(0 1 2) ylab(0 .5 1.0) l1title("Estimated probability of car being foreign")
* Point 5
**************** End Stata code *************

Best wishes,

Paul T Seed, Senior Lecturer in Medical Statistics,
Division of Women's Health, King's College London
Women's Health Academic Centre, King's Health Partners
(+44) (0) 20 7188 3642.



Date: Wed, 23 Oct 2013 23:21:13 -0700
From: Trevor Zink <[email protected]>
Subject: st: Clarification requested about the at() option of -margins-

Long-time lurker, first-time post. I couldn't find a good explanation in
the archives.

I'm confused about what, specifically, -margins- is doing with the at()
option, such that it can calculate margins for values of a variable that
don't exist in the data. To illustrate with an example:

sysuse auto
summarize weight //maximum weight is 4840
logit foreign weight  //nonsensical, but ok for the example
margins, dydx(weight) at(weight=(0(1000)10000 100000))

Here I ask for the slope of the function at a variety of weights from 0
to 10,000 and also 100,000. The maximum weight observed in the data is 4840.

My understanding of -margins- with at() was that it calculates the slope
of the function holding the specified variables constant at the
specified levels. But if the specified level doesn't appear in the data,
how can Stata determine what the slope is at this value? Ok, it's
clearly extrapolating, but based on what information? The only other
information included in the above model is a constant. When I try the
above but specifying the nocons option to -logit- Stata returns an
error, so it must be forecasting based on the constant; but specifically
how?

What's even more strange to me is that the standard errors *shrink* as
the estimates extend beyond the observed data. If Stata is forecasting
based on only the constant this seems counter-intuitive to me.

Thanks, and sorry if this is silly.

Trevor Zink

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

--
Trevor Zink, MBA, MA
Ph.D. Candidate
UC Regents Special Fellow
Bren School of Environmental Science and Management
University of California, Santa Barbara
[email protected] <mailto:[email protected]>

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  [email protected]
WWW:    http://www.nd.edu/~rwilliam

