Re: Re: st: problem with marginal effect after running a logit regression
Rieza Soelaeman <firstname.lastname@example.org>
Mon, 30 Jul 2012 12:56:25 -0500
Also, if you use the atmeans option, the estimation sets the other variables
at the ***mean value*** for that variable. If your variables are coded
0/1, the mean value is the proportion of people in your
dataset having that characteristic.
Suppose tabulating the educational categories in your dataset gives:
Educ      n    prop
Low     100    0.25
Medium  200    0.50
High    100    0.25
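A quick way to confirm that the mean of a 0/1 dummy is the category share is sketched below on the nlsw88 data shipped with Stata. The variable educ and the dummies ed1-ed3 are hypothetical stand-ins, not the variables from this thread.

```stata
* Stand-in data; educ is a hypothetical 3-level recode of years of schooling
sysuse nlsw88, clear
recode grade (0/11 = 1 "Low") (12 = 2 "Medium") (13/18 = 3 "High"), generate(educ)

* One 0/1 dummy per level: ed1 (low), ed2 (medium), ed3 (high)
tabulate educ, generate(ed)

* The mean of each dummy equals the share of the sample in that category
summarize ed2 ed3

* Dummies entered as plain variables, as in the thread; the "at" legend
* in the margins output lists the means that atmeans plugs in
logit union ed2 ed3 age
margins, dydx(ed2 ed3) atmeans
```

The means reported by summarize should match the values shown for ed2 and ed3 in the margins legend when both are computed on the same estimation sample.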
For the estimation, Stata will use 0.50 for mstudymid and 0.25 for
mstudyhigh. How do we interpret what that average characteristic
"means" (no pun intended)?
On Mon, Jul 30, 2012 at 12:30 PM, Rieza Soelaeman <email@example.com> wrote:
> Hi Jeremy,
> My only other advice is to be careful and understand what you are
> asking of Stata when you run these options for the margins command.
> When you ask for dydx(varlist) atmeans, Stata calculates the marginal
> effect of **going from 0 to 1 for those variables** only if they are
> entered as factor variables (i. prefix); otherwise it reports a
> derivative (read the table footnotes Stata generates). As written below, your model still does
> not allow you to estimate the marginal effect of going from medium to
> high education, but compares medium with reference and high with
> reference.
> I urge you also to discuss the output with your advisor to make sure
> it makes sense (and that you did what he asked you to do)--that's what
> advisors are for, after all!
> On Mon, Jul 30, 2012 at 3:22 AM, Jeremy Franklin <firstname.lastname@example.org> wrote:
>> Hi Rieza,
>> First of all, thank you for considering my problem and for your detailed answer, which shed light on the issue I was facing.
>> My advisor told me to use the mfx command at median values for all the characteristics in my model.
>> As you pointed out, using the "old" mfx command was not the right choice, since "mfx continues to work but does not support factor variables" (cf. Stata Help).
>> Nevertheless, I finally found (with the precious help of some statalisters) the command to compute the marginal effects for my logit model, namely:
>> margins, dydx(mstudymid mstudyhigh mhomme mchiefwageearner mage28_37
>> mage38_47 mage48_57 mage58 mintpollow mintpolmid mintpolhigher mpolleft
>> mpolright mincomemid mincomehigh) atmeans
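Since the advisor asked for medians and atmeans uses means, a variant evaluated at the medians might look like the sketch below. It runs on the nlsw88 data shipped with Stata; the outcome and covariates are stand-ins, not the variables from this thread, and the 0/1 variable is entered without the i. prefix, as in the model above.

```stata
* Stand-in data and model; industry is used only to illustrate clustering
sysuse nlsw88, clear
logit union grade age married, vce(cluster industry)

* Marginal effects with every covariate held at its sample median
margins, dydx(*) at((median) _all)
```

For a 0/1 variable the median is either 0 or 1, so this evaluates the effect for a concrete profile and not for an "average" person with fractional characteristics.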
>> I also computed the marginal effects for 5 more models with and without some control variables in order to determine when the effect is the highest.
>> Regarding S002 and S003, these are also control variables. They are, respectively, the country of the respondents and the wave in which the respondents were interviewed, which lets me include country, wave, and country-wave fixed effects in my model. I did not need to know the specific marginal effects of these variables, and it appears that with the previous command these were not computed.
>> Further comments on this method are more than welcome.
>> Thank you again for your help, Rieza.
>>>Your advisor is correct that the coefficients of a logistic regression
>>>cannot be interpreted in the same way as OLS. Using the margins
>>>command allows for an estimation of the marginal effect (e.g. the
>>>increase in probability of your outcome = 1, here I assumed outcome is
>>>binary). One question for you: when your advisor said "at median,"
>>>did he mean at median values for all the characteristics in your
>>>model, or just the median level of education?
>>>If the specific effect of interest is going from mstudymid to
>>>mstudyhigh, I would suggest making mstudymid the reference category in
>>>your set of dummy variables for education. Here I assume you have
>>>mstudylow as the reference (excluded) category. If you make mstudymid
>>>your reference, then the marginal effect of mstudyhigh would be the
>>>marginal effect of going from mstudymid to mstudyhigh. Similarly, the
>>>marginal effect of mstudylow would be the marginal effect of going
>>>from mstudymid to mstudylow.
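A minimal sketch of changing the reference category with factor-variable notation (Stata 11 or later) is shown below. It uses the nlsw88 data shipped with Stata, and educ is a hypothetical three-level variable standing in for the education dummies in this thread.

```stata
* Stand-in data; educ: 1 = low, 2 = medium, 3 = high (hypothetical recode)
sysuse nlsw88, clear
recode grade (0/11 = 1 "Low") (12 = 2 "Medium") (13/18 = 3 "High"), generate(educ)

* ib2. makes level 2 (medium) the base category
logit union ib2.educ age

* For factor variables, dy/dx is the discrete change from the base level:
* low vs. medium and high vs. medium, with age held at its mean
margins, dydx(educ) atmeans
```

Entering education as one factor variable also keeps the levels mutually exclusive, which separate dummies entered as plain variables do not guarantee when margins sets them to their means.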
>>>Typically, if your predictors are continuous, it makes sense to have
>>>Stata calculate marginal effects at the mean of each of your
>>>predictors. This can be achieved by executing the following command
>>>after running your regression:
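A minimal sketch of such a command, assuming Stata 11 or later and using the auto data shipped with Stata as a stand-in (the model below is illustrative, not the one from this thread):

```stata
* Stand-in data and model with continuous predictors
sysuse auto, clear
logit foreign mpg weight

* Marginal effects of all covariates, each held at its sample mean
margins, dydx(*) atmeans

* In older versions, -mfx- evaluates at the means by default:
* mfx
```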
>>>However, because your predictors are categorical (or if you are using
>>>a version of Stata before Stata 11), you may be able to get away with
>>>specifying criteria for the "typical" individual in your dataset for
>>>which you are calculating the marginal effect. Then justify the
>>>choices you made in describing the "typical" individual.
>>>For example, in your dataset, the "typical" individual may be a
>>>35-year-old male who is a chief wage earner, with high education,
>>>mintpol = "mid", mpol = "right", and mincome = "high." In that case, the
>>>command you would run would be something like:
>>>mfx, at(mstudymid=0 mstudyhigh=1 mhomme=1 mchiefwageearner=1 mage28_37=1
>>>mage38_47=0 mage48_57=0 .............. mincomehigh=1)
>>>*Note the ........... means you should assign a 0 or 1 value for your
>>>categorical predictors as appropriate to describe your person.
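The same "typical individual" idea can be sketched with -margins- in Stata 11 or later. The example uses the nlsw88 data shipped with Stata; the profile (age 35, married, not in the South) and the variable educ are hypothetical stand-ins for the characteristics listed above.

```stata
* Stand-in data; educ: 1 = low, 2 = medium, 3 = high (hypothetical recode)
sysuse nlsw88, clear
recode grade (0/11 = 1 "Low") (12 = 2 "Medium") (13/18 = 3 "High"), generate(educ)

logit union ib2.educ age i.married i.south

* Effect of education (relative to medium) for one fully specified person
margins, dydx(educ) at(age=35 married=1 south=0)
```

Every covariate other than the one in dydx() is fixed by at(), so the result describes a person who could actually exist in the data.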
>>>I see there are several variables in your dataset that could benefit
>>>from being continuous, though. If age were continuous, you could simply
>>>plug in the average age (from any of the univariate commands you can
>>>use to obtain the mean of a variable). Same thing with income. I think
>>>it would make your regression more robust to use the continuous versions.
>>>Of course using this method (with -mfx-) is complicated by the
>>>clustering in your data and the interactions between the cluster
>>>variables S003 and S002 (it appears to me these are polychotomous
>>>categorical variables, as you have used the i. in adding them to your
>>>regression). Because I don't know what they represent and how many
>>>levels of each there are, I am not sure how they would be specified in
>>>the -mfx- command. Do you absolutely need to know the marginal effect
>>>of each of those clusters, or were they included just so you can
>>>control for them? If you included them just to control for them,
>>>consider using -xtmelogit- (mixed effects logit) instead, and specify
>>>S003 and S002 for random intercept calculation.
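If the two grouping variables are only controls and fixed effects are kept, one alternative is to drop the xi: prefix and use factor-variable notation, which -margins- understands. This is a sketch under that assumption, using the nlsw88 data shipped with Stata; south and smsa are stand-ins for the two grouping variables, and industry is used only to illustrate clustering.

```stata
* Stand-in data; i.south##i.smsa expands to both main effects
* plus their interaction, replacing xi: with i.a*i.b
sysuse nlsw88, clear
logit union grade age i.south##i.smsa, vce(cluster industry)

* Marginal effects only for the covariates of interest;
* the control terms are held at their means
margins, dydx(grade age) atmeans
```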
>>>*I invite other statalisters to correct me if I have said something in error
>>>On Thu, Jul 26, 2012 at 2:17 PM, Jeremy Franklin <email@example.com> wrote:
>>>> Dear all,
>>>> Here is my little trouble:
>>>> For my master's thesis I decided to test the role of education level in assessing the importance of fighting inflation.
>>>> Here is my final regression formula:
>>>> xi: logit mfirstchoice mstudymid mstudyhigh mhomme mchiefwageearner mage28_37 mage38_47 mage48_57 mage58 mintpollow mintpolmid mintpolhigher mpolleft mpolright mincomemid mincomehigh i.s003 i.s002 i.s003*i.s002, vce(cluster s003)
>>>> I have the results, but my thesis coordinator told me that the results of a logit regression cannot be interpreted like the coefficients of a linear regression. Therefore, he suggested that I check the marginal effects at the median in order to see the marginal effect of an individual going from mstudymid to mstudyhigh.
>>>> I googled everything and tried hundreds of commands, both with mfx and margins, but I still cannot find the correct one to interpret my results.
>>>> Can ANYONE help me please.
>>>> PS: a robustness test included in my thesis uses the following command (this time with ologit):
>>>> xi: ologit minflation mstudymid mstudyhigh mhomme mchiefwageearner mage28_37 mage38_47 mage48_57 mage58 mintpollow mintpolmid mintpolhigher mpolleft mpolright x047 i.s003 i.s002 i.s003*i.s002, vce(cluster s003)
>>>> * For searches and help try:
>>>> * http://www.stata.com/help.cgi?search
>>>> * http://www.stata.com/support/statalist/faq
>>>> * http://www.ats.ucla.edu/stat/stata/