
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Is this the right code if I want to compare group 1 vs group 4 in a logistic regression model?


From   Alfonso Sanchez-Penalver <alfonso.statalist@gmail.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: Is this the right code if I want to compare group 1 vs group 4 in a logistic regression model?
Date   Wed, 4 Dec 2013 11:02:10 -0500

No problem Laura. I hope my explanation helped.

Alfonso Sanchez-Penalver

> On Dec 4, 2013, at 10:48 AM, "Meems, LMG" <l.m.g.meems@umcg.nl> wrote:
> 
> Nick and Alfonso,
> 
> Both, thank you for the answers. In the future, I'll watch out for the 'little' words. Hopefully, next time, it will lead to less confusion and discussion.
> 
> Regards,
> 
> Laura
> 
> -----Original message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Alfonso Sánchez-Peñalver
> Sent: Wednesday, 4 December 2013 16:35
> To: Stata List
> Subject: Re: st: Is this the right code if I want to compare group 1 vs group 4 in a logistic regression model?
> 
> Hi Laura,
> 
> I agree with Nick that you asked the question backwards, which is why I was confused. In your model (logit or ordered logit) the response variable is the categorical variable, and the explanatory variables would be age and sex. Logit or ordered logit lets you estimate the probabilities (or the log odds) of belonging to any of the categories. What you will be able to find out is, for example, whether an increase in age increases (or decreases) the probability of belonging to the highest or the lowest group. So let's say your response variable was income and you had broken it down into groups that make sense to you because they represent different social classes. You would expect income to rise as people get older, and thus you would expect the higher income groups to contain older people than the lower income groups, in general. Logit or ordered logit would allow you to estimate by how much the probability of belonging to one group increases (or decreases) when age increases by, let's say, 1 year. Thus in my example we would expect the probabilities of belonging to the higher income groups to increase with age, and the probabilities of belonging to the lower income groups to decrease with age.
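> 
> As a rough sketch of the income example in Stata, assuming hypothetical variable names (incgrp for the ordered income group coded 0-3, age in years, sex coded 0/1; none of these appear in the thread), the predicted probability of a group could be traced over age as follows:
> 
> ```stata
> * Hypothetical variables: incgrp (0 = lowest ... 3 = highest), age, sex
> ologit incgrp age i.sex
> 
> * Average predicted probability of the lowest and of the highest group
> * at ages 30, 40, 50 and 60
> margins, at(age=(30(10)60)) predict(outcome(0))
> margins, at(age=(30(10)60)) predict(outcome(3))
> ```
> 
> If the estimated age coefficient is positive, the first table should fall with age and the second should rise. This assumes Stata 11 or later, where margins and factor-variable notation are available.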
> 
> Ordered logit simply takes into account the natural ranking of the categories. In the income example, belonging to a higher income group means more than simply being in that category: it means that your income is higher, which is extra information. Ordered logit captures this.
> 
> Best regards,
> 
> Alfonso
> 
>> On Dec 4, 2013, at 10:12 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>> 
>> Usually the wrong way round: in your example, age and sex are
>> predefined or given, and the question is what they imply.
>> 
>> Sometimes this is causal (as a male I could never have had babies) but
>> more commonly it is a matter of association (e.g. implications of age
>> for experience or stamina).
>> 
>> The effects _on_ age and sex of anything are limited, I believe, to
>> what can be done surgically.
>> 
>> This point may be just a consequence of your choosing the wrong small
>> words, but as you are likely to be writing in English it is important
>> to get this straight.
>> 
>> On ordered logit, come on please! Typing -search ordered logit- in
>> Stata shows that you are sitting right by several resources.
>> 
>> Nick
>> njcoxstata@gmail.com
>> 
>> 
>>> On 4 December 2013 14:54, Meems, LMG <l.m.g.meems@umcg.nl> wrote:
>>> Hi Alfonso,
>>> 
>>> Thank you for the answer. I'm sorry my question was so confusing; I'll try to explain it once again.
>>> 
>>> What I want to know (and I thought the logistic regression model was best suited to answer this) is how belonging to a certain group (let's say low vs high) results in effects on age and sex. These are just 2 examples; in my model I have plenty of other variables which I also want to test.
>>> For example, whether people in the lower group differ significantly in age and sex from people in the higher group.
>>> 
>>> Btw, I'm not familiar with ordered logit. What is it exactly?
>>> 
>>> Best,
>>> 
>>> Laura Meems
>>> 
>>> -----Original message-----
>>> From: owner-statalist@hsphsun2.harvard.edu
>>> [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Alfonso
>>> Sanchez-Penalver
>>> Sent: Wednesday, 4 December 2013 15:36
>>> To: statalist@hsphsun2.harvard.edu
>>> Subject: Re: st: Is this the right code if I want to compare group 1 vs group 4 in a logistic regression model?
>>> 
>>> Hi Laura,
>>> 
>>> You mention you break up a continuous variable into four categories and then use a logit regression. I believe in this case ordered logit would be more appropriate, since the categories follow the natural order of the continuous variable.
>>> 
>>> Having said that, I am a bit confused about your main question. You say "I want to compare the lowest group (0) with the highest group (3) and the effects on age and sex". I thought the groups were the response variable, because the logit model would allow you to calculate effects on belonging to one group or another. Did you mean you want to know the difference between the effects that age and sex have on the probability of belonging to the lowest group and on the probability of belonging to the highest group? If so, or something similar, you can use margins after the ordered logit regression to estimate the effect of any variable of interest on the probability of belonging to each group, and then take the difference for the groups you want.
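>>> 
>>> A minimal sketch of that suggestion, assuming Y is the four-category variable coded 0-3 as in the original post and that sex is coded 0/1 (the coding of sex is an assumption, not something stated in the thread):
>>> 
>>> ```stata
>>> * Assumed: Y (0 = lowest ... 3 = highest), age in years, sex coded 0/1
>>> ologit Y age i.sex
>>> 
>>> * Average marginal effects of age and sex on Pr(Y == 0) and on Pr(Y == 3)
>>> margins, dydx(age sex) predict(outcome(0))
>>> margins, dydx(age sex) predict(outcome(3))
>>> ```
>>> 
>>> The two tables can then be compared row by row. This needs Stata 11 or later for margins and factor-variable notation.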
>>> 
>>> Sorry if I misunderstood your message, please let me know if my interpretation is what you were after.
>>> 
>>> Best,
>>> 
>>> Alfonso Sanchez-Penalver
>>> 
>>>> On Dec 4, 2013, at 9:15 AM, "Meems, LMG" <l.m.g.meems@umcg.nl> wrote:
>>>> 
>>>> Hello Statalisters,
>>>> 
>>>> After a couple of days filled with Stata and database work, I really need a check on whether what I'm doing is right.
>>>> 
>>>> At the moment I'm looking at the predicted effect of a continuous variable (Y) on a couple of other parameters.
>>>> I decided to split the continuous variable into 4 groups, following its clinical reference values (e.g. sufficient, insufficient etc.).
>>>> 
>>>> After this step I wanted to fit this variable in a regression model, using logistic regression (as I thought that dividing it into groups turned the continuous variable into a categorical one). So far, so good.
>>>> 
>>>> However, let's say I now want to compare the lowest group (0) with the highest group (3) and the effects on age and sex.
>>>> The code I used to do this is:
>>>> char Y[omit] 3
>>>> xi: logit i.Y age sex
>>>> 
>>>> This gave coefficients for age and sex, but also 2 omitted terms, namely group 1 and group 2, with the note that group 1 and group 2 != 0 predict failure perfectly.
>>>> 
>>>> So this result made me doubt the code. Is this the right code to use, and what exactly do these 2 omitted terms mean? Are they simply a consequence of the code I wrote (that would be the good scenario), or is something wrong that I should correct for (or should I even correct the code)?
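>>>> 
>>>> One reading of this output, offered as a sketch rather than as the list's verdict: logit takes the first variable after the command name as the response, so once xi expands i.Y into indicator variables, the indicator for group 0 would play the role of the response and the indicators for groups 1 and 2 would become predictors. Anyone in group 1 or 2 cannot be in group 0, which would be consistent with the "predicts failure perfectly" notes. If the aim is to contrast group 0 with group 3, something like the following might be closer (the new variable high and the 0/1 coding of sex are assumptions):
>>>> 
>>>> ```stata
>>>> * Assumed: Y holds the four groups coded 0-3; sex is coded 0/1
>>>> * high = 1 for group 3, 0 for group 0, missing for groups 1 and 2
>>>> generate byte high = (Y == 3) if inlist(Y, 0, 3)
>>>> 
>>>> * Observations with missing high (groups 1 and 2) are dropped automatically
>>>> logit high age i.sex
>>>> ```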
>> 
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/
> 
> 
> 
