# Re: st: Test selection for effects of a number of variables on a dependent variable

From: Nick Cox
To: "statalist@hsphsun2.harvard.edu"
Subject: Re: st: Test selection for effects of a number of variables on a dependent variable
Date: Wed, 12 Jun 2013 11:49:38 +0100

```
Several confusions here.

Quite what you mean by "factor variable" is unclear (I surmise that
you may be confusing Stata with R, for example), but, regardless,
-regress- allows any kind of numeric variable to be the response, even
down to the level of a binary variable. How far that is a good idea
depends on the circumstances and Stata trusts you to make a sensible
decision.

There is no limitation whereby -mlogit- is restricted to 3 outcomes.
What prompted that? Numerous different outcomes may be difficult or
even impossible to model comfortably or at all, but that's a different
matter.

What's crucial is exactly what "stage of diagnosis" means. It could,
for all I know, be a graded or ordinal scale, in which case I would try
-ologit- first.
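
As a minimal sketch only, assuming hypothetical variable names -stage-,
-ethnicity-, -isolation- and -deprivation- (none of them taken from the
original post), and assuming -stage- is coded so that higher values
mean a later stage, that might look like:

    * ordered logit with the categorical predictors as factor variables
    ologit stage i.ethnicity i.isolation i.deprivation
    * joint test of the ethnicity indicators
    testparm i.ethnicity
    * if the stages turn out not to be ordered, the multinomial analogue
    mlogit stage i.ethnicity i.isolation i.deprivation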

I have to recommend _much_ more reading before analysis. These are
basic misunderstandings of statistics and indeed Stata. The textbooks
of Alan Agresti are excellent choices.

Nick
njcoxstata@gmail.com

On 12 June 2013 11:22, Tim Evans <Tim.Evans@wmciu.nhs.uk> wrote:
> Hi all,
>
> I am using Stata 11.2, and have a cohort of patients diagnosed with a particular type of cancer and have a couple of questions to ask of my data. The first relates to the effects of certain variables (social isolation, deprivation, stage at diagnosis etc) on survival time - this I think I have sorted out. However, one question I have to answer is:
>  'does ethnicity or social isolation have an effect on stage at diagnosis (controlling for the other factors).'
>
> My stage at diagnosis variable has 6 different values, social isolation has 4 different variables and ethnicity has 5 variables. Originally I thought I would be able to model this using -mlogit-, but this appears to me, to only be appropriate when the dependent variable has 3 different values (and mine has more). I naively tried to use -regress- but the dependent variable cannot be a factor variable and the data I have are not categorised in a continuous manner.
>
> Therefore, does anyone have any suggestions on how I might be able to model the effect of one of my variables (ethnicity or social deprivation) on stage at diagnosis?
>
>
> Best wishes
>
> Tim
>
