
# Re: st: Checking to see if the association between two variables is linear or otherwise

From: "JVerkuilen (Gmail)"

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: Checking to see if the association between two variables is linear or otherwise

Date: Fri, 12 Oct 2012 20:36:01 -0400

```On Fri, Oct 12, 2012 at 5:56 PM, Amal Khanolkar <Amal.Khanolkar@ki.se> wrote:

> I'm trying to figure out if linear regression is the appropriate choice for my research question - I would like to analyze the association of BMI and education (BMI is continuous and education categorical). Ideally I would just run a linear regression with BMI as the outcome and education as the principal explanatory variable.
>
> However, my hypothesis is that people with low education are likely to have either a low or a high BMI, i.e. the association between education and BMI is probably more 'U-shaped' than linear.
>
> What is the best way to check if the association between a continuous and categorical variable is linear or otherwise...? Preferably, I would like to be able to plot such a shape using Stata.
>

You are making a prediction that involves both location (i.e., the
mean) and the dispersion (i.e., the variance). There are a number of
models that can accommodate this kind of pattern, but you might want
to consider using simultaneous quantile regression (-sqreg-). I also
did some looking and found -reghv-. This allows you to use
heteroscedasticity covariates in a regression model with
multiplicative variance. Finally, you might consider -betafit-, which
similarly allows the use of heteroscedasticity covariates, though
you'd need to linearly rescale the data to be in the unit interval.

```
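As a rough illustration of the -sqreg- suggestion in the reply above, here is a minimal do-file sketch. The variable names `bmi` and `educ`, the three education levels, and the simulated data are all assumptions made so that the example runs on its own; they are not from the original thread. Factor-variable notation (`i.educ`) with -sqreg- assumes a reasonably recent Stata release; older versions may need the `xi:` prefix instead.

```stata
* Simulated stand-in data: -bmi- and -educ- are hypothetical variable names
clear
set seed 20121012
set obs 900

* Three education levels: 1 = low, 2 = medium, 3 = high
generate byte educ = 1 + mod(_n, 3)
label define educlbl 1 "Low" 2 "Medium" 3 "High"
label values educ educlbl

* Same mean in every group, but a wider spread in the low-education group
generate double bmi = 25 + cond(educ == 1, 6, 3) * rnormal()

* Location and spread of BMI within each education level
tabstat bmi, by(educ) statistics(n mean sd p10 p50 p90)

* Box plots show whether the low-education group fans out in both directions
graph box bmi, over(educ) ytitle("BMI")

* Simultaneous quantile regression at the 10th, 50th and 90th percentiles,
* with bootstrap standard errors (low education is the base category)
sqreg bmi i.educ, quantiles(.1 .5 .9) reps(100)
```

If the low-education group really is more dispersed, the coefficients on the higher education levels would be expected to be positive at the 10th percentile and negative at the 90th, while the median coefficients stay near zero. With real data, replace the simulation lines with your own dataset and variable names.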