st: Multilevel Models with Skewed Outcome using -runmlwin-


From   "Emmott, Emily" <emily.emmott.10@ucl.ac.uk>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   st: Multilevel Models with Skewed Outcome using -runmlwin-
Date   Wed, 5 Sep 2012 19:23:15 +0000

Hello List,

 I am currently working with a large dataset containing children's school test scores from multiple occasions (i.e., tests nested within children). The test scores were recorded as integers ranging from 0 to 15.

 The problem I have is that the scores are negatively skewed:

. sum TestScore, de

                         Test Score
-------------------------------------------------------------
      Percentiles      Smallest
 1%            2              0
 5%            5              0
10%            6              0       Obs                7619
25%            8              0       Sum of Wgt.        7619

50%           11                      Mean           10.28376
                        Largest       Std. Dev.      2.985888
75%           13             15
90%           14             15       Variance       8.915529
95%           14             15       Skewness      -.5706405
99%           15             15       Kurtosis       3.080436


 I have tried transforming the scores (ln, sqrt, etc.), but none of the transformations brings the distribution close to normality.
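
 A note on the transformations mentioned above: ln and sqrt pull in a long right tail, so applied directly to a negatively skewed score they tend to make the skew worse rather than better. One common workaround is to reflect the score first so that the long tail points to the right. The following is only a sketch of that idea, not something taken from the original analysis: the variable name Reflected is hypothetical, and the constant 16 assumes the 0 to 15 range described above.

```stata
* Sketch only: reflect the 0-15 score so the long tail points right.
* "Reflected" is a hypothetical variable name; 16 = maximum score + 1,
* so the reflected score ranges from 1 to 16 and is strictly positive.
generate Reflected = 16 - TestScore

* Reflection flips the sign of the skewness but keeps its magnitude
summarize Reflected, detail

* Compare ladder-of-powers transformations of the reflected score
ladder Reflected

* Joint skewness/kurtosis test on the untransformed score, for reference
sktest TestScore
```

 With only 16 distinct integer values and a ceiling at 15, no transformation is likely to make the scores look exactly normal, so the output is probably best read as a comparison between candidates rather than as a pass/fail check.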

 I have been advised that skewness is less problematic in multilevel models, and that as long as the outcome does not display excess kurtosis it should be fine to fit a multilevel normal regression model. However, I have not been able to find any papers to support this, so I am not sure whether I can fully trust the advice.

 I am currently fitting the models with the -runmlwin- command, using MCMC estimation.

 I have tried two methods. The first is simply to ignore the skew and fit a multilevel normal regression. The second is to group the test scores into 4 categories and fit a multilevel ordered regression model. (Both use -runmlwin- with MCMC.)
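
 For concreteness, the two approaches could be set up roughly as follows. This is a sketch under assumptions rather than the exact code used: childid, occasion and x1 are hypothetical names for the child identifier, the test-occasion identifier and a predictor; the cut points for the four categories are purely illustrative; and the MLwiN path is a placeholder that must point to the local MLwiN installation.

```stata
* Sketch only. childid, occasion and x1 are hypothetical variable names.
* runmlwin needs to know where MLwiN is installed; this path is a
* placeholder and differs between installations.
global MLwiN_path "C:\Program Files\MLwiN v2.26\mlwin.exe"

* runmlwin expects the data sorted by the level identifiers
sort childid occasion
generate cons = 1

* Method 1: two-level normal model, test occasions nested in children.
* Fit by IGLS first, then use those estimates as MCMC starting values.
runmlwin TestScore cons x1, level2(childid: cons) level1(occasion: cons) nopause
runmlwin TestScore cons x1, level2(childid: cons) level1(occasion: cons) mcmc(on) initsprevious nopause

* Method 2, first step: collapse the score into four ordered categories.
* These cut points are illustrative, not those used in the analysis.
recode TestScore (0/5 = 1) (6/8 = 2) (9/11 = 3) (12/15 = 4), generate(TestCat)
tabulate TestCat
```

 The ordered model would then be fitted to TestCat through the discrete() option of -runmlwin-. Its exact specification is left out here because it depends on the chosen reference category and on which coefficients are held common across categories.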

 In both cases the results are very similar: the direction of the effects and whether each predictor is significant at the p<0.05 level are practically the same across both methods. This made me think it may be OK to keep the analysis as a multilevel normal regression, as the standard errors do not seem to be inflated in the normal regression (I had read somewhere that skewed outcomes inflate standard errors and thus Type I error?).

 So, my question is, is it ok in my situation to carry out a multilevel normal regression on the test scores despite the skew?

 I also have an additional question: does using MCMC estimation make a difference? I suspected that MCMC may produce relatively accurate estimates despite the skewed outcome.

Thank you in advance,
Emily Emmott

Department of Anthropology
University College London


