Richard,

Good points. Fortunately, the variables that Ebru is asking about are
potential predictor variables.

David Hoaglin

On Sun, Aug 5, 2012 at 1:52 PM, Richard Williams
<richardwilliams.ndu@gmail.com> wrote:
> At 10:57 AM 8/5/2012, David Hoaglin wrote:
>>
>> Dear Ebru,
>>
>> People often analyze data from Likert scales as equally spaced, so you
>> can use each of the eight items in your model as a numerical variable,
>> with values 1 to 4. You simply need to be aware that you are treating
>> the four categories as equally spaced.
>
> It is a debatable practice though. Consider the following (warm has 4
> values):
>
> use "http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta", clear
> reg warm yr89 male white age ed prst
> rvfplot
>
> According to the reference manual discussion of rvfplot, "In a well-fitted
> model, there should be no pattern to the residuals plotted against the
> fitted values...Any pattern whatsoever indicates a violation of the
> least-squares assumptions."
>
> Clearly, there is a pattern in the above rvfplot, i.e. you get 4 parallel
> straight lines. Further, it isn't unique to this example; any 4 category
> dependent variable will show the same thing.
>
> In fairness, if your dependent variable had 17 possible values, you would
> have 17 straight lines -- but your eye probably wouldn't detect that because
> everything would seem so cluttered. There is probably some point where there
> are enough possible values that violations of OLS assumptions aren't
> important, but I would be hesitant to say that point is met with a DV that
> only has 4 categories.
>
>> Earlier you asked about centering those variables. Centering will do
>> no harm. As far as the model is concerned, it affects only the
>> definition of the intercept. If you do decide to "center" the
>> variables, you may want to use one of the four values.
>> If the data on an item are not concentrated at one end, you could use
>> 2 or 3 or perhaps 2.5 as the centering constant. (In a 5-point Likert
>> scale with a neutral category at 3, using 3 would often be a
>> reasonable choice.)
>>
>> When you have the results from the model with the eight separate
>> items, you may want to see whether the coefficients for the four items
>> within a heading are similar. If they are, and it makes sense, you
>> could consider replacing those four items with their sum (or average)
>> --- a composite score.
>>
>> David Hoaglin
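
The centering point in the quoted message can be checked with a short Stata sketch. This is only an illustration under stated assumptions: the auto dataset that ships with Stata stands in for Ebru's data, and rep78 (a repair rating coded 1 to 5) stands in for a Likert item. Neither comes from the thread, and the centering constant 3 is chosen purely for the example.

```stata
* Sketch: centering a predictor changes only the intercept.
* Assumption: auto.dta and rep78 are stand-ins, not data from the thread.
sysuse auto, clear

* Uncentered fit: the intercept is the predicted price at rep78 == 0,
* a value that never occurs in the data.
regress price rep78
scalar b_raw = _b[rep78]
scalar a_raw = _b[_cons]

* Center at 3 (one of the observed values) and refit.
generate rep78_c = rep78 - 3
regress price rep78_c

* The slope should match the uncentered slope, and the new intercept
* should equal a_raw + 3*b_raw, the predicted price at rep78 == 3.
display "slope, raw:            " b_raw
display "slope, centered:       " _b[rep78_c]
display "intercept, centered:   " _b[_cons]
display "a_raw + 3*b_raw:       " (a_raw + 3*b_raw)
```

The two slopes should agree and the last two displayed values should agree, which is all that "it affects only the definition of the intercept" claims. The same check should carry over to a model with several items centered at once, since each shift is absorbed into the constant.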

