Stata: Data Analysis and Statistical Software
   >> Home >> Resources & support >> FAQs >> The Rasch model in Stata

Can Stata estimate a Rasch model?

Title   The Rasch model in Stata
Author Jeroen Weesie, Department of Sociology/ICS, Utrecht University
Date January 1999; minor revisions July 2011

Background

The Rasch model is one of the dominant models for binary items (e.g., success/failure on math problems) in psychometrics. Let y(ij) be the binary response, where i=1,...,n, n = #subjects, j=1, ..., m, m = #items. (Sometimes “unbalanced designs” are used in which subjects respond to only a subset of the items. The description below applies without modification.) The Rasch model can be written as a logit-linear model:

    logit Pr(y(ij)=1 | eta(i)) = eta(i) − theta(j)	

Here eta(i) can be interpreted as a person’s ability parameter and theta(j) as an item-difficulty parameter. Moreover, it is assumed that, conditional on eta(i), the y(i*) are independent (“local independence”). Actually, Rasch, a Danish statistician, gave an axiomatic derivation of the model in the 1960s, in which, next to local independence, the main characterizing properties were that the { y(+j) , y(i+) } form a sufficient statistic for { eta, theta }.

Technical note:   When treating the etas and thetas as parameters (fixed effects), it has long been known that the maximum likelihood estimators are inconsistent in the standard asymptotic setup (n→infinity, m fixed). This may become somewhat intuitive if we observe that with each extra subject we have m extra observations but also one extra eta parameter to estimate. Thus, intuitively, the number of observations and hence, the amount of information per parameter, stay almost fixed with increasing n.

A famous example of inconsistent maximum likelihood estimators with a similar structure is the following. Let y(ij) ~ Normal(mu(i),v2), i=1, ..., n, j=1, 2. The mu(i) parameters and v2 are to be estimated. The maximum likelihood estimator for v2 is

            sum{i,j} (y(ij) − y(i.))2 / (2n) 

with expected value and also with probability limit (1/2)v2, not v2.

In the 1980s, Andersen showed the conditional maximum likelihood (CML) estimator of theta(j), where conditioning occurs on the subject’s scores y(i+), is efficient and is actually asymptotically normal distributed, with all the nice properties of ml-estimators in regular situations. For instance, a conditional likelihood-ratio test has properties analogous to those of the standard likelihood-ratio test. While other estimators have been investigated, the CML estimator is the one most widely used for the fixed-effects case—the etas are treated as parameters (Fischer and Molenaar 1995).

Fitting the Rasch model in Stata

Consider the most probable case; you have a dataset in which the cases refer to subjects, and the responses on m items are stored in m dichotomous variables item_1, ..., item_m. You can obtain CLM estimates of the theta parameters of the Rasch model using the conditional logit fixed-effects estimator in clogit (xtlogit, fe is equivalent to typing clogit). This command requires all responses be stored in separate observations, while a “group” variable is used to identify the observations that belong to the same subject. This can be accomplished with the reshape command. Finally, you can describe the Rasch model as a “clogit model” with m covariates x(ijk), k=1, ..., m, so that for all i, x(ijk) = −1 if j=k, and 0 otherwise. The regression coefficient of x(..k) is just theta(k).

In a conditional logit model, effects of variables that are constant within groups (subjects) are not identified. Since the x(..k) variables are a complete classification, i.e., they sum to −1 in each record, the thetas are not identified. If all x’s are included in the model, Stata deals with this problem automatically by removing one of the x variables. You can also withdraw one of the x variables from the set of predictor variables. The associated theta parameter could be interpreted as being fixed at 0.

The outline for estimating the CML estimates of the theta parameters of the Rasch model follows.

(1) If necessary, transform the data into long format.

(2) Create explanatory variables for the thetas.

(3) Estimate the theta parameters.

The “Bayesian” approach: random effects

It may come as only a little surprise that psychometricians also studied a Gaussian random-effects estimator for the thetas. In Stata, this is obtained by the random-effects panel estimator, as computed by the xtlogit, re command. This command requires the same data organization as clogit.

Example

Consider data on 120 subjects who responded to 10 math problems, coded as 1 (correct) and 0 (incorrect). We want to know whether the 10 math problems involve a one-dimensional scale so that the items can be ordered with respect to difficulty, irrespective of the subjects’ abilities.

 . use http://www.stata.com/support/faqs/dta/raschfaq, clear

 . describe

 Contains data from http://www.stata.com/support/faqs/dta/raschfaq.dta
   obs:           120                          
  vars:            11                          19 May 2005 07:47
  size:         1,920 (99.9% of memory free)
 -------------------------------------------------------------------------------
               storage  display     value
 variable name   type   format      label      variable label
 -------------------------------------------------------------------------------
 math1           byte   %9.0g       Correct    math item 1
 math2           byte   %9.0g       Correct    math item 2
 math3           byte   %9.0g       Correct    math item 3
 math4           byte   %9.0g       Correct    math item 4
 math5           byte   %9.0g       Correct    math item 5
 math6           byte   %9.0g       Correct    math item 6
 math7           byte   %9.0g       Correct    math item 7
 math8           byte   %9.0g       Correct    math item 8
 math9           byte   %9.0g       Correct    math item 9
 math10          byte   %9.0g       Correct    math item 10
 subj_id         int    %9.0g                  
 -------------------------------------------------------------------------------
 Sorted by:  

First, to get some feeling for the data, we describe the items

 . summarize math*
 
     Variable |       Obs        Mean    Std. Dev.       Min        Max
 -------------+--------------------------------------------------------
        math1 |       120    .8083333    .3952626          0          1
        math2 |       120        .775    .4193332          0          1
        math3 |       120    .6833333    .4671266          0          1
        math4 |       120         .55    .4995797          0          1
        math5 |       120    .5333333    .5009794          0          1
 -------------+--------------------------------------------------------
        math6 |       120         .45    .4995797          0          1
        math7 |       120        .425    .4964157          0          1
        math8 |       120          .3     .460179          0          1
        math9 |       120    .2583333    .4395535          0          1
       math10 |       120    .1666667    .3742406          0          1

They are already labeled so that the proportion of students that answer the item correctly decreases. The subjects are aptly described via the total number of correct answers.

 . egen score=rowtotal(math*)

 . tabulate score

       score |      Freq.     Percent        Cum.
 ------------+-----------------------------------
           0 |          3        2.50        2.50
           1 |          5        4.17        6.67
           2 |          4        3.33       10.00
           3 |         16       13.33       23.33
           4 |         21       17.50       40.83
           5 |         24       20.00       60.83
           6 |         22       18.33       79.17
           7 |         13       10.83       90.00
           8 |          4        3.33       93.33
           9 |          6        5.00       98.33
          10 |          2        1.67      100.00
 ------------+-----------------------------------
       Total |        120      100.00

To fit the Rasch model, we first have to reshape the data.

 . reshape long math, i(subj_id) j(item)
 (note: j = 1 2 3 4 5 6 7 8 9 10)
 
 Data                               wide   ->   long
 -----------------------------------------------------------------------------
 Number of obs.                      120   ->    1200
 Number of variables                  12   ->       4
 j variable (10 values)                    ->   item
 xij variables:
                  math1 math2 ... math10   ->   math
 -----------------------------------------------------------------------------

We now have to generate the predictor variables for the thetas:

 . forvalues num =1/10{
      gen Th`num' = -(item==`num')
   }

To illustrate what the data now look like, we list the cases.

 * set format to compress the output of list
 . format math item Th* %4.0f
 
 * sort within subj_id on the identifier item of the math problem.
 . sort subj_id item
 
 * invoke list with options that improve readability.
 . list math item Th* if sub==1, nodisplay noobs nolabel  

   +--------------------------------------------------------------------------+
   | math   item   Th1   Th2   Th3   Th4   Th5   Th6   Th7   Th8   Th9   Th10 |
   |--------------------------------------------------------------------------|
   |    1      1    -1     0     0     0     0     0     0     0     0      0 |
   |    1      2     0    -1     0     0     0     0     0     0     0      0 |
   |    1      3     0     0    -1     0     0     0     0     0     0      0 |
   |    0      4     0     0     0    -1     0     0     0     0     0      0 |
   |    0      5     0     0     0     0    -1     0     0     0     0      0 |
   |--------------------------------------------------------------------------|
   |    0      6     0     0     0     0     0    -1     0     0     0      0 |
   |    0      7     0     0     0     0     0     0    -1     0     0      0 |
   |    0      8     0     0     0     0     0     0     0    -1     0      0 |
   |    0      9     0     0     0     0     0     0     0     0    -1      0 |
   |    0     10     0     0     0     0     0     0     0     0     0     -1 |
   +--------------------------------------------------------------------------+
 
 . list math item Th* if sub==2, nodisplay noobs nolabel 
 
   +--------------------------------------------------------------------------+
   | math   item   Th1   Th2   Th3   Th4   Th5   Th6   Th7   Th8   Th9   Th10 |
   |--------------------------------------------------------------------------|
   |    1      1    -1     0     0     0     0     0     0     0     0      0 |
   |    1      2     0    -1     0     0     0     0     0     0     0      0 |
   |    0      3     0     0    -1     0     0     0     0     0     0      0 |
   |    0      4     0     0     0    -1     0     0     0     0     0      0 |
   |    1      5     0     0     0     0    -1     0     0     0     0      0 |
   |--------------------------------------------------------------------------|
   |    1      6     0     0     0     0     0    -1     0     0     0      0 |
   |    1      7     0     0     0     0     0     0    -1     0     0      0 |
   |    0      8     0     0     0     0     0     0     0    -1     0      0 |
   |    0      9     0     0     0     0     0     0     0     0    -1      0 |
   |    0     10     0     0     0     0     0     0     0     0     0     -1 |
   +--------------------------------------------------------------------------+

Thus, we see that the subject with subj_id=1 has 10 records associated with his response: the score for each item is in variable math. The remaining 10 variables, Th1Th10, which are included in the output, are the independent variables. The Th#s are nearly everywhere 0. We see that the response on the first item (item>==1) is modeled only with the predictor variable Th1, the response on the second item (item>==2) with the second predictor Th2, etc.

We are now ready to request the CML estimates of the theta parameters of the Rasch model using the clogit command:

 . * clm-estimator for Rasch model (etas are fixed effects)
 . clogit math Th2-Th10, group(subj_id)
 note: multiple positive outcomes within groups encountered.
 note: 5 groups (50 obs) dropped due to all positive or
       all negative outcomes.
 
 Iteration 0:   log likelihood = -436.11778  
 Iteration 1:   log likelihood =   -435.352  
 Iteration 2:   log likelihood = -435.35069  
 Iteration 3:   log likelihood = -435.35069  
 
 Conditional (fixed-effects) logistic regression   Number of obs   =       1150
                                                   LR chi2(9)      =     243.80
                                                   Prob > chi2     =     0.0000
 Log likelihood = -435.35069                       Pseudo R2       =     0.2188
 
 ------------------------------------------------------------------------------
         math |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
          Th2 |    .241266   .3481052     0.69   0.488    -.4410077    .9235397
          Th3 |   .7921615   .3318989     2.39   0.017     .1416515    1.442672
          Th4 |   1.450772   .3241392     4.48   0.000     .8154703    2.086073
          Th5 |   1.528206   .3239802     4.72   0.000     .8932166    2.163195
          Th6 |   1.913438   .3253857     5.88   0.000     1.275694    2.551183
          Th7 |   2.030632   .3265377     6.22   0.000     1.390629    2.670634
          Th8 |   2.662249   .3389274     7.85   0.000     1.997964    3.326535
          Th9 |   2.904665   .3467128     8.38   0.000     2.225121     3.58421
         Th10 |    3.55384   .3771324     9.42   0.000     2.814674    4.293006
 ------------------------------------------------------------------------------

We conclude that the first item (Th1) with associated Theta(1)=0 is the easiest item, to which the most subjects responded correctly. Item 10 is the most difficult item, to which the fewest subjects responded correctly. clogit has dropped five subjects from the analysis. These subjects responded either to all items correctly or to all items incorrectly; in a conditional likelihood, these subjects carry no information about the difficulty of the items.

Andersen has suggested a specification test, namely that the difficulty parameters are the same for the “good” subjects and the “poor” subjects, distinguished via their total score. While Andersen subsequently employed a (conditional) likelihood-ratio test, we will proceed by illustrating the new hausman command. The hausman test can be used to compare the theta estimates obtained from the full sample with the theta estimates obtained from the “good” or “bad” students.

 . estimates store RASCH

 . clogit math Th2-Th10 if score<=5, group(subj_id)
(output omitted)

 . estimates store LESS

 . hausman LESS RASCH 

                  ---- Coefficients ----
              |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
              |      LESS        RASCH        Difference          S.E.
 -------------+----------------------------------------------------------------
          Th2 |    .1507434      .241266       -.0905226         .172672
          Th3 |    1.024031     .7921615        .2318693        .1606061
          Th4 |    1.358351     1.450772       -.0924202        .1760475
          Th5 |    1.414312     1.528206       -.1138945        .1773457
          Th6 |    1.822396     1.913438       -.0910427        .1907797
          Th7 |    2.014378     2.030632        -.016254        .2016512
          Th8 |    3.414959     2.662249        .7527093        .3793093
          Th9 |    3.102924     2.904665        .1982584        .3122054
         Th10 |    3.414959      3.55384       -.1388815         .341348
 ------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from clogit
            B = inconsistent under Ha, efficient under Ho; obtained from clogit
     
     Test:  Ho:  difference in coefficients not systematic
 
                   chi2(9) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                           =       11.72
                 Prob>chi2 =      0.2293

We conclude that the difficulty of items does not appear to be different for “poor” subjects. Similarly,

 . clogit math Th2-Th10 if score>=5, group(subj_id)
 (output omitted)
    
 . estimates store MORE

 . hausman MORE RASCH 
 
                  ---- Coefficients ----
              |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
              |      MORE        RASCH        Difference          S.E.
 -------------+----------------------------------------------------------------
          Th2 |     .146233      .241266       -.0950331        .4148311
          Th3 |    .5045323     .7921615       -.2876292        .3871818
          Th4 |    1.307562     1.450772       -.1432097        .3401682
          Th5 |    1.241154     1.528206       -.2870522         .343086
          Th6 |    1.907459     1.913438       -.0059791        .3258093
          Th7 |     1.73504     2.030632       -.2955911        .3263267
          Th8 |    2.250694     2.662249       -.4115551        .3131327
          Th9 |    2.610758     2.904665       -.2939078        .3127668
         Th10 |    3.488208      3.55384       -.0656323        .3271359
 ------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from clogit
            B = inconsistent under Ha, efficient under Ho; obtained from clogit
 
     Test:  Ho:  difference in coefficients not systematic
 
                   chi2(9) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                           =        8.88
                 Prob>chi2 =      0.4483

and again we find no significant differences in the item difficulties.

Finally, we can also obtain the estimates for the thetas if we assume that the etas are random effects.

 . * xtlogit estimator for Rasch (etas are Gaussian random effects)
 . xtset subj_id
 . xtlogit math Th1-Th10, noconstant re
 (iteration log omitted)

 Random-effects logistic regression              Number of obs      =      1200
 Group variable (i): subj_id                     Number of groups   =       120
 
 Random effects u_i ~ Gaussian                   Obs per group: min =        10
                                                                avg =      10.0
                                                                max =        10
 
                                                 Wald chi2(10)      =    183.95
 Log likelihood  = -695.52405                    Prob > chi2        =    0.0000
 
 ------------------------------------------------------------------------------
         math |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
          Th1 |  -1.669049   .2640865    -6.32   0.000    -2.186649   -1.151449
          Th2 |  -1.439218   .2510217    -5.73   0.000    -1.931211   -.9472242
          Th3 |  -.9006992   .2289021    -3.93   0.000    -1.349339   -.4520594
          Th4 |  -.2355981   .2161706    -1.09   0.276    -.6592846    .1880884
          Th5 |  -.1566347   .2156563    -0.73   0.468    -.5793133    .2660439
          Th6 |   .2365609   .2161699     1.09   0.274    -.1871243    .6602462
          Th7 |   .3559474   .2173414     1.64   0.101    -.0700339    .7819286
          Th8 |   .9920522   .2318226     4.28   0.000     .5376883    1.446416
          Th9 |   1.231196   .2410622     5.11   0.000     .7587231    1.703669
         Th10 |   1.860508   .2766864     6.72   0.000     1.318212    2.402803
 -------------+----------------------------------------------------------------
     /lnsig2u |  -.1718778   .2528068                     -.6673701    .3236145
 -------------+----------------------------------------------------------------
      sigma_u |   .9176503   .1159941                      .7162793    1.175634
          rho |   .2037977   .0410215                      .1349109    .2958304
 ------------------------------------------------------------------------------
 Likelihood-ratio test of rho=0: chibar2(01) =    55.65 Prob >= chibar2 = 0.000

Here the order of difficulties of the items is not affected by moving from a fixed-effects to a random-effects estimator for the thetas.

A command for Rasch

It is important to carry out specification tests of the model to analyze whether the model fits the data. In the literature, such tests have been suggested and are indeed commonly applied by psychometricians. Some of these tests, as proposed by Andersen, are (conditional) likelihood-ratio tests for the hypotheses that the thetas are the same for groups of subjects formed by an exogenous variable (Do men and women behave the same way?) and by distinguishing groups via the total score: are the difficulty parameters the same for the “smart” subjects (those for whom y(i+) is relatively high) as those for the “dumb” subjects (those with a low total score for y(i+))? CLM’s condition is on y(i+), so it is OK to distinguish cases using y(i+). Above we have applied Hausman tests for this purpose.

Another purpose of a Rasch analysis is to estimate the subject parameter eta. In the fixed effects approach, the etas are commonly estimated by maximum likelihood conditional on the CLM theta-estimates. For the random-effects case, the etas are commonly estimated by posterior means.

Reference

Fischer, H. G. and I. W. Molenaar. 1995.
Rasch Models. Foundations, Recent Developments and Applications. New York: Springer.
Bookmark and Share 
FAQs
What's new?
Statistics
Data management
Graphics
Programming Stata
Mata
Resources
Internet capabilities
Stata for Windows
Stata for Unix
Stata for Mac
Technical support
Like us on Facebook Follow us on Twitter Follow us on LinkedIn Google+ Watch us on YouTube
Follow us
© Copyright 1996–2013 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index   |   View mobile site