# st: How to simulate data from ordered probit --abnormality of results?

| | |
|---|---|
| From | Nblunch@worldbank.org |
| To | statalist@hsphsun2.harvard.edu |
| Subject | st: How to simulate data from ordered probit --abnormality of results? |
| Date | Sun, 30 May 2004 16:03:12 -0400 |

```

Dear Statalisters,

As part of a larger simulation exercise, I am starting out with simple probits
and ordered probits (I have never tried this before and therefore wanted to get
the basics right before moving on to something more complicated).

While the results for the ordered probit are largely consistent with the data
process that I specified, in the sense that the specified parameter values
(except for one, namely the one for X1 below) fall within the estimated 95
percent confidence intervals, I still thought the estimates were a bit "off"
compared to what I had expected.  I realize that, due to the (pseudo) randomness
of the variables, the results would likely deviate a bit.

As a newbie in this arena I wonder if these results are "normal" -- or am I
doing something wrong?  For example, the standard errors reported for the cut
points are much larger than the standard deviations I specified and, again,
while the specified parameter values mostly fall within the 95 percent
confidence intervals, the estimates do seem a bit "off" compared to the values
I specified...?  I realize that the fit is not that good; maybe that is part of
the problem?  If so, how do I go about ensuring a good (but not too good, since
then observations will drop out!!) fit of the regression?

Your help and insights will be greatly appreciated -- Thanks!!

Cheers,

Niels-Hugo

Here follow the relevant parts of the log-file:

. /* Create the residual (u) and the X's: */
.
. matrix m =   (0, 31.4, 10.4, 6.85, 10.8, 8.5, 6.4)

. matrix sdm = (1, 7.3, 4.7, 1.13, 2.4, 0.2, 0.3)

. drawnorm u X1 X2 X3 X4 X5 X6, n(3000) seed(19712004) means(m) sds(sdm)
(obs 3000)

. /* Create the cut points: */
.
. matrix cp = (0.77, 1.29)

. matrix sdcp = (0.1, 0.2)

. drawnorm Cut1 Cut2, n(3000) seed(19712004) means(cp) sds(sdcp)

. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           u |      3000   -.0027312    1.006306  -3.117829     3.4684
          X1 |      3000    31.18638    7.139536   5.131144   55.13468
          X2 |      3000    10.37326    4.799704  -7.207477   29.47523
          X3 |      3000    6.857711    1.119025   2.623326    11.1235
          X4 |      3000    10.82347    2.468898   1.783469    18.8689
-------------+--------------------------------------------------------
          X5 |      3000    8.505496    .1961301   7.764948   9.130346
          X6 |      3000    6.387869    .3051463    5.26484   7.604205
        Cut1 |      3000    .7697269    .1006306   .4582171    1.11684
        Cut2 |      3000    1.284147    .1956037   .5703053   1.940265

.
. /* Specify the regression parameters: */
.
. matrix betas = (-.005, -.04, .14, 0.03, 0.01, 0.07)

. matrix colnames betas = X1 X2 X3 X4 X5 X6

.
. /* Generate the dependent variable: */
.
. matrix score z = betas

. gen y = 0 if z + u <= Cut1
(2082 missing values generated)

. replace y = 1 if z + u > Cut1 & z + u <= Cut2

. replace y = 2 if z + u > Cut2

.
. tab y

          y |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        918       30.60       30.60
          1 |        652       21.73       52.33
          2 |      1,430       47.67      100.00
------------+-----------------------------------
      Total |      3,000      100.00

.
. oprobit y X1 X2 X3 X4 X5 X6

Iteration 0:   log likelihood = -3141.7719
Iteration 1:   log likelihood = -3036.6214
Iteration 2:   log likelihood =  -3036.493
Iteration 3:   log likelihood =  -3036.493

Ordered probit estimates                          Number of obs   =       3000
                                                  LR chi2(6)      =     210.56
                                                  Prob > chi2     =     0.0000
Log likelihood =  -3036.493                       Pseudo R2       =     0.0335

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          X1 |  -.0202611   .0030312    -6.68   0.000     -.026202   -.0143201
          X2 |  -.0461186   .0045333   -10.17   0.000    -.0550036   -.0372336
          X3 |   .1245861   .0192731     6.46   0.000     .0868115    .1623606
          X4 |   .0362435   .0086893     4.17   0.000     .0192128    .0532741
          X5 |  -.1937575    .108436    -1.79   0.074    -.4062881    .0187731
          X6 |   .0422267   .0701143     0.60   0.547    -.0951948    .1796481
-------------+----------------------------------------------------------------
       _cut1 |  -1.776959   1.053154          (Ancillary parameters)
       _cut2 |  -1.184145   1.053026
------------------------------------------------------------------------------

.
```
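
One point in the log above may be worth checking before judging the estimates. Both `drawnorm` calls use `seed(19712004)`, and the `summarize` output is consistent with the second call reusing the draws of the first: the standard deviation of `Cut1` (.1006306) is 0.1 times that of `u` (1.006306), and the mean of `Cut2` (1.284147) matches 1.29 + 0.2 × (31.18638 − 31.4)/7.3. If that reading is right, `Cut1` moves with `u` and `Cut2` moves with `X1`, which could pull the `X1` coefficient away from −0.005. Separately, the ordered probit model treats the cut points as constants, and the standard errors reported for `_cut1` and `_cut2` describe sampling uncertainty in those estimates, not the spread set in `sdcp`. The following is a minimal sketch of the same design with fixed cut points, not a definitive fix. It assumes a Stata version that provides `rnormal()`; `ystar` is a variable name introduced here for illustration, and all parameter values are copied from the log.

```stata
clear
set seed 19712004
set obs 3000

// Regressors: same means and standard deviations as in the log
matrix m   = (31.4, 10.4, 6.85, 10.8, 8.5, 6.4)
matrix sdm = (7.3, 4.7, 1.13, 2.4, 0.2, 0.3)
drawnorm X1 X2 X3 X4 X5 X6, means(m) sds(sdm)

// Standard normal error, drawn after the regressors in the same random stream
generate u = rnormal()

// Linear index z = X*beta, using the coefficients from the log
matrix betas = (-.005, -.04, .14, 0.03, 0.01, 0.07)
matrix colnames betas = X1 X2 X3 X4 X5 X6
matrix score z = betas

// Latent variable (ystar is an illustrative name, not from the original post)
generate ystar = z + u

// Fixed cut points 0.77 and 1.29 (assumption: constants, not random draws)
generate y = 0
replace  y = 1 if ystar > 0.77
replace  y = 2 if ystar > 1.29

tabulate y
oprobit y X1 X2 X3 X4 X5 X6
```

Under this design the estimated coefficients and cut points can be compared directly with `betas`, 0.77, and 1.29. Because `X5` and `X6` vary very little (standard deviations of 0.2 and 0.3), their coefficients and the cut points are likely to remain imprecisely estimated even here.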