**[R] rologit** -- Rank-ordered logistic regression

__Syntax__

**rologit** *depvar* *indepvars* [*if*] [*in*] [*weight*]**,** __gr__**oup(***varname***)** [*options*]

*options* Description
-------------------------------------------------------------------------
Model
* __gr__**oup(***varname***)** identifier variable that links the alternatives
__off__**set(***varname***)** include *varname* in model with coefficient constrained to 1
__inc__**omplete(***#***)** use *#* to code unranked alternatives; default is **incomplete(0)**
__rev__**erse** reverse the preference order
__note__**strhs** keep right-hand-side variables that do not vary within group
**ties(***spec***)** method to handle ties: **exactm**, **breslow**, **efron**, or **none**

SE/Robust
**vce(***vcetype***)** *vcetype* may be **oim**, __r__**obust**, __cl__**uster** *clustvar*, __boot__**strap**, or __jack__**knife**

Reporting
__l__**evel(***#***)** set confidence level; default is **level(95)**
*display_options* control columns and column formats, row spacing, line width, display of omitted variables and base and empty cells, and factor-variable labeling

Maximization
*maximize_options* control the maximization process; seldom used

__coefl__**egend** display legend instead of statistics
-------------------------------------------------------------------------
* **group(***varname***)** is required.
*indepvars* may contain factor variables; see fvvarlist.
**bootstrap**, **by**, **fp**, **jackknife**, **rolling**, and **statsby** are allowed; see
prefix.
Weights are not allowed with the **bootstrap** prefix.
**fweight**s, **iweight**s, and **pweight**s are allowed, except with **ties(efron)**;
see weight.
**coeflegend** does not appear in the dialog box.
See **[R] rologit postestimation** for features available after estimation.

__Menu__

**Statistics > Ordinal outcomes > Rank-ordered logistic regression**

__Description__

**rologit** fits the rank-ordered logistic regression model by maximum
likelihood (Beggs, Cardell, and Hausman 1981). This model is also known
as the Plackett-Luce model (Marden 1995), as the exploded logit model
(Punj and Staelin 1978), and as the choice-based method of conjoint
analysis (Hair et al. 2010).

**rologit** expects the data to be in long form, similar to **clogit**, in which
each of the ranked alternatives forms an observation; all observations
related to an individual are linked together by the variable that you
specify in the **group()** option. The distinction from **clogit** is that
*depvar* in **rologit** records the rankings of the alternatives, whereas for
**clogit**, *depvar* marks only the best alternative by a value not equal to
zero. **rologit** interprets equal scores of *depvar* as ties. The ranking
information may be incomplete "at the bottom" (least preferred
alternatives). That is, unranked alternatives may be coded as 0 or as a
common value that may be specified with the **incomplete()** option.

If your data record only the single most preferred alternative for each
individual, **rologit** fits the same model as **clogit**.
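
The equivalence with **clogit** can be checked with a minimal sketch such as the one below. It assumes that the **rologitxmpl2** dataset used in the Examples section contains **caseid**, **depvar**, **x1**, and **x2**, and that each subject has a unique most preferred alternative. The variables **top** and **best** are hypothetical names introduced only for this illustration.

```stata
* Sketch: keep only the information about the best alternative
webuse rologitxmpl2, clear
egen top = max(depvar), by(caseid)
generate byte best = (depvar == top) if !missing(depvar)  // 1 = most preferred

* 0 is read as "unranked" because incomplete(0) is the default
rologit best x1 x2, group(caseid)
clogit  best x1 x2, group(caseid)
```

Under these assumptions, the two commands should report the same coefficients.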

__Options__

+-------+
----+ Model +------------------------------------------------------------

**group(***varname***)** is required, and it specifies the identifier variable
(numeric or string) that links the alternatives for an individual,
which have been compared and rank ordered with respect to one
another.

**offset(***varname***)**; see **[R] estimation options**.

**incomplete(***#***)** specifies the numeric value used to code alternatives that
are not ranked. It is assumed that unranked alternatives are less
preferred than the ranked alternatives (that is, the data record the
ranking of the most preferred alternatives). It is not assumed that
subjects are indifferent between the unranked alternatives. *#*
defaults to 0.
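
As an illustration, suppose unranked alternatives were coded 99 rather than 0. The sketch below creates such a coding from the example data; **rank99** is a hypothetical variable, and the dataset is assumed to be the one used in the Examples section.

```stata
webuse rologitxmpl2, clear
* rank99 (hypothetical): same ranking, but unranked alternatives coded 99
generate rank99 = cond(depvar == 0, 99, depvar)
rologit rank99 x1 x2, group(caseid) incomplete(99)
```

Because only the code for unranked alternatives changes, this fit should match the one obtained from **depvar** with the default **incomplete(0)**.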

**reverse** specifies that in the preference order, a higher number means a
less attractive alternative. The default is that higher values
indicate more attractive alternatives. The rank-ordered logit model
is not symmetric: reversing the ordering does not simply change the
signs of the coefficients.
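
A common case for **reverse** is a ranking in which 1 marks the most preferred alternative. The sketch below builds such a variable from the example data; **top** and **rank1** are hypothetical names. It assumes that the **incomplete()** code (here 0) keeps its meaning when **reverse** is specified.

```stata
webuse rologitxmpl2, clear
egen top = max(depvar), by(caseid)
* rank1 (hypothetical): 1 = most preferred; unranked alternatives stay 0
generate rank1 = cond(depvar == 0, 0, top + 1 - depvar)
rologit rank1 x1 x2, group(caseid) reverse
```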

**notestrhs** suppresses the test that the independent variables vary within
(at least some of) the groups. Effects of variables that are always
constant are not identified. For instance, a rater's gender cannot
directly affect his or her rankings; it could affect the rankings
only via an interaction with a variable that does vary over
alternatives.

**ties(***spec***)** specifies the method for handling ties (indifference between
alternatives) (see **[ST] stcox** for details):

__ex__**actm** exact marginal likelihood (default)
__bre__**slow** Breslow's method (default if **pweight**s specified)
__efr__**on** Efron's method (default if robust VCE)
**none** no ties allowed
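
To see how much the choice of method matters in a given application, the model can be refit under each method and the results placed side by side. This is a sketch that assumes the example dataset from the Examples section; the stored-estimate names are arbitrary.

```stata
webuse rologitxmpl2, clear
rologit depvar x1 x2, group(caseid) ties(exactm)
estimates store exactm
rologit depvar x1 x2, group(caseid) ties(breslow)
estimates store breslow
rologit depvar x1 x2, group(caseid) ties(efron)
estimates store efron

* Coefficients and standard errors for the three methods
estimates table exactm breslow efron, b se
```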

+-----------+
----+ SE/Robust +--------------------------------------------------------

**vce(***vcetype***)** specifies the type of standard error reported, which
includes types that are derived from asymptotic theory (**oim**), that
are robust to some kinds of misspecification (**robust**), that allow for
intragroup correlation (**cluster** *clustvar*), and that use bootstrap or
jackknife methods (**bootstrap**, **jackknife**); see **[R]** *vce_option*.

If **ties(exactm)** is specified, *vcetype* may be only **oim**, **bootstrap**, or
**jackknife**.
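
A minimal sketch of requesting robust standard errors, assuming the example dataset from the Examples section. Because **ties(exactm)** cannot be combined with **vce(robust)**, no **ties()** option is given and the method used is read back from **e(ties)**.

```stata
webuse rologitxmpl2, clear
rologit depvar x1 x2, group(caseid) vce(robust)
* Show which tie-handling method was applied
display "ties method used: `e(ties)'"
```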

+-----------+
----+ Reporting +--------------------------------------------------------

**level(***#***)**; see **[R] estimation options**.

*display_options*: **noci**, __nopv__**alues**, __noomit__**ted**, **vsquish**, __noempty__**cells**,
__base__**levels**, __allbase__**levels**, __nofvlab__**el**, **fvwrap(***#***)**, **fvwrapon(***style***)**,
**cformat(***%fmt***)**, **pformat(%***fmt***)**, **sformat(%***fmt***)**, and **nolstretch**; see **[R]**
**estimation options**.

+--------------+
----+ Maximization +-----------------------------------------------------

*maximize_options*: __iter__**ate(***#***)**, __tr__**ace**, [__no__]__lo__**g**, __tol__**erance(***#***)**,
__ltol__**erance(***#***)**, __nrtol__**erance(***#***)**, and __nonrtol__**erance**; see **[R] maximize**.
These options are seldom used.

The following option is available with **rologit** but is not shown in the
dialog box:

**coeflegend**; see **[R] estimation options**.

__Examples__

You have data in which subjects ranked up to four options. **rologit**
requires that the data be in "long format", in which the responses of one
subject are recorded in different records (observations).

caseid depvar option x1 x2 male
1 4 1 1 0 0
1 2 2 0 1 0
1 3 3 0 0 0
1 1 4 1 1 0

2 1 1 3 0 0
2 3 2 0 1 0
2 3 3 2 1 0
2 4 4 1 2 0

3 1 1 3 1 1
3 3 2 1 1 1
3 4 4 0 1 1

4 2 1 1 1 1
4 1 2 1 1 1
4 0 3 0 1 1
4 0 4 1 0 1

where 0 indicates that subject 4 specified only his two most preferred
alternatives. In this example,

subject 1 has ranking

option_1 > option_3 > option_2 > option_4

subject 2 has a ranking with ties,

option_4 > option_2 == option_3 > option_1

subject 3 ranked a subset of alternatives, ignoring option 3,

option_4 > option_2 > option_1

subject 4 had an incomplete ranking

option_1 > option_2 > (option_3,option_4)

Subject 4 ranked option_1 highest among all four options, and ranked
option_2 highest among the remaining three options. His preference
ordering among option_3 and option_4, however, is not known.
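
The listing above can also be typed in directly, which is convenient for experimenting with the coding rules. The sketch below only reproduces the 15 observations shown; if you enter the data this way, add the **clear** option to any later **webuse** command.

```stata
clear
* Long-form data: one observation per subject and ranked option
input caseid depvar option x1 x2 male
1 4 1 1 0 0
1 2 2 0 1 0
1 3 3 0 0 0
1 1 4 1 1 0
2 1 1 3 0 0
2 3 2 0 1 0
2 3 3 2 1 0
2 4 4 1 2 0
3 1 1 3 1 1
3 3 2 1 1 1
3 4 4 0 1 1
4 2 1 1 1 1
4 1 2 1 1 1
4 0 3 0 1 1
4 0 4 1 0 1
end
list, sepby(caseid) noobs
```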

**. webuse rologitxmpl2**

You can fit a rank-ordered logit model for the four alternatives as

**. rologit depvar x1 x2, group(caseid)**

More complicated models may be formulated as well. We can perform a
likelihood-ratio test that men and women rank the options in the same way
(note that the main effect of gender is not identified),

**. estimates store base**
**. rologit depvar x1 x2 male#c.x1 male#c.x2, group(caseid)**
**. estimates store full**
**. lrtest base full**
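
A Wald test of the same hypothesis is a possible alternative that needs only the full model. This is a sketch rather than part of the original example; it assumes the same dataset and uses **testparm** on the two interaction terms.

```stata
webuse rologitxmpl2, clear
rologit depvar x1 x2 male#c.x1 male#c.x2, group(caseid)
* Joint test that the gender interactions are zero
testparm i.male#c.x1 i.male#c.x2
```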

__A note on data organization__

Sometimes your data will be in a "wide format" in which the rankings of
the options are recorded in a series of variables rather than in separate
observations for each subject.

caseid opt1 opt2 opt3 opt4
1 4 2 3 1
2 1 3 3 4
3 1 3 . 4
4 2 1 0 0

You may want to verify that this information is identical to the data in
long format listed above. The Stata command **reshape** makes the
transformation between "long" and "wide" formats quite simple,

**. reshape long opt, i(caseid) j(option)**
**. drop if missing(opt)**
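
The full round trip can be sketched as follows, starting from the wide listing above. Renaming **opt** to **depvar** is an extra step added here so that the result uses the same variable names as the long-form listing.

```stata
clear
* Wide-form data: one observation per subject
input caseid opt1 opt2 opt3 opt4
1 4 2 3 1
2 1 3 3 4
3 1 3 . 4
4 2 1 0 0
end

reshape long opt, i(caseid) j(option)
drop if missing(opt)        // subject 3 did not consider option 3
rename opt depvar
list, sepby(caseid) noobs
```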

__Stored results__

**rologit** stores the following in **e()**:

Scalars
**e(N)** number of observations
**e(ll_0)** log likelihood of the null model ("all rankings are
equiprobable")
**e(ll)** log likelihood
**e(df_m)** model degrees of freedom
**e(chi2)** chi-squared
**e(p)** p-value for model test
**e(r2_p)** pseudo-R^2
**e(N_g)** number of groups
**e(g_min)** minimum group size
**e(g_avg)** average group size
**e(g_max)** maximum group size
**e(code_inc)** value for incomplete preferences
**e(N_clust)** number of clusters
**e(rank)** rank of **e(V)**
**e(converged)** **1** if converged, **0** otherwise

Macros
**e(cmd)** **rologit**
**e(cmdline)** command as typed
**e(depvar)** name of dependent variable
**e(group)** name of **group()** variable
**e(wtype)** weight type
**e(wexp)** weight expression
**e(title)** title in estimation output
**e(clustvar)** name of cluster variable
**e(offset)** linear offset variable
**e(chi2type)** **Wald** or **LR**; type of model chi-squared test
**e(reverse)** **reverse**, if specified
**e(ties)** **breslow**, **efron**, **exactm**
**e(vce)** *vcetype* specified in **vce()**
**e(vcetype)** title used to label Std. Err.
**e(properties)** **b V**
**e(predict)** program used to implement **predict**
**e(marginsok)** predictions allowed by **margins**
**e(marginsnotok)** predictions disallowed by **margins**
**e(marginsdefault)** default **predict()** specification for **margins**
**e(asbalanced)** factor variables **fvset** as **asbalanced**
**e(asobserved)** factor variables **fvset** as **asobserved**

Matrices
**e(b)** coefficient vector
**e(V)** variance-covariance matrix of the estimators
**e(V_modelbased)** model-based variance

Functions
**e(sample)** marks estimation sample
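
Stored results can be inspected after estimation. The sketch below assumes the example dataset from the Examples section and uses only results listed above.

```stata
webuse rologitxmpl2, clear
rologit depvar x1 x2, group(caseid)

ereturn list                                   // everything stored in e()
display "groups: " e(N_g) "   log likelihood: " e(ll)
display "ties method: `e(ties)'"
matrix list e(b)                               // coefficient vector
count if e(sample)                             // observations used in the fit
```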

__References__

Beggs, S., S. Cardell, and J. A. Hausman. 1981. Assessing the potential
demand for electric cars. *Journal of Econometrics* 17: 1-19.

Hair, J. F., Jr., W. C. Black, B. J. Babin, and R. E. Anderson. 2010.
*Multivariate Data Analysis*. 7th ed. Upper Saddle River, NJ: Pearson.

Marden, J. I. 1995. *Analyzing and Modeling Rank Data*. London: Chapman &
Hall.

Punj, G. N., and R. Staelin. 1978. The choice process for graduate
business schools. *Journal of Marketing Research* 15: 588-598.