Title
[MV] ca postestimation -- Postestimation tools for ca and camat
Description
The following postestimation commands are of special interest after ca
and camat:
  command               description
-------------------------------------------------------------------------
  cabiplot              biplot of row and column points
  caprojection          CA dimension projection plot
  estat coordinates     display row and column coordinates
  estat distances       display chi-squared distances between row and
                          column profiles
  estat inertia         display inertia contributions of the individual
                          cells
  estat loadings        display correlations of profiles and axes
                          ("loadings")
  estat profiles        display row and column profiles
+ estat summarize       estimation sample summary
  estat table           display fitted correspondence table
  screeplot             plot singular values
-------------------------------------------------------------------------
+ estat summarize is not available after camat.
The following standard postestimation commands are also available:
  command               description
-------------------------------------------------------------------------
* estimates             cataloging estimation results
+ predict               fitted values, row coordinates, or column
                          coordinates
-------------------------------------------------------------------------
* All estimates subcommands except table and stats are available.
+ predict is not available after camat.
Special-interest postestimation commands
cabiplot produces a plot of the row points or column points, or a biplot
of the row and column points. In this plot, the (Euclidean) distances
between row (column) points approximate the chi-squared distances
between the associated row (column) profiles if the CA is properly
normalized. Similarly, the association between a row and column point is
approximated by the inner product of vectors from the origin to the
respective points (see [MV] ca).
caprojection produces a line plot of the row and column coordinates. The
goal of this graph is to show the ordering of row and column categories
on each principal dimension of the analysis. Each principal dimension is
represented by a vertical line; markers are plotted on the lines where
the row and column categories project onto the dimensions.
estat coordinates displays the row and column coordinates.
estat distances displays the chi-squared distances between the row
profiles and between the column profiles. It also displays the
chi-squared distances from the row and column profiles to their
respective centers (marginal distributions). Optionally, the fitted
profiles rather than the observed profiles are used.
estat inertia displays the inertia (chi2/N) contributions of the
individual cells.
estat loadings displays the correlations of the row and column profiles
and the axes, comparable to the loadings of principal component analysis.
estat profiles displays the row and column profiles. A row (column)
profile is the conditional distribution of the column (row) categories
within a given row (column), i.e., that row (column) of the table
normalized to sum to 1. This is equivalent to specifying the row and
column options with the tabulate command.
estat summarize displays summary information about the row and column
variables over the estimation sample.
estat table displays the fitted correspondence table. Optionally, the
observed "correspondence table" and the expected table under independence
are displayed.
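As a hedged illustration of the estat profiles description above, the
following sketch displays the profiles, copies the row profiles out of
r(), and then requests the comparable tabulate output. It assumes the
ca_smoking dataset used in the Examples section and that the data contain
one observation per individual; the matrix name P is arbitrary.

```stata
* Fit the CA model on the example data
webuse ca_smoking, clear
ca rank smoking

* Display row and column profiles
estat profiles

* Copy the row profiles from r() into a named matrix and list them
matrix P = r(Prows)
matrix list P

* Comparable conditional distributions from tabulate
tabulate rank smoking, row column nofreq
```

Copying r(Prows) immediately after estat profiles avoids losing it when a
later r-class command overwrites r().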
Syntax for predict
predict [type] newvar [if] [in] [, statistic ]
statistic               description
-------------------------------------------------------------------------
Main
  fit                   fitted values; the default
  rowscore(#)           row score for dimension #
  colscore(#)           column score for dimension #
-------------------------------------------------------------------------
predict is not available after camat.
Menu
Statistics > Postestimation > Predictions, residuals, etc.
Options for predict
+------+
----+ Main +-------------------------------------------------------------
fit generates the fitted values p_{ij} according to the correspondence
analysis model. fit is the default.
rowscore(#) generates the row score for dimension #, i.e., the
appropriate elements from the normalized row coordinates.
colscore(#) generates the column score for dimension #, i.e., the
appropriate elements from the normalized column coordinates.
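The three statistics can be tried together as in the sketch below. It
assumes the ca_smoking example data; the new variable names fitval,
rscore1, and cscore1 are hypothetical. The tabstat calls are only a
convenience check: because scores belong to categories, each score is
expected to be constant within a level of its variable.

```stata
webuse ca_smoking, clear
ca rank smoking

* Fitted values (the default statistic)
predict double fitval, fit

* Row and column scores for the first dimension
predict double rscore1, rowscore(1)
predict double cscore1, colscore(1)

* Inspect the scores by category; min and max are expected to agree
* within each level
tabstat rscore1, by(rank) statistics(min max)
tabstat cscore1, by(smoking) statistics(min max)
```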
Syntax for estat
Display row and column coordinates
estat coordinates [, norow nocolumn format(%fmt) ]
Display chi-squared distances between row and column profiles
estat distances [, norow nocolumn approx format(%fmt) ]
Display inertia contributions of cells
estat inertia [, total noscale format(%fmt) ]
Display correlations of profiles and axes
estat loadings [, norow nocolumn format(%fmt) ]
Display row and column profiles
estat profiles [, norow nocolumn format(%fmt) ]
Display summary information
estat summarize [, labels noheader noweights ]
Display fitted correspondence table
estat table [, fit obs independence noscale format(%fmt) ]
options                 description
-------------------------------------------------------------------------
norow                   suppress display of row results
nocolumn                suppress display of column results
format(%fmt)            display format; default is format(%9.4f)

approx                  display distances between fitted (approximated)
                          profiles

total                   add row and column margins
noscale                 display chi-squared contributions; default is
                          inertias = chi2/N (with estat inertia)

labels                  display variable labels
noheader                suppress the header
noweights               ignore weights

fit                     display fitted values from correspondence
                          analysis model
obs                     display correspondence table ("observed table")
independence            display expected values under independence
noscale                 suppress scaling of entries to 1 (with estat
                          table)
-------------------------------------------------------------------------
Menu
Statistics > Postestimation > Reports and statistics
Options for estat
norow, an option used with estat coordinates, estat distances, estat
loadings, and estat profiles, suppresses the display of row results.
nocolumn, an option used with estat coordinates, estat distances, estat
loadings, and estat profiles, suppresses the display of column results.
format(%fmt), an option used with many of the subcommands of estat,
specifies the display format for the matrix, e.g., format(%8.3f).
The default is format(%9.4f).
approx, an option used with estat distances, computes distances between
the fitted profiles. The default is to compute distances between the
observed profiles.
total, an option used with estat inertia, adds row and column margins to
the table of inertia (chi-squared/N) or chi-squared contributions.
noscale, as an option used with estat inertia, displays chi-squared
contributions rather than inertia (= chi-squared/N) contributions.
(See below for the description of noscale with estat table.)
labels, an option used with estat summarize, displays variable labels.
noheader, an option used with estat summarize, suppresses the header.
noweights, an option used with estat summarize, ignores the weights, if
any. The default when weights are present is to perform a weighted
summarize on all variables except the weight variable itself. An
unweighted summarize is performed on the weight variable.
fit, an option used with estat table, displays the fitted values for the
correspondence analysis model. fit is implied if obs and
independence are not specified.
obs, an option used with estat table, displays the observed table with
nonnegative entries (the "correspondence table").
independence, an option used with estat table, displays the expected
values p(ij) assuming independence of the rows and columns, p(ij) =
r(i) c(j), where r(i) is the mass of row i and c(j) is the mass of
column j.
noscale, as an option used with estat table, normalizes the displayed
tables to the sum of the original table entries. The default is to
scale the tables to overall sum 1. (See above for the description of
noscale with estat inertia.)
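The two meanings of noscale can be compared side by side. The sketch
below is illustrative only: it assumes the ca_smoking example data and
assumes that e(N) holds the number of observations after ca. The matrix
names Qin, Qchi, and D are arbitrary.

```stata
webuse ca_smoking, clear
ca rank smoking

* Inertia contributions (chi2/N), the default
estat inertia
matrix Qin = r(Q)

* Chi-squared contributions
estat inertia, noscale
matrix Qchi = r(Q)

* Assumption: e(N) is the number of observations. If so, Qchi should
* match e(N)*Qin, and the relative difference should be close to zero.
matrix D = e(N)*Qin
display mreldif(Qchi, D)

* Fitted, observed, and independence tables, scaled to overall sum 1
estat table, fit obs independence

* The same tables on the scale of the original table entries
estat table, fit obs independence noscale
```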
Syntax for cabiplot
cabiplot [, options ]
options                       description
-------------------------------------------------------------------------
Main
  dim(# #)                    two dimensions to be displayed;
                                default dim(2 1)
  norow                       suppress row coordinates
  nocolumn                    suppress column coordinates
  xnegate                     negate the data relative to the x axis
  ynegate                     negate the data relative to the y axis
  maxlength(#)                maximum number of characters for labels;
                                default is maxlength(12)
  origin                      display the origin on the plot
  originlopts(line_options)   affect rendition of origin axes

Rows
  rowopts(row_opts)           affect rendition of rows

Columns
  colopts(col_opts)           affect rendition of columns

Y axis, X axis, Titles, Legend, Overall
  twoway_options              any options other than by() documented in
                                [G] twoway_options
-------------------------------------------------------------------------

row_opts and col_opts         description
-------------------------------------------------------------------------
  plot_options                change look of markers (color, size, etc.)
                                and look or position of marker labels
  suppopts(plot_options)      change look of supplementary markers and
                                look or position of supplementary marker
                                labels
-------------------------------------------------------------------------

plot_options                  description
-------------------------------------------------------------------------
  marker_options              change look of markers (color, size, etc.)
  marker_label_options        add marker labels; change look or position
-------------------------------------------------------------------------
Menu
Statistics > Multivariate analysis > Correspondence analysis >
Postestimation after CA > Biplot of row and column points
Options for cabiplot
+------+
----+ Main +-------------------------------------------------------------
dim(# #) identifies the dimensions to be displayed. For instance,
dim(3 2) plots the third dimension (vertically) versus the second
dimension (horizontally). The dimension number cannot exceed the
number of extracted dimensions. The default is dim(2 1).
norow suppresses plotting of row points.
nocolumn suppresses plotting of column points.
xnegate specifies that the x-axis values are to be negated (multiplied by
-1).
ynegate specifies that the y-axis values are to be negated (multiplied by
-1).
maxlength(#) specifies the maximum number of characters for row and
column labels; the default is maxlength(12).
origin specifies that the origin be displayed on the plot. This is
equivalent to adding the options xline(0, lcolor(black)
lwidth(vthin)) yline(0, lcolor(black) lwidth(vthin)) to the cabiplot
command.
originlopts(line_options) affects the rendition of the origin axes; see
[G] line_options.
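A few of the Main options can be combined as in the sketch below, which
assumes the ca_smoking example data. The particular line options passed
to originlopts() are illustrative choices, not defaults.

```stata
webuse ca_smoking, clear
ca rank smoking

* Default biplot (dimension 2 versus dimension 1) with the origin drawn
cabiplot, origin

* Column points only, x axis negated, labels cut to 8 characters
cabiplot, norow xnegate maxlength(8)

* Origin axes rendered as dashed gray lines
cabiplot, origin originlopts(lcolor(gs8) lpattern(dash))
```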
+------+
----+ Rows +-------------------------------------------------------------
rowopts(row_opts) affects the rendition of the rows. The following
row_opts are allowed:
plot_options affect the rendition of row markers, including their
shape, size, color, and outline (see [G] marker_options) and
specify if and how the row markers are to be labeled (see [G]
marker_label_options).
suppopts(plot_options) affects supplementary markers and
supplementary marker labels; see above for description of
plot_options.
+---------+
----+ Columns +----------------------------------------------------------
colopts(col_opts) affects the rendition of the columns. The following
col_opts are allowed:
plot_options affect the rendition of column markers, including their
shape, size, color, and outline (see [G] marker_options) and
specify if and how the column markers are to be labeled (see [G]
marker_label_options).
suppopts(plot_options) affects supplementary markers and
supplementary marker labels; see above for description of
plot_options.
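As a sketch of how rowopts() and colopts() might be used, the command
below gives row and column points different marker symbols, colors, and
label positions. It assumes a CA model has already been fit (for example,
ca rank smoking on the ca_smoking data); the specific symbols and colors
are arbitrary choices.

```stata
* Run from a do-file; /// continues the command on the next line
cabiplot, rowopts(msymbol(O) mcolor(navy) mlabposition(12)) ///
          colopts(msymbol(T) mcolor(maroon) mlabposition(6))
```

suppopts() is omitted here because it applies only when the model
includes supplementary points.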
+-----------------------------------------+
----+ Y axis, X axis, Titles, Legend, Overall +--------------------------
twoway_options are any of the options documented in [G] twoway_options
excluding by(). These include options for titling the graph (see [G]
title_options) and saving the graph to disk (see [G] saving_option).
See the warning below about using options such as xlabel(), xscale(),
ylabel(), and yscale().
cabiplot automatically adjusts the aspect ratio on the basis of the
range of the data and ensures that the axes are balanced. As an
alternative, the twoway_option aspectratio() can be used to override
the default aspect ratio. cabiplot accepts the aspectratio() option
as a suggestion only, and will override it when necessary to produce
plots with balanced axes; i.e., distance on the x axis equals
distance on the y axis.
twoway_options such as xlabel(), xscale(), ylabel(), and yscale()
should be used with caution. These options are accepted but may have
unintended side effects on the aspect ratio.
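The following sketch shows twoway_options that are generally safe with
cabiplot, namely a title and saving the graph to disk. It assumes a CA
model has already been fit; the title text and the filename ca_biplot are
hypothetical.

```stata
* Add a title and save the graph, replacing any existing file
cabiplot, origin title("CA biplot: rank and smoking") ///
          saving(ca_biplot, replace)

* An aspect ratio may be suggested, but cabiplot may override it to keep
* the axes balanced
cabiplot, aspectratio(1)
```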
Syntax for caprojection
caprojection [, options ]
options                       description
-------------------------------------------------------------------------
Main
  dim(numlist)                dimensions to be displayed; default is all
  norow                       suppress row coordinates
  nocolumn                    suppress column coordinates
  alternate                   alternate labels
  maxlength(#)                maximum number of characters displayed for
                                labels; default is maxlength(12)
  combine_options             affect the rendition of the combined column
                                and row graphs

Rows
  rowopts(row_opts)           affect rendition of rows

Columns
  colopts(col_opts)           affect rendition of columns

Y axis, X axis, Titles, Legend, Overall
  twoway_options              any options other than by() documented in
                                [G] twoway_options
-------------------------------------------------------------------------

row_opts and col_opts         description
-------------------------------------------------------------------------
  plot_options                change look of markers (color, size, etc.)
                                and look or position of marker labels
  suppopts(plot_options)      change look of supplementary markers and
                                look or position of supplementary marker
                                labels
-------------------------------------------------------------------------

plot_options                  description
-------------------------------------------------------------------------
  marker_options              change look of markers (color, size, etc.)
  marker_label_options        add marker labels; change look or position
-------------------------------------------------------------------------
Menu
Statistics > Multivariate analysis > Correspondence analysis >
Postestimation after CA > Dimension projection plot
Options for caprojection
+------+
----+ Main +-------------------------------------------------------------
dim(numlist) identifies the dimensions to be displayed. By default, all
dimensions are displayed.
norow suppresses plotting of rows.
nocolumn suppresses plotting of columns.
alternate causes adjacent labels to alternate sides.
maxlength(#) specifies the maximum number of characters for row and
column labels; the default is maxlength(12).
combine_options affect the rendition of the combined plot; see [G] graph
combine. combine_options may not be specified with either norow or
nocolumn.
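The Main options of caprojection can be explored as in the sketch below,
which assumes the ca_smoking example data and the default number of
extracted dimensions.

```stata
webuse ca_smoking, clear
ca rank smoking

* All extracted dimensions, rows and columns combined
caprojection

* First dimension only, labels alternating sides and cut to 8 characters
caprojection, dim(1) alternate maxlength(8)

* Column categories only; combine_options may not be added here
caprojection, norow
```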
+------+
----+ Rows +-------------------------------------------------------------
rowopts(row_opts) affects the rendition of the rows. The following
row_opts are allowed:
plot_options affect the rendition of row markers, including their
shape, size, color, and outline (see [G] marker_options) and
specify if and how the row markers are to be labeled (see [G]
marker_label_options).
suppopts(plot_options) affects supplementary markers and
supplementary marker labels; see above for description of
plot_options.
+---------+
----+ Columns +----------------------------------------------------------
colopts(col_opts) affects the rendition of the columns. The following
col_opts are allowed:
plot_options affect the rendition of column markers, including their
shape, size, color, and outline (see [G] marker_options) and
specify if and how the column markers are to be labeled (see [G]
marker_label_options).
suppopts(plot_options) affects supplementary markers and
supplementary marker labels; see above for description of
plot_options.
+-----------------------------------------+
----+ Y axis, X axis, Titles, Legend, Overall +--------------------------
twoway_options are any of the options documented in [G] twoway_options
excluding by(). These include options for titling the graph (see [G]
title_options) and for saving the graph to disk (see [G]
saving_option).
Examples
Setup
. webuse ca_smoking
Estimate CA
. ca rank smoking
Postestimation statistics
. estat distances
. estat distances, approx
. estat inertia
. estat inertia, total noscale
. estat profiles, nocolumn
. estat table, fit obs
Biplots
. cabiplot
. cabiplot, nocolumn
Dimension projection plot
. caprojection, dim(1/2)
Predict variables
. predict fitted, fit
. predict pers_score, rowscore(1)
. predict smok_score, colscore(1)
Saved results
estat distances saves the following in r():
Matrices
r(Dcolumns) chi-squared distances between the columns and
between the columns and the column center
r(Drows) chi-squared distances between the rows and between
the rows and the row center
estat inertia saves the following in r():
Matrices
r(Q) matrix of (squared) inertia (or chi-squared)
contributions
estat loadings saves the following in r():
Matrices
r(LC) column loadings
r(LR) row loadings
estat profiles saves the following in r():
Matrices
r(Pcolumns) column profiles (columns normalized to 1)
r(Prows) row profiles (rows normalized to 1)
estat table saves the following in r():
Matrices
r(Fit) fitted (reconstructed) values
r(Fit0) fitted (reconstructed) values, assuming
independence of row and column variables
r(Obs) correspondence table
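Saved matrices can be copied into named matrices for later use. The
sketch below assumes the ca_smoking example data; the matrix names Dr,
LR, and LC are arbitrary. Each copy is made immediately after the estat
subcommand because a later r-class command replaces the contents of r().

```stata
webuse ca_smoking, clear
ca rank smoking

* Distances between fitted row profiles
estat distances, approx
matrix Dr = r(Drows)
matrix list Dr

* Row and column loadings
estat loadings
matrix LR = r(LR)
matrix LC = r(LC)
matrix list LR, format(%8.3f)
matrix list LC, format(%8.3f)
```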
Also see
Manual: [MV] ca postestimation
Help: [MV] ca;
[MV] screeplot