Stata 11 help for ca postestimation


Title

[MV] ca postestimation -- Postestimation tools for ca and camat

Description

The following postestimation commands are of special interest after ca and camat:

    command              description
    -------------------------------------------------------------------------
    cabiplot             biplot of row and column points
    caprojection         CA dimension projection plot
    estat coordinates    display row and column coordinates
    estat distances      display chi-squared distances between row and
                           column profiles
    estat inertia        display inertia contributions of the individual
                           cells
    estat loadings       display correlations of profiles and axes
                           ("loadings")
    estat profiles       display row and column profiles
  + estat summarize      estimation sample summary
    estat table          display fitted correspondence table
    screeplot            plot singular values
    -------------------------------------------------------------------------
    + estat summarize is not available after camat.

The following standard postestimation commands are also available:

    command              description
    -------------------------------------------------------------------------
  * estimates            cataloging estimation results
  + predict              fitted values, row coordinates, or column
                           coordinates
    -------------------------------------------------------------------------
    * All estimates subcommands except table and stats are available.
    + predict is not available after camat.

Special-interest postestimation commands

cabiplot produces a plot of the row points or column points, or a biplot of the row and column points. In this plot, the (Euclidean) distances between row (column) points approximate the chi-squared distances between the associated row (column) profiles if the CA is properly normalized. Similarly, the association between a row and column point is approximated by the inner product of vectors from the origin to the respective points (see [MV] ca).

caprojection produces a line plot of the row and column coordinates. The goal of this graph is to show the ordering of row and column categories on each principal dimension of the analysis. Each principal dimension is represented by a vertical line; markers are plotted on the lines where the row and column categories project onto the dimensions.

estat coordinates displays the row and column coordinates.

estat distances displays the chi-squared distances between the row profiles and between the column profiles. It also displays the chi-squared distances from the row and column profiles to their respective centers (marginal distributions). Optionally, the fitted profiles rather than the observed profiles are used.

estat inertia displays the inertia (chi2/N) contributions of the individual cells.

estat loadings displays the correlations of the row and column profiles and the axes, comparable to the loadings of principal component analysis.

estat profiles displays the row and column profiles; the row (column) profile is the conditional distribution of the row (column) given the column (row). This is equivalent to specifying the row and column options with the tabulate command.

estat summarize displays summary information about the row and column variables over the estimation sample.

estat table displays the fitted correspondence table. Optionally, the observed "correspondence table" and the expected table under independence are displayed.

Syntax for predict

predict [type] newvar [if] [in] [, statistic ]

    statistic            description
    -------------------------------------------------------------------------
    Main
      fit                fitted values; the default
      rowscore(#)        row score for dimension #
      colscore(#)        column score for dimension #
    -------------------------------------------------------------------------
    predict is not available after camat.

Menu

Statistics > Postestimation > Predictions, residuals, etc.

Options for predict

        +------+
    ----+ Main +-------------------------------------------------------------

fit specifies that the fitted values p_{ij} of the correspondence analysis model be computed and stored in newvar. fit is the default.

rowscore(#) generates the row score for dimension #, i.e., the appropriate elements from the normalized row coordinates.

colscore(#) generates the column score for dimension #, i.e., the appropriate elements from the normalized column coordinates.
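
As a minimal sketch of the three statistics, the commands below use the ca_smoking example data that appears in the Examples section of this help file. The new variable names (fit_ij, row_dim1, col_dim1) and the double storage type are illustrative choices, not requirements.

```stata
* Sketch only: assumes the ca_smoking example data used in this help file
webuse ca_smoking, clear
ca rank smoking

* fitted values (the default statistic), stored as doubles
predict double fit_ij, fit

* scores on the first dimension; variable names are illustrative
predict double row_dim1, rowscore(1)
predict double col_dim1, colscore(1)

* inspect the first few observations
list rank smoking fit_ij row_dim1 col_dim1 in 1/5
```

Because predict is not available after camat, this pattern applies only after ca.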

Syntax for estat

Display row and column coordinates

estat coordinates [, norow nocolumn format(%fmt) ]

Display chi-squared distances between row and column profiles

estat distances [, norow nocolumn approx format(%fmt) ]

Display inertia contributions of cells

estat inertia [, total noscale format(%fmt) ]

Display correlations of profiles and axes

estat loadings [, norow nocolumn format(%fmt) ]

Display row and column profiles

estat profiles [, norow nocolumn format(%fmt) ]

Display summary information

estat summarize [, labels noheader noweights ]

Display fitted correspondence table

estat table [, fit obs independence noscale format(%fmt) ]

    options              description
    -------------------------------------------------------------------------
    norow                suppress display of row results
    nocolumn             suppress display of column results
    format(%fmt)         display format; default is format(%9.4f)
    approx               display distances between fitted (approximated)
                           profiles
    total                add row and column margins
    noscale              display chi-squared contributions; default is
                           inertias = chi2/N (with estat inertia)
    labels               display variable labels
    noheader             suppress the header
    noweights            ignore weights
    fit                  display fitted values from correspondence analysis
                           model
    obs                  display correspondence table ("observed table")
    independence         display expected values under independence
    noscale              suppress scaling of entries to 1 (with estat table)
    -------------------------------------------------------------------------

Menu

Statistics > Postestimation > Reports and statistics

Options for estat

norow, an option used with estat coordinates, estat distances, estat loadings, and estat profiles, suppresses the display of row results.

nocolumn, an option used with estat coordinates, estat distances, estat loadings, and estat profiles, suppresses the display of column results.

format(%fmt), an option used with many of the subcommands of estat, specifies the display format for the matrix, e.g., format(%8.3f). The default is format(%9.4f).

approx, an option used with estat distances, computes distances between the fitted profiles. The default is to compute distances between the observed profiles.

total, an option used with estat inertia, adds row and column margins to the table of inertia (chi-squared/N) or chi-squared contributions.

noscale, as an option used with estat inertia, displays chi-squared contributions rather than inertia (= chi-squared/N) contributions. (See below for the description of noscale with estat table.)

labels, an option used with estat summarize, displays variable labels.

noheader, an option used with estat summarize, suppresses the header.

noweights, an option used with estat summarize, ignores the weights, if any. The default when weights are present is to perform a weighted summarize on all variables except the weight variable itself. An unweighted summarize is performed on the weight variable.

fit, an option used with estat table, displays the fitted values for the correspondence analysis model. fit is implied if obs and independence are not specified.

obs, an option used with estat table, displays the observed table with nonnegative entries (the "correspondence table").

independence, an option used with estat table, displays the expected values p(ij) assuming independence of the rows and columns, p(ij) = r(i) c(j), where r(i) is the mass of row i and c(j) is the mass of column j.

noscale, as an option used with estat table, normalizes the displayed tables to the sum of the original table entries. The default is to scale the tables to overall sum 1. (See above for the description of noscale with estat inertia.)
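
The options above can be combined as in the following sketch, which again assumes the ca_smoking example data from the Examples section. The %8.3f format is an arbitrary choice for illustration. Every option shown is taken from the syntax diagrams above.

```stata
* Sketch only: assumes the ca_smoking example data used in this help file
webuse ca_smoking, clear
ca rank smoking

* column coordinates and loadings only, with a shorter display format
estat coordinates, norow format(%8.3f)
estat loadings, norow format(%8.3f)

* distances between fitted rather than observed profiles
estat distances, approx

* inertia contributions with margins, then chi-squared contributions
estat inertia, total
estat inertia, total noscale

* fitted, observed, and independence tables, normalized to the sum of
* the original table entries instead of to overall sum 1
estat table, fit obs independence noscale
```

Note that noscale means different things in the last two commands: with estat inertia it switches from inertia to chi-squared contributions, and with estat table it changes the normalization of the displayed tables.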

Syntax for cabiplot

cabiplot [, options ]

    options                      description
    -------------------------------------------------------------------------
    Main
      dim(# #)                   two dimensions to be displayed; default is
                                   dim(2 1)
      norow                      suppress row coordinates
      nocolumn                   suppress column coordinates
      xnegate                    negate the data relative to the x axis
      ynegate                    negate the data relative to the y axis
      maxlength(#)               maximum number of characters for labels;
                                   default is maxlength(12)
      origin                     display the origin on the plot
      originlopts(line_options)  affect rendition of origin axes

    Rows
      rowopts(row_opts)          affect rendition of rows

    Columns
      colopts(col_opts)          affect rendition of columns

    Y axis, X axis, Titles, Legend, Overall
      twoway_options             any options other than by() documented in
                                   [G] twoway_options
    -------------------------------------------------------------------------

    row_opts and col_opts        description
    -------------------------------------------------------------------------
    plot_options                 change look of markers (color, size, etc.)
                                   and look or position of marker labels
    suppopts(plot_options)       change look of supplementary markers and
                                   look or position of supplementary marker
                                   labels
    -------------------------------------------------------------------------

    plot_options                 description
    -------------------------------------------------------------------------
    marker_options               change look of markers (color, size, etc.)
    marker_label_options         add marker labels; change look or position
    -------------------------------------------------------------------------

Menu

Statistics > Multivariate analysis > Correspondence analysis > Postestimation after CA > Biplot of row and column points

Options for cabiplot

        +------+
    ----+ Main +-------------------------------------------------------------

dim(# #) identifies the dimensions to be displayed. For instance, dim(3 2) plots the third dimension (vertically) versus the second dimension (horizontally). The dimension number cannot exceed the number of extracted dimensions. The default is dim(2 1).

norow suppresses plotting of row points.

nocolumn suppresses plotting of column points.

xnegate specifies that the x-axis values are to be negated (multiplied by -1).

ynegate specifies that the y-axis values are to be negated (multiplied by -1).

maxlength(#) specifies the maximum number of characters for row and column labels; the default is maxlength(12).

origin specifies that the origin be displayed on the plot. This is equivalent to adding the options xline(0, lcolor(black) lwidth(vthin)) yline(0, lcolor(black) lwidth(vthin)) to the cabiplot command.

originlopts(line_options) affects the rendition of the origin axes; see [G] line_options.

        +------+
    ----+ Rows +-------------------------------------------------------------

rowopts(row_opts) affects the rendition of the rows. The following row_opts are allowed:

plot_options affect the rendition of row markers, including their shape, size, color, and outline (see [G] marker_options) and specify if and how the row markers are to be labeled (see [G] marker_label_options).

suppopts(plot_options) affects supplementary markers and supplementary marker labels; see above for a description of plot_options.

        +---------+
    ----+ Columns +----------------------------------------------------------

colopts(col_opts) affects the rendition of the columns. The following col_opts are allowed:

plot_options affect the rendition of column markers, including their shape, size, color, and outline (see [G] marker_options) and specify if and how the column markers are to be labeled (see [G] marker_label_options).

suppopts(plot_options) affects supplementary markers and supplementary marker labels; see above for a description of plot_options.

        +-----------------------------------------+
    ----+ Y axis, X axis, Titles, Legend, Overall +--------------------------

twoway_options are any of the options documented in [G] twoway_options excluding by(). These include options for titling the graph (see [G] title_options) and saving the graph to disk (see [G] saving_option). See the remarks below for a warning about options such as xlabel(), xscale(), ylabel(), and yscale().

cabiplot automatically adjusts the aspect ratio on the basis of the range of the data and ensures that the axes are balanced. As an alternative, the twoway_option aspectratio() can be used to override the default aspect ratio. cabiplot accepts the aspectratio() option as a suggestion only, and will override it when necessary to produce plots with balanced axes; i.e., distance on the x axis equals distance on the y axis.

twoway_options such as xlabel(), xscale(), ylabel(), and yscale() should be used with caution. These options are accepted but may have unintended side effects on the aspect ratio.
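
A short sketch of the cabiplot options, assuming the ca_smoking example data from the Examples section. The colors, marker symbol, line pattern, and title text are arbitrary styling choices for illustration. The /// line continuations work in a do-file but not when typed interactively.

```stata
* Sketch only: assumes the ca_smoking example data used in this help file
webuse ca_smoking, clear
ca rank smoking

* row points only, with the origin marked
cabiplot, nocolumn origin

* full biplot; all styling values below are illustrative
cabiplot, dim(2 1) origin originlopts(lpattern(dash)) ///
    rowopts(mcolor(navy) mlabcolor(navy))              ///
    colopts(msymbol(triangle) mcolor(maroon))          ///
    title("Rank by smoking category")
```

The sketch deliberately leaves xlabel(), xscale(), ylabel(), and yscale() alone, in line with the caution above about their effect on the aspect ratio.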

Syntax for caprojection

caprojection [, options ]

    options                      description
    -------------------------------------------------------------------------
    Main
      dim(numlist)               dimensions to be displayed; default is all
      norow                      suppress row coordinates
      nocolumn                   suppress column coordinates
      alternate                  alternate labels
      maxlength(#)               maximum number of characters displayed for
                                   labels; default is maxlength(12)
      combine_options            affect the rendition of the combined column
                                   and row graphs

    Rows
      rowopts(row_opts)          affect rendition of rows

    Columns
      colopts(col_opts)          affect rendition of columns

    Y axis, X axis, Titles, Legend, Overall
      twoway_options             any options other than by() documented in
                                   [G] twoway_options
    -------------------------------------------------------------------------

    row_opts and col_opts        description
    -------------------------------------------------------------------------
    plot_options                 change look of markers (color, size, etc.)
                                   and look or position of marker labels
    suppopts(plot_options)       change look of supplementary markers and
                                   look or position of supplementary marker
                                   labels
    -------------------------------------------------------------------------

    plot_options                 description
    -------------------------------------------------------------------------
    marker_options               change look of markers (color, size, etc.)
    marker_label_options         add marker labels; change look or position
    -------------------------------------------------------------------------

Menu

Statistics > Multivariate analysis > Correspondence analysis > Postestimation after CA > Dimension projection plot

Options for caprojection

        +------+
    ----+ Main +-------------------------------------------------------------

dim(numlist) identifies the dimensions to be displayed. By default, all dimensions are displayed.

norow suppresses plotting of rows.

nocolumn suppresses plotting of columns.

alternate causes adjacent labels to alternate sides.

maxlength(#) specifies the maximum number of characters for row and column labels; the default is maxlength(12).

combine_options affect the rendition of the combined plot; see [G] graph combine. combine_options may not be specified with either norow or nocolumn.

        +------+
    ----+ Rows +-------------------------------------------------------------

rowopts(row_opts) affects the rendition of the rows. The following row_opts are allowed:

plot_options affect the rendition of row markers, including their shape, size, color, and outline (see [G] marker_options) and specify if and how the row markers are to be labeled (see [G] marker_label_options).

suppopts(plot_options) affects supplementary markers and supplementary marker labels; see above for a description of plot_options.

        +---------+
    ----+ Columns +----------------------------------------------------------

colopts(col_opts) affects the rendition of the columns. The following col_opts are allowed:

plot_options affect the rendition of column markers, including their shape, size, color, and outline (see [G] marker_options) and specify if and how the column markers are to be labeled (see [G] marker_label_options).

suppopts(plot_options) affects supplementary markers and supplementary marker labels; see above for a description of plot_options.

        +-----------------------------------------+
    ----+ Y axis, X axis, Titles, Legend, Overall +--------------------------

twoway_options are any of the options documented in [G] twoway_options excluding by(). These include options for titling the graph (see [G] title_options) and for saving the graph to disk (see [G] saving_option).
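
The caprojection options can be tried as in the sketch below, which assumes the ca_smoking example data from the Examples section. The maxlength(8) value and the marker color are arbitrary illustrations.

```stata
* Sketch only: assumes the ca_smoking example data used in this help file
webuse ca_smoking, clear
ca rank smoking

* both dimensions, labels alternating sides and truncated to 8 characters
caprojection, dim(1 2) alternate maxlength(8)

* column categories only, on the first dimension
caprojection, dim(1) norow colopts(mcolor(maroon))
```

The second command specifies norow, so it passes no combine_options, consistent with the restriction described above.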

Examples

    Setup
        . webuse ca_smoking

    Estimate CA
        . ca rank smoking

    Postestimation statistics
        . estat distances
        . estat distances, approx
        . estat inertia
        . estat inertia, total noscale
        . estat profiles, nocolumn
        . estat table, fit obs

    Biplots
        . cabiplot
        . cabiplot, nocolumn

    Dimension projection plot
        . caprojection, dim(1/2)

    Predict variables
        . predict fitted, fit
        . predict pers_score, rowscore(1)
        . predict smok_score, colscore(1)

Saved results

estat distances saves the following in r():

    Matrices
      r(Dcolumns)    chi-squared distances between the columns and between
                       the columns and the column center
      r(Drows)       chi-squared distances between the rows and between the
                       rows and the row center

estat inertia saves the following in r():

    Matrices
      r(Q)           matrix of (squared) inertia (or chi-squared)
                       contributions

estat loadings saves the following in r():

    Matrices
      r(LC)          column loadings
      r(LR)          row loadings

estat profiles saves the following in r():

    Matrices
      r(Pcolumns)    column profiles (columns normalized to 1)
      r(Prows)       row profiles (rows normalized to 1)

estat table saves the following in r():

    Matrices
      r(Fit)         fitted (reconstructed) values
      r(Fit0)        fitted (reconstructed) values, assuming independence of
                       row and column variables
      r(Obs)         correspondence table
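
The saved matrices can be copied and inspected after each estat subcommand, as in this sketch based on the ca_smoking example data. The matrix names (Prow, Drow, Obs, Fit, Fit0) and the display format are illustrative. This help file does not say whether estat table saves all three matrices when only some of its options are given, so the sketch requests all three tables.

```stata
* Sketch only: assumes the ca_smoking example data used in this help file
webuse ca_smoking, clear
ca rank smoking

* row profiles
estat profiles
matrix Prow = r(Prows)
matrix list Prow, format(%6.3f)

* chi-squared distances between rows and to the row center
estat distances
matrix Drow = r(Drows)
matrix list Drow, format(%6.3f)

* observed, fitted, and independence tables; copy before listing
estat table, fit obs independence
matrix Obs  = r(Obs)
matrix Fit  = r(Fit)
matrix Fit0 = r(Fit0)
matrix list Obs, format(%6.3f)
matrix list Fit, format(%6.3f)
matrix list Fit0, format(%6.3f)
```

Copying each r() matrix into a named matrix right away keeps it available after later commands replace the contents of r().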

Also see

Manual: [MV] ca postestimation

Help: [MV] ca; [MV] screeplot


© Copyright 1996–2009 StataCorp LP