Stata 11 help for tetrachoric

Title

[R] tetrachoric -- Tetrachoric correlations for binary variables

Syntax

tetrachoric varlist [if] [in] [weight] [, options]

    options              description
    -------------------------------------------------------------------------
    Main
      stats(statlist)    list of statistics; select up to 4 statistics;
                           default is stats(rho)
      edwards            use the noniterative Edwards and Edwards estimator;
                           default is the maximum likelihood estimator
      print(#)           significance level for displaying coefficients
      star(#)            significance level for displaying with a star
      bonferroni         use Bonferroni-adjusted significance level
      sidak              use Sidak-adjusted significance level
      pw                 calculate all the pairwise correlation coefficients
                           by using all available data (pairwise deletion)
      zeroadjust         adjust frequencies when one cell has a zero count
      matrix             display output in matrix form
      notable            suppress display of correlations
      posdef             modify correlation matrix to be positive semidefinite
    -------------------------------------------------------------------------

    statlist             description
    -------------------------------------------------------------------------
      rho                tetrachoric correlation coefficient
      se                 standard error of rho
      obs                number of observations
      p                  exact two-sided significance level
    -------------------------------------------------------------------------

by is allowed; see [D] by. fweights are allowed; see weight.

Menu

Statistics > Summaries, tables, and tests > Summary and descriptive statistics > Tetrachoric correlations

Description

tetrachoric computes estimates of the tetrachoric correlation coefficients of the binary variables in varlist. Every variable in varlist should take only the values 0, 1, or missing.

Tetrachoric correlations assume a latent bivariate normal distribution (X1, X2) for each pair of variables (v1, v2), with a threshold model for the manifest variables (vi = 1 if and only if Xi > 0). The means and variances of the latent variables are not identified, but the correlation, r, of X1 and X2 can be estimated from the joint distribution of v1 and v2 and is called the tetrachoric correlation coefficient.

tetrachoric computes pairwise estimates of the tetrachoric correlations by the (iterative) maximum likelihood estimator obtained from bivariate probit without explanatory variables (see [R] biprobit). The Edwards and Edwards (1984) noniterative estimator is used as the initial value.

The pairwise correlation matrix is returned as r(Rho) and can be used to perform a factor analysis or a principal component analysis of binary variables by using the factormat or pcamat commands; see [MV] factor and [MV] pca.
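The workflow just described can be sketched as follows. This is a minimal illustration, not a recommended analysis: it assumes the familyvalues dataset used in the Examples section, the names C, nmiss, and n are hypothetical, and the choice of two factors and two components is arbitrary. posdef is included so that the matrix passed on is positive semidefinite if an adjustment is needed.

```stata
webuse familyvalues, clear

* Compute the tetrachoric correlation matrix without displaying it
tetrachoric RS056-RS063, notable posdef
matrix C = r(Rho)

* Count complete cases, matching the default casewise deletion
egen byte nmiss = rowmiss(RS056-RS063)
count if nmiss == 0
local n = r(N)

* Factor analysis and principal component analysis from the matrix
factormat C, n(`n') factors(2)
pcamat C, n(`n') components(2)
```

The matrix is copied to C immediately because later r-class commands such as count replace the contents of r().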

Options

        +------+
    ----+ Main +-------------------------------------------------------------

stats(statlist) specifies the statistics to be displayed in the matrix of output. stats(rho) is the default. Up to four statistics may be specified. stats(rho se p obs) would display the tetrachoric correlation coefficient, its standard error, the significance level, and the number of observations. If varlist contains only two variables, all statistics are shown in tabular form; in that case, stats(), print(), and star() have no effect unless the matrix option is also specified.
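As a minimal sketch of the two-variable case, assuming the familyvalues dataset from the Examples section is available through webuse, the first call below uses the default tabular display and the second adds matrix so that stats() takes effect.

```stata
webuse familyvalues, clear

* Two variables: all statistics are shown in tabular form by default
tetrachoric RS074 RS075

* matrix forces the matrix display, so stats() now controls what is shown
tetrachoric RS074 RS075, matrix stats(rho se p obs)
```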

edwards specifies that the noniterative Edwards and Edwards estimator be used. The default is the maximum likelihood estimator. If you analyze many binary variables, you may want to use the fast noniterative estimator proposed by Edwards and Edwards (1984). However, if you have skewed variables, the approximation does not perform well.
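One way to compare the two estimators is to run both on the same variables and inspect r(method) after each call. The sketch below assumes the familyvalues dataset from the Examples section; it does not claim how close the two sets of estimates are for these data.

```stata
webuse familyvalues, clear

* Default: iterative maximum likelihood estimator
tetrachoric RS056-RS063
display "`r(method)'"

* Noniterative Edwards and Edwards estimator
tetrachoric RS056-RS063, edwards
display "`r(method)'"
```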

print(#) specifies the maximum significance level of correlation coefficients to be printed. Correlation coefficients with larger significance levels are left blank in the matrix. Typing tetrachoric ..., print(.10) would list only those correlation coefficients that are significant at the 10% level or lower.

star(#) specifies the maximum significance level of correlation coefficients to be marked with a star. Typing tetrachoric ..., star(.05) would "star" all correlation coefficients significant at the 5% level or lower.
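print() and star() can be combined in one call. The following sketch assumes the familyvalues dataset from the Examples section; the thresholds are the ones used in the text above.

```stata
webuse familyvalues, clear

* Leave blank any coefficient with a significance level above 0.10
* and star those significant at the 5% level or lower
tetrachoric RS056-RS063, stats(rho p) print(.10) star(.05)
```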

bonferroni makes the Bonferroni adjustment to calculated significance levels. This option affects printed significance levels and the print() and star() options. Thus tetrachoric ..., print(.05) bonferroni prints coefficients with Bonferroni-adjusted significance levels of 0.05 or less.

sidak makes the Sidak adjustment to calculated significance levels. This option affects printed significance levels and the print() and star() options. Thus tetrachoric ..., print(.05) sidak prints coefficients with Sidak-adjusted significance levels of 0.05 or less.
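A short sketch of the two adjustments, again assuming the familyvalues dataset from the Examples section. Running the calls back to back makes it easy to see how the displayed significance levels and stars change under each adjustment.

```stata
webuse familyvalues, clear

* Unadjusted significance levels
tetrachoric RS056-RS063, stats(rho p) star(.05)

* Bonferroni-adjusted significance levels
tetrachoric RS056-RS063, stats(rho p) star(.05) bonferroni

* Sidak-adjusted significance levels
tetrachoric RS056-RS063, stats(rho p) star(.05) sidak
```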

pw specifies that the tetrachoric correlation be calculated by using all available data. By default, tetrachoric uses casewise deletion, where observations are ignored if any of the specified variables in varlist are missing.
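The difference between casewise and pairwise deletion shows up in the number of observations behind each coefficient. A minimal sketch, assuming the familyvalues dataset from the Examples section:

```stata
webuse familyvalues, clear

* Default casewise deletion: every pair uses the same observations
tetrachoric RS056-RS063, stats(rho obs)

* Pairwise deletion: each pair uses all observations available for it
tetrachoric RS056-RS063, stats(rho obs) pw

* Observation counts used for each correlation
matrix list r(Nobs)
```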

zeroadjust specifies that when one of the cells has a zero count, a frequency adjustment be applied in such a way as to increase the zero to one-half and maintain row and column totals.
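The sketch below illustrates where zeroadjust might be used. It assumes the familyvalues dataset from the Examples section and restricts the sample to the first 20 observations only to make sparse cells more likely; whether a zero cell actually occurs in this subset is not claimed here.

```stata
webuse familyvalues, clear

* Inspect the 2x2 table for zero cells
tabulate RS074 RS075 in 1/20

* Holding row and column totals fixed, raising a zero cell to 0.5 implies
* lowering its two adjacent cells by 0.5 and raising the opposite cell by 0.5
tetrachoric RS074 RS075 in 1/20, zeroadjust
```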

matrix forces tetrachoric to display the statistics as a matrix, even if varlist contains only two variables. matrix is implied if more than two variables are specified.

notable suppresses the display of the correlations.

posdef modifies the correlation matrix so that it is positive semidefinite, i.e., a proper correlation matrix. The modified result is the correlation matrix associated with the least-squares approximation of the tetrachoric correlation matrix by a positive-semidefinite matrix. If the correlation matrix is modified, the standard errors and significance levels are not displayed and are not returned in r().
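To check the effect of posdef, one can look at r(nneg) and at the eigenvalues of the returned matrix. This is a sketch under the assumption that the familyvalues dataset from the Examples section is used; the matrix names C, X, and lambda are hypothetical.

```stata
webuse familyvalues, clear

* Small subsample, as in the Examples section
tetrachoric RS056-RS063 in 1/20, posdef

* Number of negative eigenvalues (saved with posdef only)
display r(nneg)

* Eigenvalues of the returned (possibly modified) correlation matrix
matrix C = r(Rho)
matrix symeigen X lambda = C
matrix list lambda
```

After the adjustment, the eigenvalues in lambda should be nonnegative up to rounding error.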

Examples

Setup
    . webuse familyvalues

Pearson correlations
    . correlate RS074 RS075 RS076

Correlations produced by tetrachoric
    . tetrachoric RS074 RS075 RS076

Pearson correlations
    . correlate RS056-RS063

Correlations produced by tetrachoric
    . tetrachoric RS056-RS063

Adjust correlation matrix, if need be, to be positive semidefinite
    . tetrachoric RS056-RS063 in 1/20, posdef

Saved results

tetrachoric saves the following in r():

Scalars
    r(rho)       tetrachoric correlation coefficient between variables 1 and 2
    r(N)         number of observations
    r(nneg)      number of negative eigenvalues (posdef only)
    r(se_rho)    standard error of r(rho)
    r(p)         exact two-sided significance level

Macros
    r(method)    estimator used

Matrices
    r(Rho)       tetrachoric correlation matrix
    r(Se_Rho)    standard errors of r(Rho)
    r(corr)      synonym for r(Rho)
    r(Nobs)      number of observations used in computing correlation
    r(P)         exact two-sided significance level matrix
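The saved results can be inspected right after a call. The following is a minimal sketch for the two-variable case, assuming the familyvalues dataset from the Examples section.

```stata
webuse familyvalues, clear
tetrachoric RS074 RS075

* List everything saved in r()
return list

* Use individual scalars
display "rho = " r(rho) ", se = " r(se_rho) ", p = " r(p) ", N = " r(N)

* List the correlation matrix
matrix list r(Rho)
```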

Reference

Edwards, J. H., and A. W. F. Edwards. 1984. Approximating the tetrachoric correlation coefficient. Biometrics 40: 563.

Also see

Manual: [R] tetrachoric

Help: [R] biprobit, [R] correlate, [MV] factor, [R] spearman (ktau), [MV] pca, [R] tabulate twoway


© Copyright 1996–2009 StataCorp LP