Stata 15 help for gsem_command

[SEM] gsem -- Generalized structural equation model estimation command


gsem paths [if] [in] [weight] [, options]

where paths are the paths of the model in command-language path notation; see [SEM] sem and gsem path notation.

    options                      Description
    -------------------------------------------------------------------------
    model_description_options    fully define, along with paths, the model to be fit

    group_options                fit model for different groups

    lclass_options               fit model with latent classes

    estimation_options           method used to obtain estimation results

    reporting_options            reporting of estimation results

    syntax_options               controlling interpretation of syntax
    -------------------------------------------------------------------------
    Factor variables and time-series operators are allowed.
    bootstrap, by, jackknife, permute, statsby, and svy are allowed; see prefix.
    Weights are not allowed with the bootstrap prefix.
    vce() and weights are not allowed with the svy prefix.
    fweights, iweights, and pweights are allowed; see weight.
    Also see [SEM] gsem postestimation for features available after estimation.


Menu: Statistics > SEM (structural equation modeling) > Model building and estimation


gsem fits generalized SEMs. When you use the Builder in gsem mode, you are using the gsem command.


model_description_options describe the model to be fit. The model to be fit is fully specified by paths -- which appear immediately after gsem -- and the options covariance(), variance(), and means(). See [SEM] gsem model description options and [SEM] sem and gsem path notation.

group_options allow the specified model to be fit for different subgroups of the data, with some parameters free to vary across groups and other parameters constrained to be equal across groups. See [SEM] gsem group options.
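As a rough illustration of the group options, the sketch below fits the same logistic model in each level of race, using the gsem_lbw dataset that appears in the examples in this entry. The covariates and the choice of ginvariant(none), which places no equality constraints across groups, are assumptions made only for illustration.

```stata
* Sketch: multiple-group logistic model (covariates chosen for illustration)
webuse gsem_lbw, clear

* Fit the model separately for each level of race;
* ginvariant(none) leaves every parameter free to vary across groups
gsem (low <- age lwt smoke), logit group(race) ginvariant(none)
```

Replacing ginvariant(none) with a list of parameter classes constrains those classes to be equal across groups; see [SEM] gsem group options for the available classes.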

lclass_options allow the specified model to be fit across a specified number of latent classes, with some parameters free to vary across classes and other parameters constrained to be equal across classes. See [SEM] gsem lclass options.

estimation_options control how the estimation results are obtained. These options control how the standard errors (VCE) are obtained and control technical issues such as choice of estimation method. See [SEM] gsem estimation options.
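A minimal sketch of two estimation options follows, using the single-factor probit model from the examples in this entry. The specific values, vce(robust) and intpoints(10), are illustrative assumptions and not recommendations.

```stata
* Sketch: request robust standard errors and change the number of
* integration points used by the numerical integration
webuse gsem_1fmm, clear
gsem (x1 x2 x3 x4 <- X), probit vce(robust) intpoints(10)

* The number of integration points actually used is stored in e(n_quad)
display e(n_quad)
```

Increasing the number of integration points generally trades longer run time for a more accurate approximation of the likelihood.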

reporting_options control how the results of estimation are displayed. See [SEM] gsem reporting options.

syntax_options control how the syntax that you type is interpreted. See [SEM] sem and gsem syntax options.


gsem provides important features not provided by sem and correspondingly omits useful features provided by sem. The differences in capabilities are the following:

1. gsem allows generalized linear response functions as well as the linear response functions allowed by sem.

2. gsem allows for multilevel models, something sem does not.

3. gsem allows for categorical latent variables, which are not allowed by sem.

4. gsem allows Stata's factor-variable notation to be used in specifying models, something sem does not.

5. gsem's method ML is sometimes able to use more observations in the presence of missing values than can sem's method ML. Meanwhile, gsem does not provide the MLMV method provided by sem for explicitly handling missing values.

6. gsem cannot produce standardized coefficients.

7. gsem cannot use summary statistic datasets (SSDs); sem can.
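Two of the differences listed above can be seen in a short sketch with the auto dataset. The model itself is an assumption chosen only to keep the example small.

```stata
sysuse auto, clear

* sem: linear responses only; standardized coefficients can be reported
sem (mpg <- weight foreign), standardized

* gsem: factor-variable notation is allowed in the path specification,
* but there is no option for standardized coefficients
gsem (mpg <- weight i.foreign)
```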

gsem has nearly identical syntax to sem. Differences in syntax arise because of differences in capabilities. The resulting differences in syntax are the following:

1. gsem adds new syntax to paths to handle latent variables associated with multilevel modeling.

2. gsem adds new options to handle the family and link of generalized linear responses.

3. gsem adds new syntax to handle categorical latent variables.

4. gsem deletes options related to features it does not have, such as SSDs.

5. gsem adds technical options for controlling features not provided by sem, such as numerical integration (quadrature choices), number of integration points, and a number of options dealing with starting values, which are a more difficult proposition in the generalized SEM framework.
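The family and link options and the integration options mentioned in the list above might be used as sketched below. The datasets are those used in the examples in this entry; the covariates and the option values are illustrative assumptions.

```stata
webuse gsem_lbw, clear

* Shorthand form for a binary response with a logit link
gsem (low <- age lwt smoke), logit

* The same model with family and link spelled out
gsem (low <- age lwt smoke), family(bernoulli) link(logit)

* Technical options for models with latent variables:
* choose the integration method and the number of integration points
webuse gsem_cfa, clear
gsem (MathAb -> q1-q8), logit intmethod(ghermite) intpoints(5)
```

See [SEM] gsem estimation options for the full set of integration and starting-value options.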

For a readable explanation of what gsem can do and how to use it, see the intro sections. You might start with [SEM] intro 1.

For examples of gsem in action, see the example sections. You might start with [SEM] example 1.

See the following advanced topics in [SEM] gsem:

    Default normalization constraints
    Default covariance assumptions
    How to solve convergence problems


These examples are intended for quick reference. For detailed examples, see [SEM] examples.

Examples: Linear regression

Setup
    . sysuse auto

Use regress command
    . regress mpg weight c.weight#c.weight foreign

Replicate model with gsem
    . gsem (mpg <- weight c.weight#c.weight foreign)

Examples: Logistic regression

Setup
    . webuse gsem_lbw

Use logit command
    . logit low age lwt i.race smoke ptl ht ui

Replicate model with gsem
    . gsem (low <- age lwt i.race smoke ptl ht ui), logit

Examples: Poisson regression

Setup
    . webuse dollhill3

Use poisson command
    . poisson deaths smokes i.agecat, exposure(pyears)

Replicate model with gsem
    . gsem (deaths <- smokes i.agecat), poisson exposure(pyears)

Examples: Single-factor measurement model with binary outcomes

Setup
    . webuse gsem_1fmm

Binary responses modeled using Bernoulli family and probit link
    . gsem (x1 x2 x3 x4 <- X), probit

Examples: Full structural equation model with binary and ordinal measurements

Setup
    . webuse gsem_cfa

SEM with latent variable MathAb predicted by latent variable MathAtt
    . gsem (MathAb -> q1-q8, logit) (MathAtt -> att1-att5, ologit) (MathAtt -> MathAb)

Examples: Item Response Theory (IRT) models

Setup
    . webuse gsem_cfa

One-parameter logistic IRT model
    . gsem (MathAb -> (q1-q8)@b), logit var(MathAb@1)

Two-parameter logistic IRT model
    . gsem (MathAb -> q1-q8), logit var(MathAb@1)
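Because the one-parameter model constrains all item loadings to be equal, it is nested in the two-parameter model, and the two fits can be compared with a likelihood-ratio test. The sketch below shows one way this might be done; the stored-estimate names onepl and twopl are hypothetical names chosen for illustration.

```stata
webuse gsem_cfa, clear

* One-parameter model: all loadings constrained to the common value b
gsem (MathAb -> (q1-q8)@b), logit var(MathAb@1)
estimates store onepl

* Two-parameter model: each item has its own loading
gsem (MathAb -> q1-q8), logit var(MathAb@1)
estimates store twopl

* Likelihood-ratio test of the equal-loadings constraint
lrtest onepl twopl
```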

Examples: Two-level measurement model with binary outcomes

Setup
    . webuse gsem_cfa

Model with latent variable M1[school] at the school level and latent variable MathAb at the student level, with students nested in schools
    . gsem (MathAb M1[school] -> q1-q8), logit

Examples: Three-level negative binomial model

Setup
    . webuse gsem_melanoma

Model with random intercepts at the nation level and at the region-nested-in-nation level
    . gsem (deaths <- uv M1[nation] M2[nation>region]), nbreg exposure(expected)

Examples: Heckman selection model

Setup
    . webuse gsem_womenwk
    . generate selected = wage < .

Selection model for wage
    . gsem (wage <- educ age L) (selected <- married children educ age L@1, probit), var(L@1)

Examples: Latent class analysis

Setup
    . webuse gsem_lca1, clear

Model with two classes using logistic regression to model accident, play, insurance, and stock
    . gsem (accident play insurance stock <- ), logit lclass(C 2)
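After fitting a latent class model, the estimated class probabilities and class-specific means are usually of more interest than the raw coefficients. The sketch below assumes the postestimation commands estat lcprob and estat lcmean described in [SEM] gsem postestimation.

```stata
webuse gsem_lca1, clear

* Two-class model for four binary items
gsem (accident play insurance stock <- ), logit lclass(C 2)

* Estimated probability of membership in each latent class
estat lcprob

* Estimated mean of each item within each latent class
estat lcmean
```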

Stored results

gsem stores the following in e():

Scalars
    e(N)                    number of observations
    e(N_clust)              number of clusters
    e(N_groups)             number of groups
    e(k)                    number of parameters
    e(k_cat#)               number of categories for the #th depvar, ordinal
    e(k_dv)                 number of dependent variables
    e(k_eq)                 number of equations in e(b)
    e(k_out#)               number of outcomes for the #th depvar, mlogit
    e(k_rc)                 number of covariances
    e(k_rs)                 number of variances
    e(ll)                   log likelihood
    e(n_quad)               number of integration points
    e(rank)                 rank of e(V)
    e(ic)                   number of iterations
    e(rc)                   return code
    e(converged)            1 if target model converged, 0 otherwise

Macros
    e(cmd)                  gsem
    e(cmdline)              command as typed
    e(depvar)               names of dependent variables
    e(eqnames)              names of equations
    e(wtype)                weight type
    e(wexp)                 weight expression
    e(fweightk)             fweight variable for kth level, if specified
    e(pweightk)             pweight variable for kth level, if specified
    e(iweightk)             iweight variable for kth level, if specified
    e(title)                title in estimation output
    e(clustvar)             name of cluster variable
    e(family#)              family for the #th depvar
    e(link#)                link for the #th depvar
    e(offset#)              offset for the #th depvar
    e(intmethod)            integration method
    e(vce)                  vcetype specified in vce()
    e(vcetype)              title used to label Std. Err.
    e(opt)                  type of optimization
    e(which)                max or min; whether optimizer is to perform maximization or minimization
    e(method)               estimation method: ml
    e(ml_method)            type of ml method
    e(user)                 name of likelihood-evaluator program
    e(technique)            maximization technique
    e(datasignature)        the checksum
    e(datasignaturevars)    variables used in calculation of checksum
    e(properties)           b V
    e(estat_cmd)            program used to implement estat
    e(predict)              program used to implement predict
    e(covariates)           list of covariates
    e(footnote)             program used to implement the footnote display
    e(groupvar)             name of group variable
    e(lclass)               name of latent class variables
    e(asbalanced)           factor variables fvset as asbalanced
    e(asobserved)           factor variables fvset as asobserved
    e(marginsnotok)         predictions not allowed by margins
    e(marginswtype)         weight type for margins
    e(marginswexp)          weight expression for margins
    e(marginsdefault)       default predict() specification for margins

Matrices
    e(_N)                   sample size for each depvar
    e(b)                    parameter vector
    e(b_pclass)             parameter class
    e(cat#)                 categories for the #th depvar, ordinal
    e(out#)                 outcomes for the #th depvar, mlogit
    e(Cns)                  constraints matrix
    e(ilog)                 iteration log (up to 20 iterations)
    e(gradient)             gradient vector
    e(V)                    covariance matrix of the estimators
    e(V_modelbased)         model-based variance
    e(nobs)                 vector with number of observations per group
    e(groupvalue)           vector of group values of e(groupvar)
    e(lclass_k_levels)      number of levels for latent class variables
    e(lclass_bases)         base levels for latent class variables

Functions
    e(sample)               marks estimation sample
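The stored results can be inspected directly after estimation. The following is a minimal sketch using the linear regression example from this entry; which results are worth displaying is an assumption made for illustration.

```stata
sysuse auto, clear
gsem (mpg <- weight c.weight#c.weight foreign)

* List everything gsem left behind in e()
ereturn list

* Scalars can be used in expressions
display "log likelihood = " e(ll)
display "converged      = " e(converged)

* Matrices are listed with matrix list
matrix list e(b)

* e(sample) marks the observations used in estimation
count if e(sample)
```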
