Survey support for gsem

Order

Watch video demo

<- See Stata's other features

Highlights

Support for survey data in generalized structural equation models

Structural equation models (SEMs) with binary, count, ordinal, and survival outcomes
Multilevel SEMs
That is, for all models fit by Stata's gsem

Point estimates and standard errors adjusted for survey design

Sampling weights
Primary and secondary sampling units (and tertiary, etc.)
Stratification
Finite-population corrections
Weights at each stage of a multistage design for multilevel models
Linearized, bootstrap, jackknife, or BRR (balanced and repeated replications) standard errors (SEs) for one-level models
Linearized SEs for multilevel models

When analyzing complex survey data, we take into account the characteristics of the survey design—clustering, stratification, sampling weights, and finite-population corrections. We adjust both point estimates and standard errors for the design characteristics when fitting our model, in this case, a structural equation model.

Say we have a sample of employees from a large department store chain who have responded to a series of questions related to job satisfaction. We want to fit a confirmatory factor analysis (CFA) model that measures job satisfaction based on these items. If our data were collected by first sampling stores and then sampling employees within stores, we could adjust the results of our CFA model for this design.

Perhaps we are interested in fitting a multilevel mediation model evaluating whether the impact of one-on-one tutoring influences the relationship between math test scores in third grade and math test scores in fourth grade. We also believe that school-level characteristics might impact test scores and include a school-level random intercept in the model. If the data were collected by first sampling schools and then sampling students within schools, we could adjust the results of our multilevel mediation model for this design.

Throughout Stata, analyzing complex survey data is as simple as using svyset to declare aspects of the survey design and then adding the svy: to the estimation command for the model you want to fit. We can now use svyset and svy: when fitting multilevel structural equation models and structural equation models with binary, count, ordinal, and survival-time outcomes.

Let's see it work

Suppose we are interested in measuring students' attitudes toward math. Our data contain five variables, att1–att5, recording students' responses to various statements about mathematics, such as "skills taught in my math class will help me get a better job" and "I am able to learn new math concepts easily". Responses are coded one through five, with one indicating strong disagreement and five indicating strong agreement with the statement. We want to fit a one-factor CFA model using an ordinal probit model for each response. We name our latent variable MathAtt. If we had random (i.i.d.) data, we could fit the model by typing

. gsem (MathAtt -> att1 att2 att3 att4 att5), oprobit

However, we want to take into account the complex survey design used to collect these data. Schools were sampled first. Then, students were sampled within the selected schools. We have a weight variable, finalwt, that represents the inverse of the probability that a student was included in the sample. We can declare our survey design by typing

. svyset school [pweight=finalwt]

Then, we simply add svy: to gsem:

. svy: gsem (MathAtt -> att1 att2 att3 att4 att5), oprobit
(running gsem on estimation sample)
(slanted)
(output omitted)

Survey: Generalized structural equation model

Number of strata =  1                                  Number of obs   =   200
Number of PSUs   = 20                                  Population size = 2,976
                                                       Design df       =    19
 ( 1)  [att1]MathAtt = 1

                           Linearized
               Coefficient  std. err.      t    P>|t|     [95% conf. interval]

att1          
     MathAtt            1  (constrained)

att2          
     MathAtt     .2400526   .1255965     1.91   0.071     -.022824    .5029291

att3          
     MathAtt     -1.25051   .5295546    -2.36   0.029    -2.358881   -.1421398

att4          
     MathAtt     -1.25051   .5295546    -2.36   0.029    -2.358881   -.1421398

att5          
     MathAtt     .2316945   .1016915     2.28   0.034     .0188517    .4445373

/att1          
        cut1    -.6778102   .1532773                     -.9986232   -.3569972
        cut2    -.2084441   .1088449                      -.436259    .0193708
        cut3     .1496295   .1197963                      -.101107    .4003661
        cut4     .7769365    .157401                      .4474924    1.106381

/att2         
        cut1    -.6381621   .1399729                     -.9311287   -.3451955
        cut2     -.145965   .1047595                     -.3652291     .073299
        cut3     .1947528   .1218147                     -.0602083    .4497139
        cut4     .6802471   .1277605                      .4128414    .9476529

/att3         
        cut1    -.6336735   .1615899                     -.9718851   -.2954619
        cut2    -.0653397   .1245075                     -.3259368    .1952574
        cut3      .137493   .1379872                     -.1513175    .4263035
        cut4     1.002957   .2153345                      .5522569    1.453657

/att4        
        cut1    -.5032873   .1089184                     -.7312562   -.2753185
        cut2    -.0219399   .1085907                     -.2492229    .2053431
        cut3     .3856427   .1092128                      .1570576    .6142278
        cut4     1.004283   .1639867                      .6610554    1.347511

/att5         
        cut1    -.8509972   .1335788                     -1.130581   -.5714135
        cut2    -.3128209   .1239232                     -.5721952   -.0534466
        cut3    -.0318078    .107519                     -.2568477    .1932322
        cut4     .4341301    .105858                      .2125667    .6556935

 var(MathAtt)     .9269762   .4206126                      .3586061     2.39618

We find the details of our survey design at the top of the output, and the results are adjusted to account for the sampling weights and clusters (schools).

By the way, we could have specified this model and the sample design from Stata's SEM Builder (shown at the top of this page).

Let's see it work with multilevel models

gsem also fits multilevel models. For instance, we can add a school-level latent variable to our model above and fit a two-level CFA model.

Ignoring the survey nature of the data, we could fit this model with the following gsem:

. gsem (MathAtt Sch[school] -> att1 att2 att3 att4 att5), oprobit

We have added Sch[school], a latent variable that varies across schools but is constant within school.

As before, we can add svy: to gsem to account for the complex survey design. However, because this is a multilevel model, it is no longer sufficient to provide a single sampling weight. Instead, we need weights for each stage of the design. Here, wt_school is the inverse of the probability that a school is included in the sample, and wt_student is the inverse of the probability that a student is selected, conditional on the student's school being selected.

We can specify weights for both stages of our sampling design using svyset,

. svyset school, weight(wt_school) || _n, weight(wt_student)

and add svy: to gsem:

. svy: gsem (MathAtt Sch[school] -> att1 att2 att3 att4 att5), oprobit
(running gsem on estimation sample)


Survey: Generalized structural equation model

Number of strata =  1                                  Number of obs   =   200
Number of PSUs   = 20                                  Population size = 2,976
                                                       Design df       =    19
(output omitted)

 ( 1)  [att1]Sch[school] = 1
 ( 2)  [att2]MathAtt = 1

                           Linearized
               Coefficient  std. err.      t    P>|t|     [95% conf. interval]

att1          
     Sch[school]            1  (constrained)
              
         MathAtt     4.627958    2.66204     1.74   0.098    -.9437558    10.19967

att2          
     Sch[school]     1.266576   1.126628     1.12   0.275    -1.091484    3.624636
              
         MathAtt            1  (constrained)

att3          
     Sch[school]    -.9240938   .6793659    -1.36   0.190    -2.346023    .4978353
              
         MathAtt    -5.521805   2.347199    -2.35   0.030    -10.43455   -.6090606

att4          
     Sch[school]     .6282574   .4128763     1.52   0.145    -.2359026    1.492417
              
         MathAtt    -2.416329   1.219008    -1.98   0.062    -4.967742    .1350847

att5          
     Sch[school]    -1.108373   .4797226    -2.31   0.032    -2.112444   -.1043025
              
         MathAtt     1.322196   1.029228     1.28   0.214    -.8320041    3.476395

/att1          
            cut1    -.7230267   .1401671                       -1.0164   -.4296537
            cut2    -.2467528   .1080061                     -.4728121   -.0206935
            cut3     .1155532   .1282471                     -.1528709    .3839774
            cut4     .7495511   .1624197                      .4096028    1.089499

/att2         
            cut1    -.7027861   .1374404                     -.9904523   -.4151199
            cut2    -.1988136   .0984934                     -.4049627    .0073354
            cut3     .1485485   .1226454                     -.1081512    .4052483
            cut4     .6479889   .1116789                      .4142422    .8817355

/att3         
            cut1    -.5880417   .1480801                     -.8979769   -.2781064
            cut2    -.0314146   .1397569                     -.3239292    .2610999
            cut3     .1667308   .1484827                      -.144047    .4775086
            cut4     1.018614   .2057798                       .587912    1.449316

/att4        
            cut1    -.5406956   .1254541                      -.803274   -.2781173
            cut2    -.0468463   .1077686                     -.2724085     .178716
            cut3     .3728937    .108474                      .1458551    .5999323
            cut4     1.006241    .164661                       .661602    1.350881

/att5         
            cut1    -.8397971   .1465677                     -1.146567   -.5330273
            cut2    -.2791084   .1138474                     -.5173937    -.040823
            cut3      .011116   .1075244                     -.2139352    .2361673
            cut4     .4911015   .1124515                      .2557379    .7264651

            var(                                                                  
    Sch[school])     .0413825   .0336105                      .0075604    .2265105
    var(MathAtt)     .0434685   .0399309                      .0063557    .2972944

As with the previous example, point estimates and standard errors now appropriately account for the complex survey design.

Here is how the model looks when drawn and fit in the SEM Builder.

Although our examples above focus on CFA models, support for complex survey data is available for all models fit by gsem, including one-level and multilevel path models, structural equation models, growth curve models, and more.

Tell me more

Learn more about fitting models with survey data in Structural Equation Modeling Reference Manual.

Products

New in Stata 19

Why Stata

All features

Disciplines

Stata/MP

StataNow

Order Stata

Purchase

Order Stata

Bookstore

Stata Press

Stata Journal

Gift Shop

Learn

Free webinars

NetCourses

Classroom and web training

Organizational training

Video tutorials

Third-party courses

Web resources

Teaching with Stata

Support

Training

Video tutorials

FAQs

Statalist: The Stata Forum

Resources

Technical support

Customer service

Alerts

Company

News and events

Customer service

Careers

We use cookies

We use cookies to ensure that we give you the best experience on our website—to enhance site navigation, to analyze usage, and to assist in our marketing efforts. By continuing to use our site, you consent to the storing of cookies on your device and agree to delivery of content, including web fonts and JavaScript, from third party web services.

Cookie Settings

Privacy policy

Last updated: 16 November 2022

StataCorp LLC (StataCorp) strives to provide our users with exceptional products and services. To do so, we must collect personal information from you. This information is necessary to conduct business with our existing and potential customers. We collect and use this information only where we may legally do so. This policy explains what personal information we collect, how we use it, and what rights you have to that information.

Required cookies

Advertising cookies

Required cookies

These cookies are essential for our website to function and do not store any personally identifiable information. These cookies cannot be disabled.
Advertising and performance cookies

This website uses cookies to provide you with a better user experience. A cookie is a small piece of data our website stores on a site visitor's hard drive and accesses each time you visit so we can improve your access to our site, better understand how you use our site, and serve you content that may be of interest to you. For instance, we store a cookie when you log in to our shopping cart so that we can maintain your shopping cart should you not complete checkout. These cookies do not directly store your personal information, but they do support the ability to uniquely identify your internet browser and device.

Please note: Clearing your browser cookies at any time will undo preferences saved here. The option selected here will apply only to the device you are currently using.

Accept Cookies


		Linearized
		Coefficient std. err. t P>\|t\| [95% conf. interval]

att1
MathAtt		1 (constrained)

att2
MathAtt		.2400526 .1255965 1.91 0.071 -.022824 .5029291

att3
MathAtt		-1.25051 .5295546 -2.36 0.029 -2.358881 -.1421398

att4
MathAtt		-1.25051 .5295546 -2.36 0.029 -2.358881 -.1421398

att5
MathAtt		.2316945 .1016915 2.28 0.034 .0188517 .4445373

/att1
cut1		-.6778102 .1532773 -.9986232 -.3569972
cut2		-.2084441 .1088449 -.436259 .0193708
cut3		.1496295 .1197963 -.101107 .4003661
cut4		.7769365 .157401 .4474924 1.106381

/att2
cut1		-.6381621 .1399729 -.9311287 -.3451955
cut2		-.145965 .1047595 -.3652291 .073299
cut3		.1947528 .1218147 -.0602083 .4497139
cut4		.6802471 .1277605 .4128414 .9476529

/att3
cut1		-.6336735 .1615899 -.9718851 -.2954619
cut2		-.0653397 .1245075 -.3259368 .1952574
cut3		.137493 .1379872 -.1513175 .4263035
cut4		1.002957 .2153345 .5522569 1.453657

/att4
cut1		-.5032873 .1089184 -.7312562 -.2753185
cut2		-.0219399 .1085907 -.2492229 .2053431
cut3		.3856427 .1092128 .1570576 .6142278
cut4		1.004283 .1639867 .6610554 1.347511

/att5
cut1		-.8509972 .1335788 -1.130581 -.5714135
cut2		-.3128209 .1239232 -.5721952 -.0534466
cut3		-.0318078 .107519 -.2568477 .1932322
cut4		.4341301 .105858 .2125667 .6556935

var(MathAtt)		.9269762 .4206126 .3586061 2.39618