In the spotlight: So many fixed effects, so little time

Do you work with panel data or cross-sectional data and need to control for high-dimensional categorical variables? Stata 19 introduced powerful enhancements to its regression capabilities that make it easier and much faster to fit models with high-dimensional fixed effects (HDFE).

The absorb() option is now available in xtreg, fe and ivregress 2sls, letting you efficiently absorb multiple categorical variables at the same time. This greatly expands what’s possible with fixed-effects modeling. In areg, absorb() has also been enhanced to handle multiple categorical variables. And because absorbed effects are omitted from the output, you avoid results you don’t need while still getting fast and efficient estimation.
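
As a rough sketch of the ivregress 2sls case, suppose a hypothetical dataset with outcome y, exogenous regressor x1, endogenous regressor x2, instrument z1, and categorical variables firm and year. None of these names come from the example that follows; they are assumptions made purely for illustration. Absorbing both sets of effects might then look like this:

. ivregress 2sls y x1 (x2 = z1), absorb(firm year)

The absorbed firm and year effects would be controlled for but not reported, just as described above.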

An example using individual-level panel data

Let’s walk through a concrete example using individual-level panel data from the IPUMS Current Population Survey (CPS), available at https://cps.ipums.org/cps/. The dataset contains repeated observations of individuals over time, with information on each individual's occupation and industry at each time period.

We begin by loading and describing the dataset.

. use ipums
(IPUMS CPS extract, 2015 through 2024)

. describe

Contains data from ipums.dta
 Observations:       170,434                  IPUMS CPS extract, 2015 through 2024
    Variables:            21                  4 Sep 2025 12:57



Variable      Storage   Display    Value                                                            
    name         type    format    label      Variable label                                        

id              float   %9.0g                 Individual identification                             
year            int     %8.0g                 Survey year                                           
cpsid           double  %12.0g                Household record                                      
asecflag        byte    %8.0g      ASECFLAG   Annual Social and Economic Supplement                 
region          byte    %8.0g      REGION     Region and division                                   
statecensus     byte    %8.0g      STATECENSUS                                                      
                                              State census code                                     
cpsidp          double  %12.0g                Person record                                         
age             byte    %8.0g                 Age                                                   
sex             byte    %8.0g      SEX        Sex                                                   
race            int     %8.0g      RACE       Race                                                  
marst           byte    %23.0g     MARST      Marital status                                        
vetstat         byte    %8.0g      VETSTAT    Veteran status                                        
empstat         byte    %30.0g     EMPSTAT    Employment status                                     
occ             int     %8.0g                 Occupation                                            
ind             int     %8.0g                 Industry                                              
uhrsworkt       int     %8.0g                 Hours usually worked per week at all jobs             
educ            int     %8.0g      EDUC       Educational attainment recode                         
wksunem2        byte    %8.0g                 Weeks unemployed last year, intervaled                
incwage         long    %12.0g                Wage and salary income                                
health          byte    %9.0g      HEALTH     Health status                                         
lnwage          float   %9.0g                 Natural log of wage and salary income                 

Sorted by: id  year

We aim to model the log of wages (lnwage) as a function of age, employment status, marital status, hours worked, weeks unemployed, and health status. We also want to control for individual, year, occupation, and industry effects.
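
In equation form, the specification we are building toward can be sketched as follows. The notation is an assumption introduced here for illustration and does not come from the Stata documentation: x_{it} collects the regressors listed above, \alpha_i is the individual effect, \gamma_t the year effect, and \delta and \lambda the effects of the occupation o(i,t) and industry k(i,t) that individual i holds in year t.

\[
\text{lnwage}_{it} = x_{it}\beta + \alpha_i + \gamma_t + \delta_{o(i,t)} + \lambda_{k(i,t)} + \varepsilon_{it}
\]

The steps below add these effects one group at a time, starting from a model that contains none of them.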

Step 1: Start simple—no fixed effects

First, we establish a baseline by fitting an ordinary least-squares regression without any fixed effects. This model ignores unobserved characteristics that are specific to individuals, years, occupations, and industries.

. regress lnwage c.age##c.age uhrsworkt wksunem2 i.marst i.empstat i.health
(Output omitted)

We use estimates store to store our estimation results so that we can use them later.

. estimates store OLS

Step 2: One-way fixed effects with xtreg, fe

Next, we improve our model by including individual fixed effects using xtreg, fe. These individual fixed effects control for all time-invariant factors, observed or unobserved, such as innate ability or motivation. As a prerequisite, we first declare the panel structure, an essential step before using xt commands like xtreg.

. xtset id year


Panel variable: id (unbalanced)
 Time variable: year, 2015 to 2024, but with gaps
         Delta: 1 unit

. xtreg lnwage c.age##c.age uhrsworkt wksunem2 i.marst i.empstat i.health, fe nolog

Fixed-effects (within) regression               Number of obs     =     80,680
Group variable: id                              Number of groups  =     19,021

R-squared:                                      Obs per group:
     Within  = 0.3634                                         min =          1
     Between = 0.3598                                         avg =        4.2
     Overall = 0.3623                                         max =         10

                                                F(20, 61639)      =    1759.30
corr(u_i, Xb) = -0.0069                         Prob > F          =     0.0000



                         lnwage   Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
   
                            age     .1157152   .0016753    69.07   0.000     .1124316    .1189989
                                 
                    c.age#c.age    -.0011639   .0000182   -63.88   0.000    -.0011996   -.0011282
                                 
                      uhrsworkt    -.0000866   .0000179    -4.83   0.000    -.0001216   -.0000515
                       wksunem2     .1025737   .0012164    84.33   0.000     .1001895    .1049578
                                 
                          marst  
        Married, spouse absent       -.19975    .031977    -6.25   0.000    -.2624251    -.137075
                     Separated     -.3952554   .0276942   -14.27   0.000    -.4495361   -.3409746
                      Divorced     -.2076364   .0131721   -15.76   0.000    -.2334538    -.181819
                       Widowed     -.3332016   .0287973   -11.57   0.000    -.3896444   -.2767588
          Never married/single     -.2313562   .0104761   -22.08   0.000    -.2518894   -.2108231
                                 
                        empstat  
                       At work     -.3255529   .0495945    -6.56   0.000    -.4227582   -.2283477
Has job, not at work last week     -.3600129   .0541405    -6.65   0.000    -.4661284   -.2538974
Unemployed, experienced worker     -.6734508   .0517662   -13.01   0.000    -.7749126    -.571989
        Unemployed, new worker     -1.190159   .2537126    -4.69   0.000    -1.687436   -.6928814
          NILF, unable to work     -.9811457    .074068   -13.25   0.000    -1.126319   -.8359722
                   NILF, other     -1.125918   .0505666   -22.27   0.000    -1.225029   -1.026807
                 NILF, retired     -.6796395   .0570466   -11.91   0.000    -.7914509   -.5678281
                                 
                         health  
                     Very good     -.0600062   .0092826    -6.46   0.000    -.0782002   -.0418122
                          Good     -.1977527    .010219   -19.35   0.000    -.2177819   -.1777235
                          Fair     -.3452917   .0176644   -19.55   0.000    -.3799139   -.3106695
                          Poor     -.3686544   .0400975    -9.19   0.000    -.4472456   -.2900632
                                 
                          _cons     7.719378   .0613174   125.89   0.000     7.599196     7.83956

                        sigma_u    .52868441                                                     
                        sigma_e    .93476059                                                     
                            rho    .24235753   (fraction of variance due to u_i)                 

F test that all u_i=0: F(19020, 61639) = 1.01                Prob > F = 0.2222

. estimates store Oneway

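The joint F test at the bottom of the output (Prob > F = 0.2222) does not reject the hypothesis that all individual effects are zero, so at this stage we do not find evidence that we need to control for individual fixed effects.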

Step 3: Two-way fixed effects with absorb()

We extend the model by adding year fixed effects through the absorb() option, capturing time-specific unobservables, such as macroeconomic conditions and policy changes. This yields a two-way fixed-effects specification.

. xtreg lnwage c.age##c.age uhrsworkt wksunem2 i.marst i.empstat i.health, fe absorb(year) nolog
Alternating projection maximum absolute difference = 8.707e-09 

Fixed-effects (within) regression               Number of obs     =     80,680
Group variable: id                              Number of groups  =     19,021

R-squared:                                      Obs per group:
     Within  = 0.3761                                         min =          1
     Between = 0.3595                                         avg =        4.2
     Overall = 0.3622                                         max =         10

                                                F(20, 61630)      =    1792.10
corr(u_i, Xb) = -0.0077                         Prob > F          =     0.0000



Absorbed variable   Levels
   
             year       10




                         lnwage   Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
   
                            age     .1155189   .0016591    69.63   0.000     .1122671    .1187707
                                 
                    c.age#c.age    -.0011632    .000018   -64.47   0.000    -.0011985   -.0011278
                                 
                      uhrsworkt      -.00008   .0000177    -4.51   0.000    -.0001148   -.0000453
                       wksunem2     .1025185   .0012062    85.00   0.000     .1001545    .1048826
                                 
                          marst  
        Married, spouse absent     -.2096018   .0316645    -6.62   0.000    -.2716643   -.1475393
                     Separated     -.3875132    .027422   -14.13   0.000    -.4412603    -.333766
                      Divorced     -.2020962   .0130424   -15.50   0.000    -.2276593   -.1765331
                       Widowed     -.3300978   .0285125   -11.58   0.000    -.3859824   -.2742133
          Never married/single     -.2390184   .0103748   -23.04   0.000    -.2593532   -.2186837
                                 
                        empstat  
                       At work     -.3119542   .0491056    -6.35   0.000    -.4082012   -.2157072
Has job, not at work last week     -.3477729   .0536078    -6.49   0.000    -.4528442   -.2427016
Unemployed, experienced worker     -.6544863   .0512603   -12.77   0.000    -.7549566   -.5540161
        Unemployed, new worker     -1.141233   .2512077    -4.54   0.000      -1.6336   -.6488649
          NILF, unable to work     -.9626481    .073338   -13.13   0.000    -1.106391   -.8189054
                   NILF, other     -1.114173   .0500685   -22.25   0.000    -1.212307   -1.016039
                 NILF, retired         -.662        .056
                                 
                         health  
                     Very good     -.0698841   .0091966    -7.60   0.000    -.0879094   -.0518587
                          Good     -.2106014    .010126   -20.80   0.000    -.2304485   -.1907544
                          Fair     -.3599402   .0174966   -20.57   0.000    -.3942335   -.3256469
                          Poor     -.3807171   .0397028    -9.59   0.000    -.4585347   -.3028994
                                 
                          _cons     7.721496   .0607112   127.18   0.000     7.602502    7.840491

                        sigma_u    .52355994                                                     
                        sigma_e    .92547327                                                     
                            rho    .24244753   (fraction of variance due to u_i)                 

F test that all u_i=0: F(19020, 61630) = 1.10                Prob > F = 0.0000

. estimates store Twoway

Here individual-level fixed effects are captured automatically by the fe option (because xtset declared id as the panel dimension), while the absorb(year) option efficiently controls for all time effects. After including year fixed effects, the coefficients on the variables of interest do not change much, but we now have evidence that we should control for the individual fixed effects based on the joint F test at the bottom of the output.

Step 4: Go full HDFE

We also want to control for occupation and industry effects, so we add these variables in the absorb() option. If we were to add them as regressors, we would have more than 900 additional parameters to estimate. We do not care about those parameter estimates themselves, but we need to control for the effects of, for instance, technological shifts in a particular industry or the unique skill demands of a specific occupation, which would bias our estimates if they were omitted.
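
For comparison, the indicator-variable approach mentioned above could be sketched with ordinary factor-variable notation. This is shown only to illustrate what absorb() lets us avoid; it is not a command run in this article, and it would report a coefficient for every year, occupation, and industry level:

. xtreg lnwage c.age##c.age uhrsworkt wksunem2 i.marst i.empstat i.health i.year i.occ i.ind, fe

The coefficients on the variables of interest should be the same either way; the difference lies in the length of the output and the work needed to produce it.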

. xtreg lnwage c.age##c.age uhrsworkt wksunem2 i.marst i.empstat i.health, fe absorb(year occ ind) nolog
Alternating projection maximum absolute difference = 3.233e-09 

Fixed-effects (within) regression               Number of obs     =     80,680
Group variable: id                              Number of groups  =     19,021

R-squared:                                      Obs per group:
     Within  = 0.5089                                         min =          1
     Between = 0.3542                                         avg =        4.2
     Overall = 0.3564                                         max =         10

                                                F(20, 60721)      =     901.00
corr(u_i, Xb) = -0.0038                         Prob > F          =     0.0000




Absorbed variable   Levels
   
             year       10
              occ      631
              ind      280




                         lnwage   Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
   
                            age     .0868007   .0015374    56.46   0.000     .0837873    .0898141
                                 
                    c.age#c.age    -.0008721   .0000167   -52.36   0.000    -.0009047   -.0008394
                                 
                      uhrsworkt    -.0000613   .0000161    -3.82   0.000    -.0000928   -.0000298
                       wksunem2     .0861902   .0011058    77.94   0.000     .0840229    .0883576
                                 
                          marst  
        Married, spouse absent     -.0656064    .028564    -2.30   0.022    -.1215919   -.0096209
                     Separated     -.1792339   .0247726    -7.24   0.000    -.2277884   -.1306795
                      Divorced     -.0986945   .0118113    -8.36   0.000    -.1218447   -.0755443
                       Widowed     -.1634434   .0257243    -6.35   0.000    -.2138631   -.1130237
          Never married/single      -.108238   .0094786   -11.42   0.000    -.1268161   -.0896598
                                 
                        empstat  
                       At work     -.8150478   .0594023   -13.72   0.000    -.9314766   -.6986191
Has job, not at work last week     -.8254109   .0624995   -13.21   0.000      -.94791   -.7029117
Unemployed, experienced worker     -1.079656   .0608426   -17.75   0.000    -1.198907   -.9604038
        Unemployed, new worker     -1.555114   .2248653    -6.92   0.000     -1.99585   -1.114377
          NILF, unable to work     -1.200687   .0659515   -18.21   0.000    -1.329952   -1.071422
                   NILF, other     -1.373513   .0451674   -30.41   0.000    -1.462042   -1.284985
                 NILF, retired     -.8826988   .0509092   -17.34   0.000     -.982481   -.7829167
                                 
                         health  
                     Very good      -.025719    .008297    -3.10   0.002    -.0419811   -.0094569
                          Good     -.0958429   .0092073   -10.41   0.000    -.1138892   -.0777967
                          Fair     -.1874372   .0158488   -11.83   0.000    -.2185008   -.1563736
                          Poor     -.2139877   .0357415    -5.99   0.000    -.2840412   -.1439343
                                 
                          _cons     8.831137   .0669693   131.87   0.000     8.699876    8.962397

                        sigma_u    .47258634                                                     
                        sigma_e    .82718913                                                     
                            rho     .2460807   (fraction of variance due to u_i)                 

F test that all u_i=0: F(19020, 60721) = 2.23                Prob > F = 0.0000

. estimates store Fourway

This model now controls for fixed effects across four dimensions: individual, year, occupation, and industry.
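
Because absorb() in areg also accepts multiple categorical variables, a similar specification can presumably be written without xtset by absorbing id alongside the other three variables. The following is a sketch under that assumption, using the same dataset; it was not run for this article:

. areg lnwage c.age##c.age uhrsworkt wksunem2 i.marst i.empstat i.health, absorb(id year occ ind)

The coefficient estimates should agree with the xtreg, fe results above, although the summary statistics that areg reports (for example, its R-squared) are defined differently from the within, between, and overall values shown by xtreg.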

Compare the results

We can compare the results across these models to see how our estimates change with additional fixed effects. We begin by creating a table to compare the coefficients.

. etable, estimates(OLS Oneway Twoway Fourway) column(estimates)



                                            OLS    Oneway  Twoway Fourway

Age                                         0.115   0.116   0.116   0.087
                                          (0.001) (0.002) (0.002) (0.002)
Age # Age                                  -0.001  -0.001  -0.001  -0.001
                                          (0.000) (0.000) (0.000) (0.000)
Hours usually worked per week at all jobs  -0.000  -0.000  -0.000  -0.000
                                          (0.000) (0.000) (0.000) (0.000)
Weeks unemployed last year, intervaled      0.102   0.103   0.103   0.086
                                          (0.001) (0.001) (0.001) (0.001)
Marital status                                                           
  Married, spouse absent                   -0.168  -0.200  -0.210  -0.066
                                          (0.028) (0.032) (0.032) (0.029)
  Separated                                -0.413  -0.395  -0.388  -0.179
                                          (0.024) (0.028) (0.027) (0.025)
  Divorced                                 -0.202  -0.208  -0.202  -0.099
                                          (0.012) (0.013) (0.013) (0.012)
  Widowed                                  -0.317  -0.333  -0.330  -0.163
                                          (0.025) (0.029) (0.029) (0.026)
  Never married/single                     -0.222  -0.231  -0.239  -0.108
                                          (0.009) (0.010) (0.010) (0.009)
Employment status                                                        
  At work                                  -0.297  -0.326  -0.312  -0.815
                                          (0.043) (0.050) (0.049) (0.059)
  Has job, not at work last week           -0.361  -0.360  -0.348  -0.825
                                          (0.047) (0.054) (0.054) (0.062)
  Unemployed, experienced worker           -0.670  -0.673  -0.654  -1.080
                                          (0.045) (0.052) (0.051) (0.061)
  Unemployed, new worker                   -1.494  -1.190  -1.141  -1.555
                                          (0.219) (0.254) (0.251) (0.225)
  NILF, unable to work                     -0.935  -0.981  -0.963  -1.201
                                          (0.065) (0.074) (0.073) (0.066)
  NILF, other                              -1.086  -1.126  -1.114  -1.374
                                          (0.044) (0.051) (0.050) (0.045)
  NILF, retired                            -0.626  -0.680  -0.662  -0.883
                                          (0.050) (0.057) (0.056) (0.051)
Health status                                                            
  Very good                                -0.059  -0.060  -0.070  -0.026
                                          (0.008) (0.009) (0.009) (0.008)
  Good                                     -0.190  -0.198  -0.211  -0.096
                                          (0.009) (0.010) (0.010) (0.009)
  Fair                                     -0.354  -0.345  -0.360  -0.187
                                          (0.015) (0.018) (0.017) (0.016)
  Poor                                     -0.377  -0.369  -0.381  -0.214
                                          (0.035) (0.040) (0.040) (0.036)
Intercept                                   7.692   7.719   7.721   8.831
                                          (0.053) (0.061) (0.061) (0.067)
Number of observations                      80680   80680   80680   80680

Notably, even the two-way fixed-effects estimates differ considerably from the four-way fixed-effects results. These differences can be illustrated more clearly with a coefficient plot. We will use the community-contributed command coefplot (Jann [2014] and Jann [2015]), which we first install from the Statistical Software Components (SSC) Archive. Learn more about features produced by the Stata community at Community-contributed features.

. ssc install coefplot, replace

. coefplot Twoway Fourway, drop(_cons) xline(0) msymbol(D) mfcolor(white) msize(tiny) legend(order(2 "Two-way" 4 "Four-way"))

Many of the estimated coefficients vary considerably across models. This suggests that unobserved heterogeneity at the occupation and industry levels, beyond individual and time effects, can still introduce substantial bias if not properly accounted for.

But how much time do these models take to fit? On my machine, the two-way fixed-effects model ran in 0.3 seconds, and the full four-way specification ran in just 0.7 seconds. A key advantage of using the absorb() option for fixed effects is the substantial gain in computational efficiency. Instead of directly including thousands of indicator variables (in this case, over 19,000 individual effects plus 10 year, 631 occupation, and 280 industry levels), the absorb() option processes them internally.
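
One way to measure run times like these yourself is Stata's timer command. The sketch below times the four-way model; the timer number 1 is an arbitrary choice, and the reported time will depend on your machine:

. timer clear

. timer on 1

. quietly xtreg lnwage c.age##c.age uhrsworkt wksunem2 i.marst i.empstat i.health, fe absorb(year occ ind)

. timer off 1

. timer list 1

Replacing absorb(year occ ind) with absorb(year) would time the two-way model in the same way.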

Final thoughts

With the enhanced absorb() option, fitting HDFE models has never been easier. You get cleaner output, faster estimation, and the flexibility to control for multiple categorical variables all with one option for xtreg, areg, or ivregress 2sls.

References

Jann, B. 2014. Plotting regression coefficients and other estimates. Stata Journal 14(4): 708–737. https://doi.org/10.1177/1536867X1401400402.

Jann, B. 2015. Software updates: gr0059_1. Stata Journal 15(1): 324. https://doi.org/10.1177/1536867X1501500122.

Correia, S. 2016. A feasible estimator for linear models with multi-way fixed effects. Unpublished manuscript, Duke University, https://scorreia.com/research/hdfe.pdf.

IPUMS CPS, University of Minnesota, www.ipums.org.

— Chris Cheng
Senior Econometrician

— Bingsheng Zhang
Senior Mathematician and Statistician
