Cross-sectional time-series regression

Stata fits fixed-effects (within), between-effects, and random-effects (mixed) models on balanced and unbalanced data. We use the notation

    y[i,t] = X[i,t]*b + u[i] + v[i,t]

That is, u[i] is the fixed or random effect, and v[i,t] is the pure residual.

xtreg is Stata's cross-sectional time-series regression command. xtreg, fe estimates the parameters of fixed-effects models:

 . webuse nlswork
 (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

 . generate age2 = age^2
 (24 missing values generated)

 . generate ttl_exp2 = ttl_exp^2

 . generate tenure2 = tenure^2
 (433 missing values generated)

 . generate black = race==2

 . xtreg ln_w grade age* ttl_exp* tenure* black not_smsa south, fe i(idcode)
 
 Fixed-effects (within) regression               Number of obs      =     28091
 Group variable (i): idcode                      Number of groups   =      4697
 
 R-sq:  within  = 0.1727                         Obs per group: min =         1
        between = 0.3505                                        avg =       6.0
        overall = 0.2625                                        max =        15

                                                 F(8,23386)         =    610.12
 corr(u_i, Xb)  = 0.1936                         Prob > F           =    0.0000
   
 ------------------------------------------------------------------------------
      ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
        grade |  (dropped)
          age |   .0359987   .0033864    10.63   0.000     .0293611    .0426362
         age2 |   -.000723   .0000533   -13.58   0.000    -.0008274   -.0006186
      ttl_exp |   .0334668   .0029653    11.29   0.000     .0276545     .039279
     ttl_exp2 |   .0002163   .0001277     1.69   0.090    -.0000341    .0004666
       tenure |   .0357539   .0018487    19.34   0.000     .0321303    .0393775
      tenure2 |  -.0019701    .000125   -15.76   0.000    -.0022151   -.0017251
        black |  (dropped)
     not_smsa |  -.0890108   .0095316    -9.34   0.000    -.1076933   -.0703282
        south |  -.0606309   .0109319    -5.55   0.000    -.0820582   -.0392036
        _cons |    1.03732   .0485546    21.36   0.000     .9421497     1.13249
 -------------+----------------------------------------------------------------
      sigma_u |  .35562203
      sigma_e |  .29068923
          rho |  .59946283   (fraction of variance due to u_i)
 ------------------------------------------------------------------------------
 F test that all u_i=0:     F(4696, 23386) =     5.13         Prob > F = 0.0000

The syntax of all estimation commands is the same: the name of the dependent variable is followed by the names of the independent variables.

In this case, the dependent variable, ln_w (log of wage), was modeled as a function of a number of explanatory variables. Note that grade and black were dropped from the model because they do not vary within person.
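One way to check for this before fitting is to look at the standard deviation of each regressor within person. The following is a minimal sketch, not part of the original session; sd_grade and sd_black are hypothetical variable names, and black is the indicator generated above.

```stata
* Within-person standard deviation of each regressor
* (missing for women observed only once).
bysort idcode: egen sd_grade = sd(grade)
bysort idcode: egen sd_black = sd(black)

* A maximum of zero means the variable never changes within a person,
* so the within transformation removes it entirely.
summarize sd_grade sd_black
```

If the maximum reported for both variables is zero, neither varies within person, which is consistent with xtreg, fe dropping them.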

Our dataset contains 28,091 “observations”, which are 4,697 people, each observed, on average, in 6.0 different years. An observation in our data is a person in a given year. The dataset contains the variable idcode, which identifies the persons; it is the i index in X[i,t]. Before fitting the model, we typed iis idcode to tell Stata this. Told once, Stata remembers.
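In Stata 10, the xtset command serves the same purpose as iis and can also record the time variable. A minimal sketch follows, assuming that year is the survey-year variable in nlswork; it is not part of the original session.

```stata
* Declare the panel identifier once; later xt commands pick it up.
xtset idcode

* Optionally declare the time variable as well
* (year is assumed to be the survey-year variable in nlswork).
xtset idcode year

* With the panel declared, the i() option can be omitted from xtreg.
xtreg ln_w grade age* ttl_exp* tenure* black not_smsa south, fe
```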

To fit the corresponding random-effects model, we use the same command but change the fe option to re.

 . xtreg ln_w grade age* ttl_exp* tenure* black not_smsa south, re 
 
 Random-effects GLS regression                   Number of obs      =     28091
 Group variable (i): idcode                      Number of groups   =      4697
 
 R-sq:  within  = 0.1715                         Obs per group: min =         1
        between = 0.4784                                        avg =       6.0
        overall = 0.3708                                        max =        15

 Random effects u_i ~ Gaussian                   Wald chi2(10)      =   9244.87
 corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000
 
 ------------------------------------------------------------------------------
      ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
        grade |   .0646499   .0017811    36.30   0.000     .0611589    .0681408
          age |    .036806   .0031195    11.80   0.000     .0306918    .0429201
         age2 |  -.0007133     .00005   -14.27   0.000    -.0008113   -.0006153
      ttl_exp |   .0290207   .0024219    11.98   0.000     .0242737    .0337676
     ttl_exp2 |   .0003049   .0001162     2.62   0.009      .000077    .0005327
       tenure |    .039252   .0017555    22.36   0.000     .0358114    .0426927
      tenure2 |  -.0020035   .0001193   -16.80   0.000    -.0022373   -.0017697
        black |  -.0530532   .0099924    -5.31   0.000    -.0726379   -.0334685
     not_smsa |  -.1308263   .0071751   -18.23   0.000    -.1448891   -.1167634
        south |  -.0868927   .0073031   -11.90   0.000    -.1012066   -.0725788
        _cons |   .2387209   .0494688     4.83   0.000     .1417639     .335678
 -------------+----------------------------------------------------------------
      sigma_u |  .25790313
      sigma_e |  .29069544
          rho |  .44043812   (fraction of variance due to u_i)
 ------------------------------------------------------------------------------
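The rho line in the output is the share of the total error variance attributed to u_i, that is, sigma_u^2 / (sigma_u^2 + sigma_e^2). As a quick check, the sketch below recomputes it from the printed values and from the results xtreg leaves behind; it assumes the random-effects model above is still the most recent estimation.

```stata
* rho = sigma_u^2 / (sigma_u^2 + sigma_e^2), using the values printed above
display .25790313^2 / (.25790313^2 + .29069544^2)

* The same quantity from the stored results of the last xtreg
display e(sigma_u)^2 / (e(sigma_u)^2 + e(sigma_e)^2)
display e(rho)
```

All three lines should agree with the reported rho of about .4404.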

We can also perform the Hausman specification test, which compares the consistent fixed-effects model with the efficient random-effects model. To do that, we must first store the results from our random-effects model, refit the fixed-effects model to make those results current, and then perform the test.

 . estimates store random_effects
	
 . quietly xtreg ln_w grade age* ttl_exp* tenure* black not_smsa south, fe 
    
 . hausman . random_effects

                  ---- Coefficients ----
              |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
              |       .       random_eff~s    Difference          S.E.
 -------------+----------------------------------------------------------------
          age |    .0359987      .036806       -.0008073        .0013177
         age2 |    -.000723    -.0007133       -9.68e-06        .0000184
      ttl_exp |    .0334668     .0290207        .0044461         .001711
     ttl_exp2 |    .0002163     .0003049       -.0000886         .000053
       tenure |    .0357539      .039252       -.0034981        .0005797
      tenure2 |   -.0019701    -.0020035        .0000334        .0000373
     not_smsa |   -.0890108    -.1308263        .0418155        .0062745
        south |   -.0606309    -.0868927        .0262618        .0081346
 ------------------------------------------------------------------------------
                            b = consistent under Ho and Ha; obtained from xtreg
             B = inconsistent under Ha, efficient under Ho; obtained from xtreg

     Test:  Ho:  difference in coefficients not systematic

            chi2(8) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                    =      149.44
          Prob>chi2 =      0.0000
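With a p-value this small, the test rejects the hypothesis that the difference in coefficients is not systematic, which favors the fixed-effects estimates here. The sketch below, which is not part of the original session, shows one way to recover the p-value from the printed statistic and to read the same quantities from the results that hausman stores.

```stata
* Upper-tail probability of a chi-squared(8) at the reported statistic
display chi2tail(8, 149.44)

* After hausman, the same quantities are available as stored results
display r(chi2)
display r(df)
display r(p)
```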

In addition, Stata can perform the Breusch and Pagan Lagrange multiplier (LM) test for random effects and can calculate various predictions, including the random effect, based on the estimates.
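A minimal sketch of both steps follows. It assumes the panel variable has already been declared as above; xb_hat, u_hat, and xbu_hat are hypothetical variable names chosen for illustration.

```stata
* Refit the random-effects model so that its results are current
quietly xtreg ln_w grade age* ttl_exp* tenure* black not_smsa south, re

* Breusch and Pagan LM test for random effects
xttest0

* Linear prediction X*b
predict xb_hat, xb

* Estimated random effect u[i]
predict u_hat, u

* Prediction including the random effect, X*b + u[i]
predict xbu_hat, xbu
```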

Just as important as Stata's ability to fit statistical models with cross-sectional time-series data is its ability to provide meaningful summary statistics.

xtsum reports means and standard deviations in a meaningful way:

 . xtsum hours
	
 Variable         |      Mean   Std. Dev.       Min        Max |    Observations
 -----------------+--------------------------------------------+----------------
 hours    overall |  36.55956   9.869623          1        168 |     N =   28467
          between |             7.846585          1       83.5 |     n =    4710
          within  |             7.520712  -2.154726   130.0596 | T-bar = 6.04395

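The negative minimum for hours within is not a mistake. The within row summarizes each woman's deviation from her own mean hours, recentered on the global mean 36.55956; that is, hours[i,t] - mean(hours[i]) + 36.55956. A woman who in some year works far fewer hours than her own average can therefore fall below zero.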
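The within row can be rebuilt by hand, which makes the negative minimum easier to see. This is a sketch under the definition just described, not part of the original session; hours_pmean and hours_within are hypothetical variable names.

```stata
* Person-specific mean of hours
bysort idcode: egen hours_pmean = mean(hours)

* Global mean, left in r(mean) by summarize
quietly summarize hours

* Deviation from the person mean, recentered on the global mean
generate hours_within = hours - hours_pmean + r(mean)

* Min and max should match the within row reported by xtsum
summarize hours_within
```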

xttab does the same for one-way tabulations:

 . xttab msp
	
                   Overall             Between            Within
       msp |    Freq.  Percent      Freq.  Percent        Percent
 ----------+-----------------------------------------------------
         0 |   11324     39.71      3113     66.08          55.06
         1 |   17194     60.29      3643     77.33          71.90
 ----------+-----------------------------------------------------
     Total |   28518    100.00      6756    143.41          64.14
                               (n = 4711)

msp is a variable that takes on the value 1 if the surveyed woman is married and the spouse is present in the household. Overall, some 60% of our person-year observations are msp. Taking women individually, 77% of the women are at some point msp, and 66% are at some point not msp; the two percentages sum to more than 100 because some women are msp in some years and not in others. Taking women one at a time, if a woman is ever msp, 72% of her observations are msp observations. If a woman is ever not msp, 55% of her observations are not msp. (If marital status never varied in our data, the within percentages would all be 100.)

xttrans reports the transition matrix:

 . xttrans msp

        1 if| 1 if married, spouse present
    married,|
      spouse|
     present|         0          1 |     Total
 -----------+----------------------+----------
          0 |     80.49      19.51 |    100.00
          1 |      7.96      92.04 |    100.00
 -----------+----------------------+----------
       Total|     37.11      62.89 |    100.00
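Each row of the matrix gives, for women in a given state in one observation, the percentage found in each state in the next. The sketch below shows how the underlying counts might be requested; it assumes year is the time variable in nlswork and is not part of the original session.

```stata
* xttrans follows each woman from one observation to the next, so the
* time variable must be declared (year is assumed here).
xtset idcode year

* freq adds the transition counts to the row percentages
xttrans msp, freq
```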

See New in Stata 10 for more about what was added in Stata Release 10.

© Copyright 1996–2008 StataCorp LP