Re: st: generate a tscs pseudo-population for mc experiment

From   [email protected] (William Gould, Stata)
To   [email protected]
Subject   Re: st: generate a tscs pseudo-population for mc experiment
Date   Mon, 07 Jun 2004 09:31:19 -0500

Vera Troeger <[email protected]> asked,

> I want to do a Monte Carlo experiment and need to generate a
> pseudo-population that has a panel structure (tscs).  how can I generate a
> random variable x_it with i cross-sections and t timeperiods?

Let's distinguish between two models,

     Y_it = X1_i*b1 + X2_t*b2 + X3_it*b3 + u_i + u_t + u_it        (1)


     Y_it = X1_i*b1 +           X3_it*b3 + u_i +       u_it        (2)

For most of the simulations I have done, (2) is good enough, so let me start
there and then move to (1).

Model 2

The basic outline for creating a model-2 dataset is to create a
cross-sectional dataset (one obs. per i), fill in X1_i and u_i, then -expand-
the dataset (so that there are, say, 10 obs. per i), and fill in the rest.

For instance, say we want to create a dataset of 50 panels (i=1, 2, ..., 50)
and 10 time periods (t=1, 2, ..., 10):

        . drop _all
        . set obs 50
        . gen i = _n 
        . gen x1 = uniform()
        . gen u_i = 2*invnorm(uniform())

        . expand 10 
        . sort i 
        . by i: gen t = _n 
        . gen x3 = uniform()
        . gen u_it = 3*invnorm(uniform())

        . gen y = x1*1 + x3*2 + u_i + u_it

There are lots of variations on the above; you may want to have multiple
x1 and/or x3 variables and you may want them correlated, but in all cases, 
the basic idea is the same.  Make a cross-sectional dataset, fill it in, 
and then add the time-series details.
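
As one sketch of such a variation, the commands below make x3 correlated
with x1 by giving x3 a loading on x1.  The .5 loading and the seed value
are arbitrary assumptions chosen only for illustration; everything else
follows the outline above.

        . drop _all
        . * assumption: any fixed seed will do; it only makes the run repeatable
        . set seed 12345
        . set obs 50
        . gen i = _n
        . gen x1 = uniform()
        . gen u_i = 2*invnorm(uniform())
        . expand 10
        . sort i
        . by i: gen t = _n
        . * assumption: the .5 loading on x1 is arbitrary; it induces the correlation
        . gen x3 = .5*x1 + uniform()
        . gen u_it = 3*invnorm(uniform())
        . gen y = x1*1 + x3*2 + u_i + u_it
        . corr x1 x3

The final -corr- is just a check that the two regressors came out
correlated in the generated data.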

Model 1

Simulating the full model is just a little more difficult than simulating 
model 2.

The way to proceed is, prior to making the cross-sectional dataset, to make
a time-series dataset.  Then follow the outline for model 2.  At
the end, -merge- in the time-series dataset you previously constructed.

Here's how to make the time-series dataset:

        . drop _all
        . set obs 10
        . gen t = _n

Now we can generate X2_t variables and the u_t variable.
Often, you will want to make X2_t follow a process, such as 

        X2_t = constant + alpha*X2_(t-1) + noise 

or perhaps X2_t is a function of t, as well.  Anyway, 

        . gen x2 = . 
        . replace x2 = 1 in 1
        . replace x2 = 4 + .2*x2[_n-1] + 2*invnorm(uniform()) in 2/l

Sometimes a simple u_t is all that is necessary

        . gen u_t = invnorm(uniform())

and sometimes you will want to put a process on that, too.  Anyway, make the
x2 and u_t variables.  Once you have the time-series dataset, sort it by t and
save it:

        . sort t
        . save ts, replace 

Now make the cross-sectional dataset,

        . drop _all
        . set obs 50
        . gen i = _n 
        . gen x1 = uniform()
        . gen u_i = 2*invnorm(uniform())

use -expand- to convert the cross-sectional dataset into a panel, and 
generate t, 

        . expand 10 
        . sort i 
        . by i: gen t = _n 

and now, here is the new part:  merge in ts.dta previously created:

        . sort t 
        . merge t using ts 
        . sort i t 
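
In Stata 11 and later, the same match-merge is usually written with the
m:1 form of -merge-, which does not require either dataset to be sorted
first.  A sketch, assuming the expanded panel is in memory and ts.dta was
saved as above:

        . merge m:1 t using ts
        . * every t in the panel should find its match in ts.dta
        . assert _merge == 3
        . drop _merge
        . sort i t

The -assert- is an optional safeguard; it stops with an error if any
observation failed to match.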

Now you can create y and do whatever else you need.  For instance, perhaps 
you want unbalanced panels.  Then drop some of the observations.  
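
As a sketch of those last steps, assuming the merged panel is in memory
with x1, u_i, x2, and u_t already filled in.  The coefficient of .5 on x2
and the 20% drop rate are arbitrary assumptions; the other values are
carried over from the model-2 example.

        . gen x3 = uniform()
        . gen u_it = 3*invnorm(uniform())
        . * assumption: b2 = .5 is illustrative only
        . gen y = x1*1 + x2*.5 + x3*2 + u_i + u_t + u_it
        . * assumption: drop about 20% of obs. at random to unbalance the panels
        . drop if uniform() < .2

Dropping at random leaves panels of different lengths; if you instead want
a particular pattern of missing periods, drop on a condition involving i
and t.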

-- Bill
[email protected]