Re: st: generate a tscs pseudo-population for mc experiment

From: [email protected] (William Gould, Stata)
To: [email protected]
Subject: Re: st: generate a tscs pseudo-population for mc experiment
Date: Mon, 07 Jun 2004 09:31:19 -0500

```
Vera Troeger <[email protected]> asked,

> I want to do a Monte Carlo experiment and need to generate a
> pseudo-population that has a panel structure (tscs).  how can I generate a
> random variable x_it with i cross-sections and t timeperiods?

Let's distinguish between two models,

Y_it = X1_i*b1 + X2_t*b2 + X3_it*b3 + u_i + u_t + u_it        (1)

and

Y_it = X1_i*b1 +           X3_it*b3 + u_i +       u_it        (2)

For most of the simulations I have done, (2) is good enough, so let me start
there and then move to (1).

Model 2
-------

The basic outline for creating a model-2 dataset is to create a
cross-sectional dataset (one obs. per i), fill in X1_i and u_i, then -expand-
the dataset (so that there are, say, 5*i obs.), and fill in the rest.

For instance, say we want to create a dataset of 50 panels (i=1, 2, ..., 50)
and 10 time periods (t=1, 2, ..., 10):

. drop _all
. set obs 50
. gen i = _n
. gen x1 = uniform()
. gen u_i = 2*invnorm(uniform())

. expand 10
. sort i
. by i: gen t = _n
. gen x3 = uniform()
. gen u_it = 3*invnorm(uniform())

. gen y = x1*1 + x3*2 + u_i + u_it
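
A quick sanity check on the simulated data, as a sketch only (the choice of a
random-effects fit is an assumption, not part of the outline above):

. tsset i t
. xtreg y x1 x3, re

The estimates should scatter around the true values (1 and 2 for the
coefficients, 2 for sigma_u, 3 for sigma_e), although with only 50 panels the
coefficient on x1 will be imprecise.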

There are lots of variations on the above; you may want to have multiple
x1 and/or x3 variables and you may want them correlated, but in all cases,
the basic idea is the same.  Make a cross-sectional dataset, fill it in,
and then add the time-series details.
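
As a hedged illustration of the correlated case (the variable name x3c and
the .5 loading are assumptions made up for this sketch), a time-varying
regressor can be built from x1 plus fresh noise once the data are expanded:

. gen x3c = .5*x1 + uniform()
. corr x1 x3c

By construction the population correlation is .5/sqrt(1.25), about .45.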

Model 1
-------

Simulating the full model is just a little more difficult than simulating
model 2.

The way to proceed is, prior to making the cross-sectional dataset, make
a time-series dataset.  Then follow the outline for model (2).  At
the end, -merge- the time-series dataset you previously constructed.

Here's how to make the time-series dataset:

. drop _all
. set obs 10
. gen t = _n

Now we can generate X2_t variables and the u_t variable.
Often, you will want to make X2_t follow a process, such as

X2_t = constant + alpha*X2_t-1 + noise

or perhaps X2_t is a function of t, as well.  Anyway,

. gen x2 = .
. replace x2 = 1 in 1
. replace x2 = 4 + .2*x2[_n-1] + 2*invnorm(uniform()) in 2/l

Sometimes a simple u_t is all that is necessary

. gen u_t = invnorm(uniform())

and sometimes you will want to put a process on that, too.  Anyway, make the
x2 and u_t variables.  Once you have the time-series dataset, sort it by t and
save it:

. sort t
. save ts, replace
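
If u_t should follow a process as well, one possible sketch is an AR(1)
error (the .5 coefficient is an assumption chosen for illustration).  These
two lines would take the place of the simple -gen u_t- line and must be run
before the -save-:

. gen u_t = invnorm(uniform()) in 1
. replace u_t = .5*u_t[_n-1] + invnorm(uniform()) in 2/l

-replace- works through the observations in order, so each u_t[_n-1] is
already filled in when it is needed.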

Now make the cross-sectional dataset,

. drop _all
. set obs 50
. gen i = _n
. gen x1 = uniform()
. gen u_i = 2*invnorm(uniform())

use -expand- to convert the cross-sectional dataset into a panel, and
generate t,

. expand 10
. sort i
. by i: gen t = _n

and now, here is the new part:  merge in ts.dta previously created:

. sort t
. merge t using ts
. sort i t
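
-merge- leaves behind a variable named _merge, and it is worth confirming
that every observation matched before going on.  A minimal check, offered as
a sketch:

. tab _merge
. drop _merge

Every observation should show _merge==3.  In Stata 11 and later the merge
itself is usually written -merge m:1 t using ts-.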

Now you can create y and do whatever else you need.  For instance, perhaps
you want unbalanced panels.  Then drop some of the observations.
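
As a sketch of those last steps (the coefficient 1.5 on x2 and the 20% drop
rate are assumptions picked for illustration; x3 and u_it are generated as
in model 2):

. gen x3 = uniform()
. gen u_it = 3*invnorm(uniform())
. gen y = x1*1 + x2*1.5 + x3*2 + u_i + u_t + u_it
. drop if uniform() < .2

The final -drop- removes about one observation in five at random, which
leaves the panels unbalanced.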

-- Bill
[email protected]
```