Statalist The Stata Listserver


Re: st: extract rows from panel dataset

From   "Tom Boonen" <[email protected]>
To   [email protected]
Subject   Re: st: extract rows from panel dataset
Date   Sun, 10 Sep 2006 19:21:59 -0400

Hi Kit Baum and Scott,

Thanks (again) for your input; it helps me a lot in understanding Stata
programming. In my case, the utility I am trying to write is in
fact just a preparation command to extract the right matrices from the
panel dataset. Each matrix corresponds to a different group of units
and different time periods. Your code suggestions are a great
help for this preparation tool.

I need these matrices in a second step to implement my estimator etc.
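
For the matrix step, one possible route from a flagged subsample to a Stata matrix is mkmat, which accepts an if qualifier. What follows is only a sketch on a made-up toy panel; the names id, year, x1, x2, insamp and X are assumptions for illustration and do not come from the quoted code.

```stata
* Toy panel (hypothetical data): 4 units observed 2001-2005
clear
set obs 4
gen id = _n
expand 5
bysort id: gen year = 2000 + _n
gen x1 = 10*id + (year - 2000)
gen x2 = x1^2
tsset id year

* Flag units 2 and 3 in the periods 2001, 2003 and 2005
gen byte insamp = inlist(id, 2, 3) & inlist(year, 2001, 2003, 2005)

* Copy the flagged rows of x1 and x2 into the Stata matrix X
mkmat x1 x2 if insamp, matrix(X)
matrix list X
```

Each combination of units and periods would get its own indicator and its own mkmat call, so the dataset in memory never has to be subset.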


On 9/10/06, Kit Baum <[email protected]> wrote:
Tom Boonen wants to write a utility that allows the user to specify a
set of units, time periods and variables, and produce a Stata matrix
of the resulting rows and columns of a panel-format dataset. Scott
presented a nice solution for that -- but how is Tom going to use it?
Stata doesn't perform statistical analysis on Stata matrices.

The unnecessary part here, it seems to me, is to subset the dataset
for a particular set of variables. This could be done with preserve
and restore, but that tends to be really slow on a big panel dataset.
Here is my take on the solution:

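program drop _all

program foo2, rclass
   version 9.2
   // the varlist is commented out: this version only flags observations
   syntax /* varlist(ts min=2 numeric) */, Timex(numlist >=0 integer) ///
      Units(numlist integer) Gen(string)
   // recover the time and panel variables from the tsset settings
   qui tsset
   local tvar "`r(timevar)'"
   local pvar "`r(panelvar)'"
   // turn the space-separated numlists into comma-separated lists for inlist()
   local tx: subinstr local timex " " ",", all
   local u: subinstr local units " " ",", all
   // indicator: 1 for the requested units in the requested periods, else 0
   gen `gen' = ( inlist(`tvar',`tx') & inlist(`pvar', `u'))
end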

tsset county year

foo2, timex(83(2)87) units(7 9 11 13 19 21) gen(mysamp1)
xtdes if mysamp1

reg avgsen polpc density taxpc if mysamp1

This approach does nothing with variables, but allows you to carry
out any number of analyses on the designated subsample of units and
time periods just by appending the if condition that identifies those
observations. No need to fool around with the variables. If Tom
really wants to create a subset dataset, then reinstate the variable
list (per Scott's code) and have the routine

keep `varlist' `pvar' `tvar'   // only those variables, plus the panel identifiers
keep if `gen'
save newdatasetname

after the gen `gen' statement.
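
The subsetting variant described above might look roughly like the sketch below. The program name foo3, the Saving() option, the toy data and the file name mysubset are all hypothetical, chosen only for illustration; this is not Scott's code. Note that the routine changes the data in memory and overwrites any existing file of the same name.

```stata
* Sketch of a subsetting variant; foo3 and its option names are hypothetical
capture program drop foo3
program foo3
   version 9.2
   syntax varlist(numeric), Timex(numlist >=0 integer) ///
      Units(numlist integer) Saving(string)
   qui tsset
   local tvar "`r(timevar)'"
   local pvar "`r(panelvar)'"
   local tx: subinstr local timex " " ",", all
   local u: subinstr local units " " ",", all
   * Caution: both keep commands alter the data in memory
   keep if inlist(`tvar', `tx') & inlist(`pvar', `u')
   keep `pvar' `tvar' `varlist'
   save `saving', replace
end

* Toy panel (hypothetical data): 4 units observed 2001-2005
clear
set obs 4
gen id = _n
expand 5
bysort id: gen year = 2000 + _n
gen x1 = 10*id + (year - 2000)
gen x2 = -x1
tsset id year

* Keep x1 for units 2 and 3 in 2001, 2003 and 2005
foo3 x1, timex(2001(2)2005) units(2 3) saving(mysubset)
describe
```

Because inlist() takes only a limited number of arguments, very long lists of units or periods would need a different way of building the condition.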

Kit Baum, Boston College Economics
An Introduction to Modern Econometrics Using Stata:

