Statalist The Stata Listserver

RE: RE: st: RE: tsset

From   "Charles Thibault" <>
To   <>
Subject   RE: RE: st: RE: tsset
Date   Thu, 12 Oct 2006 17:07:37 -0400


I was reading the comments below and was having the same problem.

The problem occurs because you cannot specify more than one variable as the
panel identifier.  Typing 'help tsset' produces the following explanation:

Declare data to be time series and specify the time variable

        tsset [panelvar] timevar [, options]

INSTEAD, allowing tsset to be:

        tsset (varlist) timevar [, options]

would allow a panel identifier that is composed of more than one variable.

How can one address these types of development issues more directly to the
Stata team?

Thanks for reading,

Charles Thibault
-----Original Message-----
[] On Behalf Of n j cox
Sent: Sunday, May 21, 2006 12:28 PM
Subject: Re: RE: st: RE: tsset

Alexander Nervedi

Apologies for any confusion in the way I have been using terms. In my
mind there is no missing data. The data set clearly tells me that for
county = x, household = 1, year = 1, variable V1(x,h,t) = x111. The data
set does, however, have gaps such as

county  household  year  V1
x       1          1     12
x       1          2     13
x       1          4     12
x       1          7     12

So without any missing data, I define a unique household id using

egen uid = group(county household)

county  household  year  V1  uid
x       1          1     12  1
x       1          2     13  1
x       1          4     12  1
x       1          7     12  1

I need this so that I am able to tsset my data set.

tsset uid year

Once the data are tsset, I would like to fill in the gaps in the dataset
and have tsfill do that for me.

tsfill, full

However, tsfill creates missing values in the new observations, even though
I actually know what those values should be: for variables it is 0, and for
identifiers like county and household it has to be the same value within
uid. Thus, my data set looks like this:

county  household  year  V1  uid
x       1          1     12  1
x       1          2     13  1
.       .          3     .   1
x       1          4     12  1
.       .          5     .   1
.       .          6     .   1
x       1          7     12  1

The coding instructions tell me that V1 = 0 for the missing years.
I still need to fill in the missing values of the county and household
variables in the observations that tsfill created. Currently, I am using a
sequence of replace statements with leads and lags within uid to do this.
I was hoping there may be an automated way of doing this.

thanks for your response.

 >>> This appears to fall under the FAQ

How can I replace missing values with previous or following nonmissing
values or within sequences?

Note that

. search missing

would have pointed you to this directly.

However, applying the rules appears a little tricky
in your case as

. sort county household

will mess up your sort order. It would seem that

replace county = county[_n-1] if mi(county)
replace household = household[_n-1] if mi(household)

should work.
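
A minimal sketch of that fill-down applied within each panel, added here
for illustration and not part of the original reply. It assumes the data
have already been -tsset uid year- and -tsfill, full- has been run, and it
uses the variable names from the example above. The helper variable
negyear is hypothetical. Because -tsfill, full- can also create
observations before a panel's first observed year, the sketch fills
backward as well as forward.

```stata
* Sketch only: assumes -tsset uid year- and -tsfill, full- have been run.

* Forward fill: within each uid, in year order, copy the previous value.
* -replace- works observation by observation, so runs of gaps cascade.
bysort uid (year): replace county = county[_n-1] if missing(county)
bysort uid (year): replace household = household[_n-1] if missing(household)

* Backward fill for any gaps before the first observed year in a panel:
* reverse the time order with a temporary helper (negyear is hypothetical).
generate negyear = -year
bysort uid (negyear): replace county = county[_n-1] if missing(county)
bysort uid (negyear): replace household = household[_n-1] if missing(household)
drop negyear
sort uid year

* Per the coding instructions quoted above, V1 is 0 in the filled-in years.
* Caution: this also overwrites any V1 that was missing before -tsfill-.
replace V1 = 0 if missing(V1)
```

The by-group prefix keeps values from leaking across panels: the first
observation of each uid has no [_n-1] within its group, so it stays
missing until the backward pass.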


Nick Cox

 >The effect of your -egen, group()- is
 >to lump all the missings on -county-
 >and/or -household- together. In cases
 >where -household- is missing but not
 >-county-, or vice versa, that throws
 >away some information.
 >-egen, group() missing- will do a bit
 >But the reconstruction of missing data
 >seems somewhere between difficult and
 >impossible, at least on the information
 >you provide.
 >For example, suppose
 >you have -county- but not -household-.
 >There seem two possibilities. The
 >household is in fact one of the other
 >households in the same county in
 >your dataset, or it is not. Do you
 >have any grounds to say which is correct?
 >Conversely, suppose you have -household-
 >but not -county-. It may be that your numbering
 >system will enable you to reconstruct the
 >Finally, suppose you have neither -household-
 >nor -county-. If there is a method for
 >imputing, it must be based on the other variables.
 >Alexander Nervedi
 > >
 > > I have panel data with gaps. After tsfill, full I have a
 > > complete data that
 > > but there are many covariates, some string and some numeric,
 > > that become
 > > complete but are actually not. For example.
 > >
 > > egen uid = group(county household)
 > > tsset uid year
 > > tsfill, full
 > >
 > >
 > > will generate missing values for county and household to fill
 > > in the gaps,
 > > even though uid and year are complete. what is a good way to
 > > fill in missing
 > > observations for variables like county and household ?