Re: st: RE: Unique Case ID in Large Panel
I have no solution -- just a note. Both -tsset- and -xtdes- check for repeated time values within a panel, though they use different methods. -xtdes- performs:
by id time: assert _N == 1
while -tsset- performs:
by company year: gen diff = year[_n+1] - year[_n]
sum diff, meanonly
and then checks whether the minimum is zero, which would indicate a repeated time value.
I am at a loss to think of a situation where the minimum time difference is not zero yet you would have repeated time values within a panel.
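To make the comparison concrete, here is a minimal sketch of the two styles of check run on made-up data. The toy dataset and the variable names n_pair and diff are hypothetical, and the second check uses -by id:- on data sorted by id and time, which is my own simplification rather than the exact code inside -tsset-.

```stata
* Toy data (hypothetical): panel 1 has time 2001 twice
clear
input id time
1 2000
1 2001
1 2001
2 2000
2 2001
end

* Check in the style described for -xtdes-:
* does any id-time pair occur more than once?
bysort id time: gen byte n_pair = _N
count if n_pair > 1

* Check in the style described for -tsset-:
* smallest gap between consecutive time values within a panel
sort id time
by id: gen diff = time[_n+1] - time[_n]
summarize diff, meanonly
display "minimum gap = " r(min)
```

On data like this both checks should flag the same repeated pair, which is why a disagreement between the two commands is puzzling.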
What message do you get after -tsset- ?
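In the meantime, it may be worth asking Stata directly which observations it considers repeated. The following is only a sketch, assuming DRVNUM and CDATE are the panel and time variables as in your post; the variable name dup is hypothetical.

```stata
* Sketch only: DRVNUM and CDATE are taken from the quoted post

* Summarize how many observations share each DRVNUM-CDATE pair
duplicates report DRVNUM CDATE

* Tag observations in a repeated pair (dup > 0 means the pair is repeated)
duplicates tag DRVNUM CDATE, generate(dup)
list DRVNUM CDATE if dup > 0, sepby(DRVNUM)

* Stops with an error if the two variables do not uniquely identify observations
isid DRVNUM CDATE
```

If -duplicates report- shows no surplus copies while -xtdes- still complains, that narrows the problem to how the two commands see the data rather than to the data themselves.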
----- Original Message -----
> > My problem is that -xtdes- says my two key variables do not
> > uniquely identify the cases in my large panel data set, but both
> > the Stata code I wrote to identify the duplicates and SPSS's
> > automatically generated syntax to "find duplicates" (operating on
> > a precursor data set with the same cases and variables) say there
> > are no duplicates.
> > Searching the FAQs on "xtdes" produced no hits.
> > Specifics:
> > 1) I have 1,322,511 cases. The data set is about 700 MB. Memory
> > allocated is 1.4 GB (and wasn't it a kick in the fatoozle to
> > discover that's all WinXP will let me allocate from the 4 GB on
> > my system!).
> > 2) I first did -tsset- using DRVNUM (employee id) and CDATE
> > (paycheck date).
> > 3) -xtdes- message:
> > DRVNUM: 10001, 10003, ..., 99051    n = 28946
> > CDATE: 15344, 15351, ..., 16072    T = 105
> > Delta(CDATE) = 7; (16072-15344)/7 + 1 = 105
> > (DRVNUM*CDATE does not uniquely identify observations)