Stata The Stata listserver

RE: st: RE: -finddup- for panel?

From   "joe J." <[email protected]>
To   [email protected]
Subject   RE: st: RE: -finddup- for panel?
Date   Tue, 20 Apr 2004 21:46:24 +0000

Thanks Nick.

Here is what I meant.

Panel variable: id, time variable: year

There is a variable y which has missing values, and I want to use -cipolate- (a Stata program available from SSC) to interpolate them. I do the interpolation as follows.

tsset id year, yearly
by id : cipolate y year, gen(yci)

It does not run because id has some duplicates, which resulted from data-entry errors. I therefore want to remove the duplicates for each year and run -cipolate- (the cubic interpolation program from SSC) on the resulting data set with unique ids.

I remove the duplicates for each year as follows.

use "C:\data75.dta", clear
finddup id if year==1975, nol k /* finddup is also downloadable from SSC */
save "C:\data75a.dta", replace

drop if dupval>=2 /* removing duplicates */
save "C:\data75b.dta", replace /* data with unique ids */

by id : cipolate y year, gen(yci) /* cubic interpolation */
save "C:\data75c.dta", replace

use "C:\data75a.dta", clear
keep if dupval>=2 /* collecting duplicates */
save "C:\data75d.dta", replace

I repeat the above steps for other years and at the end append the interpolated and duplicate files for each year.

use "C:\data75c.dta", clear
append using "C:\data75d.dta"
append using "C:\data76c.dta"
append using "C:\data76d.dta"
My question is: is there any way of detecting duplicate ids for all years simultaneously, instead of doing it for each year separately? (I wish I could do it the following way:
by year: finddup id, nol k).
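
One possible way to flag duplicates for all years in a single pass, sketched here with built-in commands rather than -finddup-, is to number the observations within each id-year combination. This is only an illustration under assumptions: the toy data are made up, the temporary file name dups is hypothetical, and dupval is built by hand so that it mimics the variable used above (1 for the first copy, 2 or more for extra copies).

```stata
* Toy panel (made-up values): id 2 appears twice in 1975.
clear
input id year y
1 1975 10
1 1976 .
1 1977 14
2 1975 20
2 1975 21
2 1976 22
end

* Number the observations within each id-year cell, for all years at once.
* Ties are ordered arbitrarily, so which copy gets dupval == 1 is not fixed.
bysort id year: generate dupval = _n

* Set the extra copies aside in a temporary file.
tempfile dups
preserve
keep if dupval >= 2
save `dups'
restore

* Keep one observation per id-year and confirm the panel is now unique.
drop if dupval >= 2
isid id year

* ... interpolate here, for example with -cipolate- as described above ...

* Put the set-aside duplicates back.
append using `dups'
```

With real data, the -input- block would be replaced by a -use- of the full panel, and it may be worth sorting on an additional variable first if it matters which copy is treated as the original.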


From: "Nick Cox" <[email protected]>
Reply-To: [email protected]
To: <[email protected]>
Subject: st: RE: -finddup- for panel?
Date: Tue, 20 Apr 2004 17:32:05 +0100

Please show us an example of what you mean.

[email protected]

> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of joe J.
> Sent: 20 April 2004 17:22
> To: [email protected]
> Subject: st: -finddup- for panel?
> Hello,
> I have a panel data set, where there are duplicates of the
> panel variable.
> But I want to retain them because the mistake is only in the
> panel variable,
> not in other variables in the observation. (For my analysis, I would
> aggregate the data so that the panel variable is no longer relevant.)
> BUT, I keep the panel format in the beginning in order to
> fill in missing
> values using interpolation. To do this I want to keep out
> observations with
> the duplicate panel variable and append them back after
> interpolation has
> been done. For non-panel data, I could use the Stata program
> -FINDDUP- (by
> Fred Wolfe downloadable from SSC). -finddup- helps to take
> out a particular
> duplicate observation and save it for later use. But it does
> not work in a
> panel set up. Is there any way around, other than doing it
> separately for
> each year?

