The Stata listserver

RE: st: RE: -finddup- for panel?


From   "joe J." <otharain@hotmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   RE: st: RE: -finddup- for panel?
Date   Tue, 20 Apr 2004 21:46:24 +0000

Thanks Nick.

Here is what I meant.

Panel variable: id, time variable: year

There is a variable y with missing values, and I want to use -cipolate- (a user-written Stata program available from SSC) to interpolate them. I do the interpolation as follows.

tsset id year, yearly
by id : cipolate y year, gen(yci)

This does not run because some ids are duplicated within a year as a result of data-entry errors. I therefore want to remove the duplicates for each year and run -cipolate- (the cubic interpolation program from SSC) on the resulting data set with unique ids.
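Before removing anything, it may help to see how many id-year pairs are repeated. The lines below are a minimal sketch, assuming a Stata version that includes the official -duplicates- command and that all years are held in one data set in memory.

```stata
duplicates report id year   /* table of how many id-year pairs occur once, twice, ... */
duplicates list id year     /* list the observations belonging to repeated id-year pairs */
```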

I remove the duplicates for each year as follows.

use "C:\data75.dta", clear
finddup id if year==1975, nol k /* finddup is also downloadable from SSC */
save "C:\data75a.dta", replace

drop if dupval>=2 /* removing duplicates */
save "C:\data75b.dta", replace /* data with unique ids */

by id : cipolate y year, gen(yci) /* cubic interpolation */
save "C:\data75c.dta", replace


use "C:\data75a.dta", clear
keep if dupval>=2 /* collecting duplicates */
save "C:\data75d.dta", replace

I repeat the above steps for other years and at the end append the interpolated and duplicate files for each year.

use "C:\data75c.dta", clear
append using "C:\data75d.dta"
append using "C:\data76c.dta"
append using "C:\data76d.dta"
etc.
My question: is there any way of detecting duplicate ids for all years simultaneously, instead of doing it for each year separately? (I wish I could do it the following way:
by year: finddup id , nol k).
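A one-pass version of the workflow described above might look like the sketch below. It does not use -finddup-; it numbers the records within each id-year pair with -bysort-, which is a swapped-in technique rather than anything confirmed in this thread. The file name "C:\data.dta" (a single file holding all years) and the variable name dupnum are hypothetical. Which record of a repeated pair counts as the first is arbitrary, since the sort order within id and year is not fixed.

```stata
use "C:\data.dta", clear             /* hypothetical file holding all years */
bysort id year: gen dupnum = _n      /* 1 = first record of an id-year pair, 2+ = repeats */

preserve
keep if dupnum >= 2                  /* set the repeated records aside */
tempfile dups
save `dups'
restore

drop if dupnum >= 2                  /* id-year pairs are now unique */
tsset id year, yearly
by id : cipolate y year, gen(yci)    /* interpolation call as written earlier in this post */
append using `dups'                  /* bring the set-aside records back */
```

The appended records have missing yci, because they were not in memory when the interpolation ran.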

Joe


From: "Nick Cox" <n.j.cox@durham.ac.uk>
Reply-To: statalist@hsphsun2.harvard.edu
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: -finddup- for panel?
Date: Tue, 20 Apr 2004 17:32:05 +0100

Please show us an example of what you mean.

Nick
n.j.cox@durham.ac.uk

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of joe J.
> Sent: 20 April 2004 17:22
> To: statalist@hsphsun2.harvard.edu
> Subject: st: -finddup- for panel?
>
>
> Hello,
>
> I have a panel data set in which there are duplicates of the panel variable.
> But I want to retain them because the mistake is only in the panel variable,
> not in other variables in the observation. (For my analysis, I would
> aggregate the data so that the panel variable is no longer relevant.)
>
> BUT, I keep the panel format in the beginning in order to fill in missing
> values using interpolation. To do this I want to set aside observations with
> a duplicate panel variable and append them back after interpolation has
> been done. For non-panel data, I could use the Stata program -finddup- (by
> Fred Wolfe, downloadable from SSC). -finddup- helps to take out a particular
> duplicate observation and save it for later use. But it does not work in a
> panel setup. Is there any way around this, other than doing it separately for
> each year?

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



