

From: "Pavlos C. Symeou" <p.symeou@lmu.de>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: RE: Replacing duplicate values
Date: Thu, 01 Apr 2010 17:20:35 +0200

Dear Nick and Abdel,

Regards,
Pavlos

AbdelRahmen wrote:

"type help duplicates drop under Stata and you will find what you are looking for"

On 01/04/2010 17:00, Nick Cox wrote:

It's a Stata two-step: reshape, drop duplicates, reshape back. Something like

    * warning: untested code
    reshape long ipc_, i(id)
    bysort id ipc_: gen superfluousandredundant = _n > 1
    drop if superfluousandredundant
    bysort id (ipc) : gen j = _n
    reshape wide ipc, i(id) j(j)

Actually, the last -reshape- might not be a good idea. The long structure might be more useful.

Nick
n.j.cox@durham.ac.uk

Pavlos C. Symeou

I have a dataset which concerns patents. Every patent is assigned a number of International Patent Classifications (IPCs). However, there are mistakes in the database and certain IPCs appear more than once for a single patent, which is meaningless. Examples are patents with id 6 and id 7 (ipc_1, ipc_2 etc list the number of IPCs a single patent is assigned). For the patent with id 6 we can see that ipc_2 and ipc_3 are the same. Id 7 illustrates a more general issue. Duplicate values may not appear sequentially and may appear more than twice.

    id   ipc_1   ipc_2   ipc_3   ipc_4
    1    A44B    G09F    H04N
    2    A47B    G06F    H05K    E05D
    3    A47B    G06F
    4    A47B    H04N    H05K
    5    A47B
    6    A47B    F16M    F16M    H05K
    7    A47B    A47B    F16M    A47B

Can you suggest a way to delete the duplicate values, which can be more than two, and move the remaining to the left? For example patents with id 6 and id 7 would look like this:

    id   ipc_1   ipc_2   ipc_3   ipc_4
    6    A47B    F16M    H05K
    7    A47B    F16M

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
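A fuller sketch of the reshape, drop duplicates, reshape back idea, for anyone who wants something to paste and try. It is untested here and rests on assumptions not stated in the thread: the ipc_ variables are taken to be strings, the names `pos` and `newpos` are made up for illustration, and the `input` block simply retypes three rows from the question. Compared with the quoted code, it drops empty slots after the first reshape (otherwise, as far as I can tell, a blank would survive as a value and sort first) and it names the j variable explicitly so nothing is left over from the first reshape.

```stata
* Sketch only, untested. Assumes numeric id and string ipc_1 ... ipc_4.
clear
input id str4 ipc_1 str4 ipc_2 str4 ipc_3 str4 ipc_4
3 "A47B" "G06F" ""     ""
6 "A47B" "F16M" "F16M" "H05K"
7 "A47B" "A47B" "F16M" "A47B"
end

* Step 1: one row per patent and slot; pos (hypothetical name) records the slot
reshape long ipc_, i(id) j(pos)

* Empty slots become rows too; remove them so a blank is not treated as an IPC
drop if missing(ipc_)

* Keep only the first occurrence of each IPC within a patent
bysort id ipc_ (pos): keep if _n == 1

* Renumber by original slot so the survivors shift left in their original order
bysort id (pos): gen newpos = _n
drop pos

* Step 2: back to wide (optional; the long layout may be more useful)
reshape wide ipc_, i(id) j(newpos)
list
```

If the original left-to-right order does not matter, `duplicates drop id ipc_, force` after the first reshape is an alternative to the `keep if _n == 1` line, in the spirit of the `help duplicates` pointer above.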


