

From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Replacing duplicate values
Date: Thu, 1 Apr 2010 17:59:12 +0100

Not really. You can manage fine without -duplicates- here, as my code sketch implied.

Nick
n.j.cox@durham.ac.uk

Abdel Rahmen El Lahga

The question was not clear enough, and I simply invited Pavlos to try duplicates, which is certainly the unavoidable command to perform such a task.

2010/4/1 Martin Weiss <martin.weiss1@gmx.de>:
> Abdel, how would you do that?
>
> *************
> clear*
>
> input byte(id) str4(ipc_1 ipc_2 ipc_3 ipc_4)
> 1 A44B G09F H04N
> 2 A47B G06F H05K E05D
> 3 A47B G06F
> 4 A47B H04N H05K
> 5 A47B
> 6 A47B F16M F16M H05K
> 7 A47B A47B F16M A47B
> end
>
> duplicates report ipc_?
> *************
>
> [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Abdel Rahmen El Lahga
>
> Type help duplicates drop in Stata and you will find what you are
> looking for.
>
> 2010/4/1 Pavlos C. Symeou <p.symeou@lmu.de>:
>> I would like to ask for your assistance with the following:
>>
>> I have a dataset which concerns patents. Every patent is assigned a number
>> of International Patent Classifications (IPCs). However, there are mistakes
>> in the database and certain IPCs appear more than once for a single patent,
>> which is meaningless. Examples are patents with id 6 and id 7 (ipc_1, ipc_2
>> etc. list the IPCs a single patent is assigned). For the patent
>> with id 6 we can see that ipc_2 and ipc_3 are the same. Id 7 illustrates a
>> more general issue. Duplicate values may not appear sequentially and may
>> appear more than twice.
>>
>> id   ipc_1   ipc_2   ipc_3   ipc_4
>> 1    A44B    G09F    H04N
>> 2    A47B    G06F    H05K    E05D
>> 3    A47B    G06F
>> 4    A47B    H04N    H05K
>> 5    A47B
>> 6    A47B    F16M    F16M    H05K
>> 7    A47B    A47B    F16M    A47B
>>
>> Can you suggest a way to delete the duplicate values, which can be more than
>> two, and move the remaining ones to the left? For example, patents with id 6 and
>> id 7 would look like this:
>>
>> id   ipc_1   ipc_2   ipc_3   ipc_4
>> 6    A47B    F16M    H05K
>> 7    A47B    F16M

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
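
For readers of the archive: the code sketch Nick mentions is not reproduced in this message. What follows is only a minimal illustration of one way the task could be done without -duplicates-, assuming the four string variables ipc_1 to ipc_4 from Martin's example and that an empty string marks a missing IPC. The helper variable tomove is a hypothetical name introduced here, and the loop bounds are hard-coded for four variables.

```stata
* Step 1: blank any value that already appears in an earlier ipc_ variable
* of the same observation (assumes exactly four variables, ipc_1 to ipc_4).
forvalues j = 2/4 {
    local prev = `j' - 1
    forvalues i = 1/`prev' {
        replace ipc_`j' = "" if ipc_`j' == ipc_`i'
    }
}

* Step 2: close the gaps by moving values one position to the left.
* Three passes are enough for four variables, since a value can move
* at most three positions.
gen byte tomove = 0          // hypothetical helper, dropped afterwards
forvalues pass = 1/3 {
    forvalues j = 1/3 {
        local next = `j' + 1
        * mark observations where slot j is empty and slot j+1 is not
        replace tomove = ipc_`j' == "" & ipc_`next' != ""
        replace ipc_`j' = ipc_`next' if tomove
        replace ipc_`next' = "" if tomove
    }
}
drop tomove

list, noobs
```

Run after Martin's -input- block, this should leave id 6 as A47B F16M H05K and id 7 as A47B F16M, with the trailing variables empty, which is the layout Pavlos asked for. With more than four IPC variables the loop limits would need to change accordingly.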

