
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: RE: Tidying up a New and Old ID mapping dataset


From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   RE: st: RE: Tidying up a New and Old ID mapping dataset
Date   Wed, 9 Mar 2011 20:37:55 +0000

Sorry, no extra insights. 

Note to international audience: in British usage, GP is "general practitioner", a non-specialist doctor (*), the first port of call for medical consultations other than accidents and emergencies. 

(*) Meaning, a medic with first degrees in medicine and surgery, sometimes others. Nothing to do with Ph.D.s. Only very, very rarely an M.D. 

Nick 
[email protected] 

Ada Ma

You are right about the trumping rule.  I have over 100 lines of
these mapping rules, and I need to sort them out so that I can
create a mapping list to merge with the data sets I'll be using for
analyses.

The data are on GP practices.  There are around 1000 of them in Scotland.
Practices merge or demerge, new GPs join, old GPs leave, etc.  Every time
such an action takes place, the practice is given a new practice ID.
To follow a practice through the years and its transformations, I
have to bundle several practice IDs together and treat them as one
overarching practice.

Here are two examples of those statements (not real practice numbers):
100033 SPLIT AND BECAME 10066 AND 10077 ON 10/2003 10066 MERGED WITH
10022 AND BECAME 10088 04/2008

10066 MERGED WITH 10022 AND BECAME 10088

I have stripped out all the practice IDs but am not sure how to clean
them up so that I get the mapping right.
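
One way to think about the bundling described above is as a connected-components problem: every split or merge links an old ID to a new ID, and all IDs linked directly or indirectly belong to one bundle. The following is only a minimal sketch in Python rather than Stata, assuming the statements have already been reduced to (old ID, new ID) pairs. The names `links` and `bundle_practices` are hypothetical, and the IDs are the made-up ones from the example statements.

```python
# Hypothetical (old_id, new_id) links read off the example statements above.
# These are the made-up IDs from the post, not real practice numbers.
links = [
    ("100033", "10066"),  # split
    ("100033", "10077"),  # split
    ("10066", "10088"),   # merge
    ("10022", "10088"),   # merge
]


def bundle_practices(links):
    """Map every ID to a bundle label, using a simple union-find.

    IDs connected by any chain of splits or merges share one label.
    """
    parent = {}

    def find(x):
        # Return the root of x's bundle, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for old, new in links:
        root_old, root_new = find(old), find(new)
        if root_old != root_new:
            parent[root_old] = root_new

    # Collect the members of each bundle, then label each bundle by its
    # smallest member (string comparison) so the label is deterministic.
    groups = {}
    for pid in list(parent):
        groups.setdefault(find(pid), []).append(pid)
    return {pid: min(members) for members in groups.values() for pid in members}


bundle_of = bundle_practices(links)
```

The resulting `bundle_of` lookup could then be saved as a two-column mapping list and merged onto the analysis data sets by practice ID. Note that this treats both branches of a split as part of the same bundle, which may or may not be what is wanted for a given analysis.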

On Wed, Mar 9, 2011 at 5:01 PM, Nick Cox <[email protected]> wrote:
> I don't know whether I understand this. The issue appears to be that according to one rule C should be mapped to D and according to another rule D should be mapped to E and that trumps the first rule. And presumably there are other examples of this kind. And the example is not to be taken literally, but is schematic.
>
> If that is so, all I can suggest is that the trumping rule is applied last, so that this sounds like one -replace- followed by another. I don't know why a loop is thought necessary if there are at most two steps.
>
> Nick
> [email protected]
>
> Ada Ma
>
> I have this dataset which has two series of number IDs.  Say it looks like this:
>
> OriginalID    NewID
> A                E
> B                E
> D                E
> C                D
>
>
> I need to map this information to existing data sets, so that all the
> observations A, B, C, D, are mapped to become E.
>
> As you can see it's rather straightforward for the first three
> observations, but for the fourth observation, C is mapped to D.  I
> need to correct this information so that when the NewID is found
> amongst the OriginalID, it is updated to contain the correct NewID.
>
> I need to write a few lines of commands that would pick up the fourth
> observation because its NewID appears as the OriginalID in the third
> observation, and replace the fourth obs's NewID with the third obs's
> NewID, so that the corrected dataset looks like this:
>
> OriginalID    NewID
> A                E
> B                E
> D                E
> C                E
>
>
> I can write a loop to compare the NewID against every OriginalID in
> the data, but then it will take a few rounds of looping to get the
> whole thing tidied up.  Is there a better method?
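
The tidying asked about in the original question amounts to following each OriginalID along its chain of NewIDs until reaching an ID that is not itself mapped onward. The following is only a minimal sketch of that idea in Python rather than Stata, assuming each OriginalID has exactly one NewID (so no splits). The function name `resolve_final_ids` is hypothetical.

```python
def resolve_final_ids(mapping):
    """Follow each OriginalID through the mapping to its final NewID.

    `mapping` is a dict of OriginalID -> NewID. A chain ends at the first
    ID that does not appear as an OriginalID. A cycle raises ValueError.
    """
    resolved = {}
    for start in mapping:
        seen = {start}
        current = mapping[start]
        while current in mapping:
            if current in seen:
                raise ValueError(f"cycle in mapping involving {current!r}")
            seen.add(current)
            current = mapping[current]
        resolved[start] = current
    return resolved


# The schematic example from the question: C -> D, and D -> E.
mapping = {"A": "E", "B": "E", "D": "E", "C": "D"}
final = resolve_final_ids(mapping)
```

Because each chain is followed to its end in one go, chains of any length are resolved without repeated passes over the whole list, and the cycle check guards against rules that point back at each other.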


