
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: how to identify strings among which some are abbreviated and group strings which have the same keywords


From   Nina <[email protected]>
To   statalist <[email protected]>
Subject   st: how to identify strings among which some are abbreviated and group strings which have the same keywords
Date   Wed, 9 Nov 2011 16:02:26 +0100

Dear all,

I have two questions and would appreciate your help.
The first one:
My dataset has a string variable, applicant, that records the applicant of each patent. I want to identify applicants uniquely, so I use -encode applicant, gen(firm)- to generate a numeric identifier. However, the same applicant sometimes appears under its full name and sometimes under an abbreviation. For example:

application number    applicant
1                     Mcneil consumer
2                     Mcneil cons

When I use -encode-, two different identifiers are generated for the same applicant, "Mcneil consumer". Do you have any suggestions for dealing with this case?
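
To make the problem concrete, here is a minimal sketch that reproduces it with the two example rows above. The variable name appnum is only an illustrative stand-in for the application number, and the data are typed in by hand:

```stata
* Illustrative data only: the two example rows from above.
* "appnum" is a hypothetical name for the application number.
clear
input long appnum str20 applicant
1 "Mcneil consumer"
2 "Mcneil cons"
end

* encode assigns one integer code per distinct string value,
* so the full and the abbreviated name receive different codes.
encode applicant, gen(firm)

* nolabel shows the underlying numeric codes instead of the value labels.
list appnum applicant firm, nolabel
```

Because -encode- treats every distinct string as its own category, the two spellings end up with different values of firm.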

The second one:
The dataset is similar to the one above. In this case, I want to generate a group id that assigns a single id to all applicants that are subsidiaries of the same company. For example, in the following data, I want an id equal to 1 for applications 1 and 2, because those applicants belong to "Mcneil", and an id equal to 2 for applications 3 and 4, because they belong to the Mylan group.
application number    applicant
1                     MCNEIL PEDIATRICS
2                     MCNEIL CONSUMER HEALTHCARE DIV MCNEIL PPC INC
3                     MYLAN LABORATORIES INC
4                     MYLAN PHARMACEUTICALS INC
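
To illustrate the result I am after, here is a rough sketch under a strong assumption: that the first word of the applicant name identifies the parent company. This happens to hold for the four rows above but will not hold in general. The variable names appnum, keyword, and group_id are hypothetical:

```stata
* Illustrative data only: the four example rows from above.
clear
input long appnum str60 applicant
1 "MCNEIL PEDIATRICS"
2 "MCNEIL CONSUMER HEALTHCARE DIV MCNEIL PPC INC"
3 "MYLAN LABORATORIES INC"
4 "MYLAN PHARMACEUTICALS INC"
end

* ASSUMPTION: the first word of the name is the company keyword.
* upper() guards against differences in capitalization.
generate str60 keyword = upper(word(applicant, 1))

* One id per distinct keyword.
egen group_id = group(keyword)

list appnum applicant keyword group_id
```

With these four rows, the MCNEIL applicants share one group_id and the MYLAN applicants share another. Names whose first word is not the parent company would be grouped incorrectly by this rule.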

Any suggestions and comments are more than welcome!
Thank you very much!

Best,
Nina
  




