Thank you. That piece of code picked up the nil/no/denies cases, but it also picked up some cases where nil/no/denies was completely irrelevant to the term diarrhoea, because these terms preceded a large amount of text before diarrhoea occurred. Is there any way you can refine the code to limit the number of characters/spaces that occur between the terms nil/no/denies and diarrhoea?
Tove Fitzgerald
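
One possible way to cap the gap, offered only as a sketch and not taken from any reply in this thread: replace the unbounded (.*) with a bounded quantifier. This assumes Stata 14 or later, where ustrregexm() accepts patterns such as .{0,20}; the older regexm() used in the code quoted below is not assumed to support that syntax. The 20-character limit, the str40 width and the fifth example row are all illustrative choices.

```stata
* Sketch only: assumes Stata 14+ (ustrregexm with bounded quantifiers).
* The limit of 20 characters between the negation and "diarrhoea" is arbitrary.
clear
input str40 string
"diarrhoea"
"nil diarrhoea"
"no vomiting/diarrhoea"
"denies diarrhoea/vomiting"
"no fever, but three days of diarrhoea"
end

* \b requires nil/no/denies to start at a word boundary, so the "no" at the
* end of a longer word (e.g. "piano ") is not treated as a negation.
gen byte nodi = ustrregexm(lower(string), "\b(nil|no|denies) .{0,20}diarrhoea")
list string nodi
```

With this pattern the last row, where more than 20 characters separate "no" from "diarrhoea", should no longer be flagged. Tightening or loosening the limit trades missed negations against false positives, so it is worth checking against a sample of real records.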
-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Ryan Kessler
Sent: Monday, 14 January 2013 12:03 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: How to identify multiple substrings within a string
input str30 string
"diarrhoea"
"nil diarrhoea"
"no vomiting/diarrhoea"
"denies diarrhoea/vomiting"
end
gen nodi = regexm(lower(string), "(nil |no |denies )(.*)(diarrhoea)")
tab nodi
Best,
Ryan Kessler
On Sun, Jan 13, 2013 at 6:37 PM, Michelle T. Butler <Michelle.Butler@hnehealth.nsw.gov.au> wrote:
> Hi all, I am searching a string variable for cases who don't have diarrhoea. I need to identify records where the terms nil, no or denies precede the term diarrhoea in the same sentence. I have already identified that these terms do not always immediately precede diarrhoea, e.g. No vomiting/diarrhoea, so I am looking for a way to extract all observations where nil, no or denies occurs in close proximity to diarrhoea, ignoring spelling errors, upper/lower case variations, etc. Thank you for your help, Tove Fitzgerald.
>
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/