
From: Howard Lempel <HLempel@brookings.edu>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Categorizing HIV status using a series of string variables
Date: Mon, 24 Nov 2008 21:17:17 -0500

Chelsea,

I haven't used regular expressions in a bit, so someone should correct me if I'm wrong, but I think the problem you mention would be solved by replacing Tom's first line of code with:

    gen group = (regexm(HIV, "N[\.I]*P"))

The expression in the brackets tells Stata to ignore "."s and "I"s when it looks for an N followed by a P. For more info on using regular expressions, check here: http://www.stata.com/support/faqs/data/regex.html. -help regexm- will also be useful.

Also, I don't know if you have any cases where someone is indeterminate or missing in every period. If you have any such cases, I think Tom's code will code those as group 3, which does not seem appropriate. You may want to add a line of code as follows:

    replace group = 4 if !regexm(HIV, "N") & !regexm(HIV, "P")

You will also want value labels for your variable. The following code should -label- your group variable.

    lab define grouplab 1 "incident seroconverter" 2 "prevalent positive" ///
        3 "consistently seronegative" 4 "missing/indeterminate"
    lab val group grouplab

Lastly, I'd like to warn you that I've had some trouble with the way that Stata's -regexm- function deals with missing values. If you have any truly missing values (i.e. ""), I would carefully check to make sure that the -regexm- function is dealing with them in the right way. See this thread for the problem I had: http://www.stata.com/statalist/archive/2008-10/msg00935.html

Hope this helps

Howie

Howie Lempel
Research Assistant
The Brookings Institution | Economic Studies
1775 Massachusetts Ave NW | Washington DC 20036
hlempel@brookings.edu | p: (202) 238-3576

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Polis, Chelsea B.
Sent: Monday, November 24, 2008 8:49 PM
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: Categorizing HIV status using a series of string variables

Many thanks, Tom and Howie!

Tom: Your solution worked beautifully, except for one tiny thing. I got a few people who weren't assigned to one of the three groups, and their codes all had one thing in common...an "indeterminate" test between their negative and positive tests:

    hiv
    ......NI..PP
    .....NIP....
    ......NI...P

I can very easily just recode these people to be incident seroconverters, but I wonder if there is an easy fix for the code that would do this automatically?

Howie: Many thanks for the explanation...that clears up a lot of my confusion!

BTW: my apologies if this post ends up in the wrong spot - I'm still trying to understand how to reply to individual postings when I receive Statalist in digest form...I'm hoping that slapping a "RE:" in front of the subject line I wish to respond to will allow me to do that, but I couldn't find information in the FAQ on specifically how to do this.

Cheers,
Chelsea

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
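The two patterns suggested in the reply can be tried on a toy dataset before touching the real data. What follows is only a minimal sketch: the first three strings are the ones Chelsea posted, the last two are made up for illustration, and the variable names n_then_p and no_result are hypothetical. It does not reproduce Tom's code, which is not shown in this message; it only creates two indicator variables so the matches can be inspected by eye.

```stata
* Toy data: three strings from the post, plus two invented cases
clear
input str12 HIV
"......NI..PP"
".....NIP...."
"......NI...P"
"....NNNNNNNN"
"....I......."
end

* 1 if an N is followed by a P with only "." or "I" in between
gen byte n_then_p = regexm(HIV, "N[\.I]*P")

* 1 if the string contains neither an N nor a P
gen byte no_result = !regexm(HIV, "N") & !regexm(HIV, "P")

* Inspect the flags next to the raw strings
list HIV n_then_p no_result, clean
```

If the pattern behaves as described, the three posted strings should be flagged by n_then_p and the last invented string by no_result. Adding a row with an empty string ("") to the toy data is a cheap way to check the missing-value behaviour mentioned in the reply.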

**Follow-Ups**: **Re: st: Categorizing HIV status using a series of string variables** *From:* "Tom Trikalinos" <ttrikalin@gmail.com>

**References**: **RE: st: Categorizing HIV status using a series of string variables** *From:* "Polis, Chelsea B." <cpolis@jhsph.edu>
