
Re: st: Categorizing HIV status using a series of string variables

From	"Tom Trikalinos" <[email protected]>
To	[email protected]
Subject	Re: st: Categorizing HIV status using a series of string variables
Date	Mon, 24 Nov 2008 20:17:42 -0500

In Stata 10, using the combined string variable HIV from your example below:


// incident seroconverters (an N followed later by a P) are value 1
. gen group = regexm(HIV, "N\.*P")

// prevalent positives (no N at all) are value 2
. replace group = 2 if !regexm(HIV, "N")

// consistent seronegatives (no P at all) are value 3
. replace group = 3 if !regexm(HIV, "P")



I'm pretty sure this can be made more elegant, but it should work.
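
A slightly more defensive variant, as a do-file sketch rather than anything definitive. The single pattern "N.*P" (an N with a P anywhere after it) also catches an "I" sitting between the two tests, and the extra conditions leave never-tested women at 0 instead of 3. Rows 6 and 7 and the label name grp are made up for illustration.

    clear
    input study_id str6 HIV
    1 ".N..NP"
    2 "..NNN."
    3 "PP..P."
    4 "NP.PPP"
    5 "...PPP"
    6 "......"
    7 ".NIP.."
    end

    // 1 = incident: at least one N with a P somewhere after it
    gen byte group = regexm(HIV, "N.*P")

    // 2 = prevalent: at least one P and no N
    replace group = 2 if regexm(HIV, "P") & !regexm(HIV, "N")

    // 3 = seronegative: at least one N and no P
    replace group = 3 if regexm(HIV, "N") & !regexm(HIV, "P")

    // 0 is left for anything else (never tested, or a P followed only by Ns)
    label define grp 0 "unclassified" 1 "incident" 2 "prevalent" 3 "negative"
    label values group grp

    // checks against the codings asked for in the original post
    assert group == 1 if inlist(study_id, 1, 4, 7)
    assert group == 2 if inlist(study_id, 3, 5)
    assert group == 3 if study_id == 2
    assert group == 0 if study_id == 6

    list, noobs

The assert lines stop the do-file if any example row lands in an unexpected group.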

hth

tom


On Mon, Nov 24, 2008 at 7:45 PM, Polis, Chelsea B. <[email protected]> wrote:
> Dear Statalisters,
>
> I am trying to figure out a way to code individuals as either having incident HIV seroconversion (had at least one negative HIV test, followed by one positive HIV test while under surveillance), prevalent HIV (had one or more positive HIV tests while under surveillance), or HIV-negative (had all HIV-negative tests while under surveillance).
>
> My dataset is set up as follows, where N = negative, P = positive, . = not tested at that round, and I = indeterminate.  I want to ignore any indeterminate tests, so I have left them out of the examples here; I assume I will simply need to replace all "I"s with "."s, but help on a more elegant way to adapt the code for this would also be most appreciated!
>
> Study_id  HIV1   HIV2   HIV3   HIV4   HIV5   HIV6
> 1         .      N      .      .      N      P
> 2         .      .      N      N      N      .
> 3         P      P      .      .      P      .
> 4         N      P      .      P      P      P
> 5         .      .      .      P      P      P
>
> I also have a variable that shows these patterns in one variable, i.e.
> Study_id    HIV
> 1               .N..NP     (I would want this to be coded as incident seroconverter)
> 2               ..NNN.     (I would want this to be coded as consistently seronegative)
> 3               PP..P.     (I would want this to be coded as prevalent positive)
> 4               NP.PPP     (I would want this to be coded as incident seroconverter)
> 5               ...PPP     (I would want this to be coded as prevalent positive)
>
> These are string variables.  Is there a simple formula to use to categorize these women as incident seroconverters, prevalent positives, or consistently seronegative?
>
> I tried something along the lines of:
> gen prevpos=0
> replace prevpos=1 if hiv1==.|hiv1=="P" & hiv2==.|hiv2=="P" & hiv3==.|hiv3=="P" & hiv4==.|hiv4=="P" & hiv5==.|hiv5=="P" & hiv6==.|hiv6=="P"
>
> But I am receiving type mismatch r(109);
>
> Your suggestions would be most appreciated!
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
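
On the r(109) in the quoted attempt: hiv1==. compares a string variable with numeric missing, which is a type mismatch; with strings the comparison is hiv1=="." in quotes. Also, & is evaluated before |, so each "." / "P" pair needs its own parentheses (or inlist()). A rough sketch, assuming the six variables are str1 and named HIV1-HIV6 as in the table (the attempt uses hiv1-hiv6, so adjust the case), that every cell holds exactly one character, and with a made-up sixth row to show the "I" handling:

    clear
    input study_id str1 HIV1 str1 HIV2 str1 HIV3 str1 HIV4 str1 HIV5 str1 HIV6
    1 "." "N" "." "." "N" "P"
    2 "." "." "N" "N" "N" "."
    3 "P" "P" "." "." "P" "."
    4 "N" "P" "." "P" "P" "P"
    5 "." "." "." "P" "P" "P"
    6 "N" "I" "P" "." "." "."
    end

    // build the pattern variable, turning indeterminate "I" into "."
    gen str6 HIV = ""
    foreach v of varlist HIV1-HIV6 {
        replace HIV = HIV + subinstr(`v', "I", ".", .)
    }

    // string-safe version of the prevpos attempt: every round is "." or "P"
    gen byte prevpos = 1
    foreach v of varlist HIV1-HIV6 {
        replace prevpos = 0 if !inlist(`v', ".", "P")
    }

    list study_id HIV prevpos, noobs

Note that under this rule a woman who was never tested would also get prevpos = 1, so it may be worth requiring at least one "P" as well.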
