Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: working with a 24-character string variable consisting of 0s and 1s

From: Nick Cox
To: "statalist@hsphsun2.harvard.edu"
Subject: Re: st: working with a 24-character string variable consisting of 0s and 1s
Date: Tue, 11 Feb 2014 12:34:44 +0000

The idea of a loop is natural here as an alternative to the string
manipulation in my earlier post.
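
For readers without that post to hand, here is a minimal sketch of one string-manipulation route. It is an assumption about the general approach, not a copy of the earlier code; the data are Jorge's example, and -total- follows his naming.

```stata
* Sketch only: example data copied from Jorge's post below
clear
input str24 x
101011101010001010010011
100100010001001001001000
end

* Deleting every "0" leaves only the 1s, so the length of what
* remains is the number of months covered
generate byte year1 = length(subinstr(substr(x, 1, 12), "0", "", .))
generate byte year2 = length(subinstr(substr(x, 13, 12), "0", "", .))
generate byte total = year1 + year2
list
```

This needs no loop and no temporary variables, but it relies on the string containing nothing other than 0s and 1s.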

Assuming a variable -x- as set up by Jorge, you could do this:

gen year1 = 0
gen year2 = 0

* the parentheses matter: + binds more tightly than ==
qui forval i = 1/12 {
replace year1 = year1 + (substr(x, `i', 1) == "1")
replace year2 = year2 + (substr(x, 12 + `i', 1) == "1")
}

That replaces this part of Jorge's code:

forv i=1(1)24 {
gen p`i'=substr(x,`i',1)
destring p`i', replace
}

egen year1=rowtotal(p1-p12)
egen year2=rowtotal(p13-p24)
drop p*

As a further detail, -real(substr(x, `i', 1))- removes the need for a
-destring- call.
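
A minimal sketch of that variant, keeping Jorge's p1-p24 layout and his example data; the -byte- storage type is an added assumption, chosen only because each value is 0 or 1.

```stata
* Sketch only: example data copied from Jorge's post below
clear
input str24 x
101011101010001010010011
100100010001001001001000
end

* real() converts each one-character string to a number directly,
* so no separate destring step is needed
forvalues i = 1/24 {
    generate byte p`i' = real(substr(x, `i', 1))
}

egen year1 = rowtotal(p1-p12)
egen year2 = rowtotal(p13-p24)
generate total = year1 + year2
drop p1-p24
list
```

With these two observations the first row should show 7 and 5 months (12 in total) and the second 4 and 3 (7 in total).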

Nick
njcoxstata@gmail.com

On 11 February 2014 03:08, Jorge Eduardo Pérez Pérez
<jorge_perez@brown.edu> wrote:
> A not very elegant solution: break the variable into pieces.
>
> clear
> input str24 x
> 101011101010001010010011
> 100100010001001001001000
> end
> forv i=1(1)24 {
> gen p`i'=substr(x,`i',1)
> destring p`i', replace
> }
>
> egen year1=rowtotal(p1-p12)
> egen year2=rowtotal(p13-p24)
> gen total=year1 + year2
> drop p*
> li
>
>
> --------------------------------------------
> Jorge Eduardo Pérez Pérez
> Department of Economics
> Brown University
>
>
> On Mon, Feb 10, 2014 at 9:46 PM, Lisa Cook <hlthsrvcsphd@gmail.com> wrote:
>> Hi,
>>
>> I need help working with a cumbersome string variable. I'm using Stata/MP 13.0.
>>
>> I've inherited a dataset that includes several variables indicating
>> the number of months each person had specific kinds of health
>> insurance (Medicaid, Medicare, private, etc.).
>>
>> The variables are 24 characters long in string format. Each character
>> is either a 0 or 1, and represents whether the person had coverage in
>> that month. So, if one of these variables equals
>> "000000000000000000000000", the person had no coverage in any month of
>> that type, while if it equals "111111111111111111111111", they were
>> covered in every month by that kind of insurance. If the variable
>> equals, say, "101111111111111111111111", the person had 23 months of
>> coverage, but no coverage in the 2nd month.
>>
>> I would like to use these variables to generate, for each kind of
>> insurance, the total in year 1, the total in year 2, and the total
>> number of months of coverage in both years.
>>
>> I've used regexm before, but I can't figure out how to apply that code
>> to my situation. I'd be very grateful if anyone could suggest some
>> options.
>>
>> Thanks so much,
>> Lisa
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/