Re: st: Re: Coding Question

From	Steven Samuels <[email protected]>
To	[email protected]
Subject	Re: st: Re: Coding Question
Date	Sun, 14 Aug 2011 12:22:17 -0400

Even shorter:

*************CODE BEGINS*************
clear
input str8 id meas1 meas2 ///
   meas3 meas4 meas5
"11020189" 0         1         0         1         .
"11030107" 0         1         1         1         .
"11030206" 1         1         1         .         .
"11030207" 0         0         0         .         .
"11030265" 1         1         0         0         0
"11040003" 0         1         1         1         .
"11040079" 1         0         1         0         .
"11040225" 1         0         0         .         .
"11050079" 0         1         1         1         .
"11050203" 1         0         1         1         .
"11060278" 1         0         1         0         .
"11070069" 1         0         0         .         .
"11070195" 1         0         0         0         .
"11070229" 0         1         0         1         .
"11070245" 1         0         0         0         .
end
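* Reshape to long form: one row per individual per measurement occasion
reshape long meas, i(id) j(order)
* Sort by occasion within id so that meas[_n-1] is the previous measurement
bys id (order): gen convert = meas == 1 & meas[_n-1] == 0
bys id (order): gen revert = meas == 0 & meas[_n-1] == 1
preserve
* Count the transitions of each kind for every individual
collapse (sum) n_convert=convert n_revert=revert, by(id)
gen both = n_convert*n_revert > 0
list id n_convert n_revert both
restore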
**************CODE ENDS**************

Steve
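
As a quick check on the logic, here is a minimal sketch that runs the same comparison on a single individual from the data, "11020189", whose pattern is 0 1 0 1 with a trailing missing value. Counting by hand gives conversions at occasions 2 and 4 and a reversion at occasion 3; the -assert- lines encode that hand count and are an addition for illustration, not part of the posted solution. The sort within id by order is written out explicitly.

*************CODE BEGINS*************
clear
input str8 id meas1 meas2 meas3 meas4 meas5
"11020189" 0 1 0 1 .
end
reshape long meas, i(id) j(order)
* On the first row within id, meas[_n-1] is missing, so neither flag is set
bys id (order): gen convert = meas == 1 & meas[_n-1] == 0
bys id (order): gen revert = meas == 0 & meas[_n-1] == 1
list order meas convert revert, noobs
* Hand count: 0->1 at occasions 2 and 4, 1->0 at occasion 3
assert convert == inlist(order, 2, 4)
assert revert == (order == 3)
**************CODE ENDS**************

The trailing missing value at occasion 5 sets neither flag, because a missing meas equals neither 0 nor 1.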


John Metcalfe wrote:

I have a matrix of serial measurements with some missing values, as
shown below. I would like to write some simple code that classifies
each individual as a 'conversion' (going from 0 to 1), a 'reversion'
(going from 1 to 0), or both. Any ideas appreciated.


Here's an approach using -reshape-.  I've shortened Joseph's names for easier typing.

Steve

*************CODE BEGINS*************
clear
input str8 lid meas1 meas2 ///
  meas3 meas4 meas5
"11020189" 0         1         0         1         .
"11030107" 0         1         1         1         .
"11030206" 1         1         1         .         .
"11030207" 0         0         0         .         .
"11030265" 1         1         0         0         0
"11040003" 0         1         1         1         .
"11040079" 1         0         1         0         .
"11040225" 1         0         0         .         .
"11050079" 0         1         1         1         .
"11050203" 1         0         1         1         .
"11060278" 1         0         1         0         .
"11070069" 1         0         0         .         .
"11070195" 1         0         0         0         .
"11070229" 0         1         0         1         .
"11070245" 1         0         0         0         .
end
gen id = _n
reshape long meas, i(id) j(order)
bys id (order): gen convert = meas == 1 & meas[_n-1] == 0
bys id (order): gen revert = meas == 0 & meas[_n-1] == 1
egen n_convert = total(convert), by(id)
egen n_revert = total(revert), by(id)

preserve
bys id: keep if _n==1
gen both = n_convert*n_revert > 0
list lid n_convert n_revert both
restore
**************CODE ENDS**************


> On Aug 14, 2011, at 10:15 AM, Joseph Coveney wrote:
> 
> --------------------------------------------------------------------------------
> 
> Something like that below would work.  I'm assuming that rows are individuals
> and columns are serial measurements.  (You didn't describe or show your variable
> names.)
> 
> Joseph Coveney
> 
> input str8 individual_id serial_measurement1 serial_measurement2 ///
>   serial_measurement3 serial_measurement4 serial_measurement5
> "11020189" 0         1         0         1         .
> "11030107" 0         1         1         1         .
> "11030206" 1         1         1         .         .
> "11030207" 0         0         0         .         .
> "11030265" 1         1         0         0         0
> "11040003" 0         1         1         1         .
> "11040079" 1         0         1         0         .
> "11040225" 1         0         0         .         .
> "11050079" 0         1         1         1         .
> "11050203" 1         0         1         1         .
> "11060278" 1         0         1         0         .
> "11070069" 1         0         0         .         .
> "11070195" 1         0         0         0         .
> "11070229" 0         1         0         1         .
> "11070245" 1         0         0         0         .
> end
> 
> local deltas
> forvalues interval = 2/5 {
>   generate int delta`interval' = ///
>       serial_measurement`interval' - ///
>       serial_measurement`=`interval'-1'
> 	replace delta`interval' = 0 if missing(delta`interval')
> 	local deltas `deltas' delta`interval'
> }
> local deltas : subinstr local deltas " " ", ", all
> generate int conversion = max(`deltas')
> replace conversion = 0 < conversion 
> generate int reversion = min(`deltas')
> replace reversion = reversion < 0
> generate byte both = conversion * reversion
> local line_size `c(linesize)'
> set linesize 200
> list, noobs separator(0) abbreviate(20)
> set linesize `line_size'
> exit
> 
> 

