Re: st: Re: Coding Question


From   Steven Samuels <sjsamuels@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Re: Coding Question
Date   Sun, 14 Aug 2011 12:22:17 -0400

Even shorter:

*************CODE BEGINS*************
clear
input str8 id meas1 meas2 ///
   meas3 meas4 meas5
"11020189" 0         1         0         1         .
"11030107" 0         1         1         1         .
"11030206" 1         1         1         .         .
"11030207" 0         0         0         .         .
"11030265" 1         1         0         0         0
"11040003" 0         1         1         1         .
"11040079" 1         0         1         0         .
"11040225" 1         0         0         .         .
"11050079" 0         1         1         1         .
"11050203" 1         0         1         1         .
"11060278" 1         0         1         0         .
"11070069" 1         0         0         .         .
"11070195" 1         0         0         0         .
"11070229" 0         1         0         1         .
"11070245" 1         0         0         0         .
end
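* one row per id and measurement occasion
reshape long meas, i(id) j(order)
* flag 0 -> 1 (conversion) and 1 -> 0 (reversion) between consecutive occasions
bys id (order): gen convert = meas == 1 & meas[_n-1] == 0
bys id (order): gen revert = meas == 0 & meas[_n-1] == 1
preserve
collapse (sum) n_convert=convert n_revert=revert, by(id)
* both = 1 if the id has at least one conversion and at least one reversion
gen both = n_convert*n_revert > 0
list id n_convert n_revert both
restore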
**************CODE ENDS**************

Steve


John Metcalfe wrote:

I have a matrix of serial measurements with some missing values, as
below. I would like to write some simple code that counts each
individual as a 'conversion' (going from 0 to 1), a 'reversion' (going
from 1 to 0), or both. Any ideas appreciated.


Here's an approach using -reshape-.  I've shortened Joseph's names for easier typing.

Steve

*************CODE BEGINS*************
clear
input str8 lid meas1 meas2 ///
  meas3 meas4 meas5
"11020189" 0         1         0         1         .
"11030107" 0         1         1         1         .
"11030206" 1         1         1         .         .
"11030207" 0         0         0         .         .
"11030265" 1         1         0         0         0
"11040003" 0         1         1         1         .
"11040079" 1         0         1         0         .
"11040225" 1         0         0         .         .
"11050079" 0         1         1         1         .
"11050203" 1         0         1         1         .
"11060278" 1         0         1         0         .
"11070069" 1         0         0         .         .
"11070195" 1         0         0         0         .
"11070229" 0         1         0         1         .
"11070245" 1         0         0         0         .
end
gen id = _n
reshape long meas, i(id) j(order)
bys id (order): gen convert = meas == 1 & meas[_n-1] == 0
bys id (order): gen revert = meas == 0 & meas[_n-1] == 1
egen n_convert = total(convert), by(id)
egen n_revert = total(revert), by(id)

preserve
bys id (order): keep if _n==1
gen both = n_convert*n_revert > 0
list id n_convert n_revert both
restore
**************CODE ENDS**************


> On Aug 14, 2011, at 10:15 AM, Joseph Coveney wrote:
> 
> --------------------------------------------------------------------------------
> 
> Something like that below would work.  I'm assuming that rows are individuals
> and columns are serial measurements.  (You didn't describe or show your variable
> names.)
> 
> Joseph Coveney
> 
> input str8 individual_id serial_measurement1 serial_measurement2 ///
>   serial_measurement3 serial_measurement4 serial_measurement5
> "11020189" 0         1         0         1         .
> "11030107" 0         1         1         1         .
> "11030206" 1         1         1         .         .
> "11030207" 0         0         0         .         .
> "11030265" 1         1         0         0         0
> "11040003" 0         1         1         1         .
> "11040079" 1         0         1         0         .
> "11040225" 1         0         0         .         .
> "11050079" 0         1         1         1         .
> "11050203" 1         0         1         1         .
> "11060278" 1         0         1         0         .
> "11070069" 1         0         0         .         .
> "11070195" 1         0         0         0         .
> "11070229" 0         1         0         1         .
> "11070245" 1         0         0         0         .
> end
> 
> local deltas
> forvalues interval = 2/5 {
>   generate int delta`interval' = ///
>       serial_measurement`interval' - ///
>       serial_measurement`=`interval'-1'
>   replace delta`interval' = 0 if missing(delta`interval')
>   local deltas `deltas' delta`interval'
> }
> local deltas : subinstr local deltas " " ", ", all
> generate int conversion = max(`deltas')
> replace conversion = 0 < conversion 
> generate int reversion = min(`deltas')
> replace reversion = reversion < 0
> generate byte both = conversion * reversion
> local line_size `c(linesize)'
> set linesize 200
> list, noobs separator(0) abbreviate(20)
> set linesize `line_size'
> exit
> 
> 


