Re: st: Converting count to dichotomous variable


From   Eric Booth <eric.a.booth@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Converting count to dichotomous variable
Date   Sat, 10 Mar 2012 15:57:40 -0600

On Mar 10, 2012, at 3:05 PM, Shittu, Aminu wrote:
> 
>  Is it possible to force them in a single graph with 7 lines each for a pond, showing monthly totals?




Yes, that is possible. Building on the previous example:

***************!
clear

*--fake data
set obs 7
g var3 = _n
expand 365
sort var3
g var2 = int(30+runiform()*100)
g var1 = int(runiform()*4)
su var*
by var3:  g day = _n

*---create fake month var:
by var3: g month = int(runiform()*12+1)
loc i = 1
foreach m in `c(Mons)' { 
	loc l `l' `i' `"`m'"' 
	loc i `++i'
	} //label.months
lab def jj `l', modify
lab val month jj
ta month //use your real month data for real day counts




*---graph: monthly totals for each pond (var3)

*--1. create monthly totals of mortalities(var1)
bys month var3: egen mt_var1 = total(var1)
lab var mt_var1 "monthly total mortalities for each pond"
bys month: egen mt_var2 = total(var1)
lab var mt_var2 "monthly total mortalities for ALL ponds"
lab var var3 "Pond"


*--2. graphs

**graph macro for options:
loc opts `"name(g1, replace) xtitle(Month) xlabel(#12, labels labsize(small) angle(forty_five) valuelabel) title(Mortalities Totals each Month) scheme(sj) xsize(10) ysize(6)"' //watch for wrapping!!


**a. Overall Totals
twoway (line mt_var2 month,  ///
	 sort  cmissing(y)),  ///
	 subtitle({bf:All Ponds}) `opts' 
    graph save g1 `"overall.gph"', replace

**b. by Pond
 *-build graph command:
	levelsof var3, loc(ponds)
 foreach x in `ponds' {
	loc p  `" `p'  (line mt_var1 month if var3 == `x', sort) "' 
	loc leg `" `leg' `x' `"Pond `x'"' "'
	} //line for each pond
	di `"`leg'"'
twoway `p',  legend(on order(`"`leg'"') size(vsmall)) ///
	subtitle({bf:By Ponds}) `opts'
    graph save g1 `"ByPond.gph"', replace

***************!
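
If your data have a real date variable, a shorter route may be to collapse to one row per pond/month and let -xtline- overlay the ponds. This is only a sketch under assumptions: the names pond, day, date, and deaths are made up here (they stand in for var3, day, and var1 above), the year 2011 is an arbitrary non-leap year, and the seed is set only so the fake data are reproducible. Note that -collapse- replaces the data in memory, so wrap it in -preserve-/-restore- if you need the daily records afterwards.

```stata
clear
set seed 12345                          // assumption: any seed will do
set obs 7
gen pond = _n                           // hypothetical name, stands in for var3
expand 365
bysort pond: gen day = _n
gen date  = mdy(1, 1, 2011) + day - 1   // assumed non-leap year, so 365 days
gen month = month(date)                 // real calendar month, 1..12
gen deaths = int(runiform()*4)          // fake daily mortalities, stands in for var1

*--one row per pond/month holding the monthly total
collapse (sum) deaths, by(pond month)

*--one line per pond in a single graph
xtset pond month
xtline deaths, overlay xtitle(Month) ytitle(Monthly mortalities) name(g2, replace)
```

Because month comes from the date itself, February contributes 28 days and the other months 30 or 31, so nothing has to be forced by hand.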



- Eric

__
Eric A. Booth
Public Policy Research Institute 
Texas A&M University
ebooth@ppri.tamu.edu
+979.845.6754

On Mar 10, 2012, at 3:05 PM, Shittu, Aminu wrote:

>  Hi Eric and all,
> 
> Thank you very much, it works perfectly!
> 
> May I further ask how I could plot a single line graph showing the daily or monthly mortality totals in each of the 7 ponds? My hurdle is that mortality was recorded every day for 365 days (365*7), say from 1st - 31st of January ... 1st to 31st of December for each of the 7 ponds. If I am to work with the monthly record, I will have 30*12*7 or 31*12*7 rows/pond except for February which has 28 in this case. Is it possible to force them in a single graph with 7 lines each for a pond, showing monthly totals?
> 
> Aminu.
> 
> 
> 
> 
> 
> 
> 
> ________________________________
> From: Eric Booth <eric.a.booth@gmail.com>
> To: statalist@hsphsun2.harvard.edu 
> Sent: Friday, March 9, 2012 10:23:31 PM
> Subject: Re: st: Converting count to dichotomous variable
> 
> 
> 
> <>
> 
> Sorry, your mortalities were in var1, not var2, so the line:
> 
> bys day: egen var4_b = max(var2)
> 
> should have said:
> 
> bys day: egen var4_b = max(var1)
> 
> but the main point is the same.
> 
> EAB
> 
> __
> Eric A. Booth
> Public Policy Research Institute
> Texas A&M University
> ebooth@ppri.tamu.edu
> Office: +979.845.6754
> 
> 
> 
> 
> On Mar 9, 2012, at 3:57 PM, Eric Booth wrote:
> 
> <>
> 
> One strategy is to take the max number of mortalities (via -egen-) across all obs. in each day (or each pond/day -- I'm not sure which you need) and then recode anything greater than zero as a "1" indicating that mortalities occurred (leaving it as zero otherwise).  Here is a quick example:
> 
> 
> ***************!
> clear
> 
> *--fake data
> set obs 7
> g var3 = _n
> expand 365
> sort var3
> g var2 = int(30+runiform()*100)
> g var1 = int(runiform()*4)
> su var*
> 
> 
> 
> **var4**
> *--daily mortality
> by var3:  g day = _n
> 
> *---1. for each pond/day
> g var4_a = 1 if var1 != 0 //1 = mortalities occurred in this pond/day
> replace var4_a = 0 if mi(var4_a)
> 
> *---2.  for each day only
> bys day: egen var4_b = max(var2)
> recode var4_b (1/max = 1) (0=0)
> 
> su var4*
> 
> 
> ***************!
> 
> - Eric
> 
> __
> Eric A. Booth
> Public Policy Research Institute
> Texas A&M University
> ebooth@ppri.tamu.edu
> Office: +979.845.6754
> 
> 
> On Mar 9, 2012, at 3:30 PM, Aminu Shittu wrote:
> 
> Dear Statalisters,
> 
> I have var1 in my data set, which represents the number of deaths in a fish pond, and var2, which represents the existing number. The daily mortality was recorded in 7 fish ponds (var3) for a period of 1 year (365x7 rows). I am interested in creating a dichotomous var4 to indicate whether a daily mortality had occurred or not, without taking the number of counts into consideration. Is it possible to do this in Stata or Excel?
> 
> Aminu.
> *
> *   For searches and help try:
> *  http://www.stata.com/help.cgi?search
> *  http://www.stata.com/support/statalist/faq
> *  http://www.ats.ucla.edu/stat/stata/