Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Nick Cox <njcoxstata@gmail.com> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | Re: st: Generating days eligible when eligibility changes over time |
Date | Fri, 7 Mar 2014 13:02:47 +0000 |
This is an interesting problem. I haven't tried understanding your code exactly, but note that you seem to be confusing the -if- command and the -if- qualifier, perhaps due to some previous or concurrent life as a SAS user.

0. I don't think I understood about the "survey date" and when that was.

1. In your sample data, birthdays often look off, so that (e.g.) the first two individuals capriciously celebrate their birthdays a day earlier than usual in some or all cases.

06sep1995 05sep2002 05sep2004 05sep2006
06dec1994 05dec2001 06dec2003 05dec2005

Birthdays in my view are better calculated by Stata. I know of one predictable problem and presume that people born on 29 February are deemed to celebrate their birthday on 28 February when it's not a leap year.

2. The exact rule could be that eligibility ends (a) on a particular birthday or (b) on the day before, but I'll let you worry about that and add or subtract 1 according to whether code is correct or off.

This code gives some technique that may help.

clear
input str9 birthday
01jan1950
01jan1995
01apr1995
01jan1997
01apr1999
31dec1991
end

gen dbirthday = date(birthday, "DMY")

local start1 = date("01jan1998", "DMY")
local limit1 = 7
local start2 = date("01apr2003", "DMY")
local limit2 = 9
local start3 = date("01jan2004", "DMY")
local limit3 = 11
local start4 = date("01jan2005", "DMY")
local limit4 = 14
local start5 = date("01jan2008", "DMY")
local limit5 = 15
local start6 = date("01jan2010", "DMY")
local limit6 = 18

gen limitdate = .
gen eligible = 0

qui forval j = 1/6 {
    * birthday on which the age limit for rule j is reached
    replace limitdate = mdy(month(dbirthday), day(dbirthday), year(dbirthday) + `limit`j'')
    * born on 29 February, limit year not a leap year: use 28 February of the limit year
    replace limitdate = mdy(2, 28, year(dbirthday) + `limit`j'') if missing(limitdate) & !missing(dbirthday)
    replace eligible = eligible + max(0, limitdate - `start`j'')
}

list , sep(0)

I would always recommend working with a test dataset for which the correct answers can be calculated independently.
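As a further sketch, the count can be restricted so that each age limit applies only from its own start date until the next change, and never before the child was born. This rests on assumptions that are not in the thread: a hypothetical survey date of 01jul2010 (stored as a seventh start date so that it closes the last period), rule (a) above (eligibility ends on the birthday itself), and counting each period as start date up to but not including the next start date.

```stata
clear
input str9 birthday
01jan1950
01jan1995
01apr1995
01jan1997
01apr1999
31dec1991
end

gen dbirthday = date(birthday, "DMY")
format dbirthday %td

local start1 = date("01jan1998", "DMY")
local limit1 = 7
local start2 = date("01apr2003", "DMY")
local limit2 = 9
local start3 = date("01jan2004", "DMY")
local limit3 = 11
local start4 = date("01jan2005", "DMY")
local limit4 = 14
local start5 = date("01jan2008", "DMY")
local limit5 = 15
local start6 = date("01jan2010", "DMY")
local limit6 = 18
* ASSUMPTION: hypothetical survey date, not given in the thread
local start7 = date("01jul2010", "DMY")

gen limitdate = .
* missing birthdays stay missing, because missing + anything is missing
gen eligible = 0 if !missing(dbirthday)

qui forval j = 1/6 {
    local k = `j' + 1
    * birthday on which the age limit for rule j is reached
    replace limitdate = mdy(month(dbirthday), day(dbirthday), year(dbirthday) + `limit`j'')
    replace limitdate = mdy(2, 28, year(dbirthday) + `limit`j'') if missing(limitdate) & !missing(dbirthday)
    * days within period j on which the child was already born and still under the limit
    replace eligible = eligible + max(0, min(limitdate, `start`k'') - max(dbirthday, `start`j''))
}

list birthday eligible, sep(0)
```

For the second test case (born 01jan1995) this gives 4109 days, which can be checked independently: 4564 days from 01jan1998 to 01jul2010, less the 455 days between the seventh birthday on 01jan2002 and the rule change on 01apr2003. The first case (born 01jan1950) gives 0.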
Nick
njcoxstata@gmail.com

On 7 March 2014 00:19, Brill, Robert <robertbrill@austin.utexas.edu> wrote:
> Hi Listers,
>
> I'm working on a project in which I am generating an "intended dosage"
> variable equal to the number of days that a person was
> eligible for a cash grant program. In short, I'd like to know how many
> days a child was eligible for the program between the program start and
> the survey date. Eligibility is based on being under an age cutoff. The
> government in question, however, changed the eligibility age 4 times over
> the course of the period of interest (1998-2010).
>
> The program began on 01jan1998, and was available to children under 7. On
> 01apr2003 the eligibility was increased to 9, on 01jan2004 to 11, on
> 01jan2005 to 14, on 01jan2008 to 15, and on 01jan2010 to 18.
>
> Therefore, if an individual turned 7 after the first eligibility change
> occurred, she was eligible from the beginning of the program in 1998 until
> the date of the survey in 2010. However, if she turned 7 before the
> change, she is only eligible for the number of days from the beginning of
> the program until the date she turned 7. It's also possible that a
> different child with a different birthday would fluctuate back and forth
> in eligibility as the changes occurred (depending on her birthday,
> obviously).
>
> To generalize to all cases, I've considered generating a dose variable for
> each of the periods (1998-2003, 2003-2004, 2004-2005, 2005-2008, and
> 2008-2010) which uses if and if else statements to take all of the types
> of cases into account when subtracting birthdays and eligibility change
> days etc., or, alternatively, generating a couple hundred month indicators
> (or a few thousand days) that are 1 if an individual was eligible in that
> month (or day), then calculating the row total to tell me the number of
> months (or days) eligible.
>
> However, I'd much prefer to have a fully accurate measure to the day, and
> it seems a bit too crazy to generate all of those indicators when there's
> likely another way, but I haven't been able to think my way out of my
> current funk.
>
> The variable "birthday" is a %td formatted individual birthday. "turned7"
> is the date that the individual turned 7 (and exists for each age,
> turned9, etc.).
>
> Right now, the code looks like:
>
>
> ** Input some example data
> input str9 birthday str9 turned7 str9 turned9 str9 turned11
> 06sep1995 05sep2002 05sep2004 05sep2006
> 06dec1994 05dec2001 06dec2003 05dec2005
> 23oct1995 22oct2002 22oct2004 22oct2006
> 01dec1994 30nov2001 01dec2003 30nov2005
> 08sep1994 07sep2001 08sep2003 07sep2005
> 26jun1995 25jun2002 25jun2004 25jun2006
> 10nov1995 09nov2002 09nov2004 09nov2006
> 12feb1994 11feb2001 12feb2003 11feb2005
> 16nov1995 15nov2002 15nov2004 15nov2006
> 10feb1994 09feb2001 10feb2003 09feb2005
> 01sep1994 31aug2001 01sep2003 31aug2005
> 26oct1994 25oct2001 26oct2003 25oct2005
> 21nov1994 20nov2001 21nov2003 20nov2005
> 03nov1994 02nov2001 03nov2003 02nov2005
> 10mar1994 09mar2001 10mar2003 09mar2005
> 31oct1994 30oct2001 31oct2003 30oct2005
> 08aug1994 07aug2001 08aug2003 07aug2005
> 21dec1994 20dec2001 21dec2003 20dec2005
> 10may1994 09may2001 10may2003 09may2005
> 19apr1994 18apr2001 19apr2003 18apr2005
> 05nov1994 04nov2001 05nov2003 04nov2005
> 18dec1995 17dec2002 17dec2004 17dec2006
> 24may1995 23may2002 23may2004 23may2006
> 15aug1994 14aug2001 15aug2003 14aug2005
> 05apr1995 04apr2002 04apr2004 04apr2006
> end
>
> ** format inputted data
> foreach v of varlist *{
>     g td`v'=date(`v', "DMY")
>     drop `v'
>     rename td`v' `v'
>     format `v' %td
> }
>
>
> **generate dose by period vars
>
> forval i=1/5 {
>     g p`i'dose=.
> }
>
>
> ** use a bunch of if/else statements to generate dosage by period to be summed later
> foreach e in 7 {
> local start "mdy(01,01,1998)"
> local p1 "mdy(04,01,2003)"
> local p2 "mdy(01,01,2004)"
> local p3 "mdy(01,01,2005)"
> local p4 "mdy(01,01,2008)"
> local p5 "mdy(01,01,2010)"
>
> if turned7>`p1' {
>     replace p1dose=`p1'-birthday
> }
>
> else if turned7<`p1' & birthday>`start' {
>     replace p1dose=turned7-birthday
> }
>
> else if turned7<`p1' & birthday<`start' {
>     replace p1dose=turned7-`start'
> }
>
> if turned9>`p2' & turned7>`p1' {
>     replace p2dose=`p2'-`p1'
> }
>
> else if turned9<`p2' & turned9>`p1' {
>     replace p2dose=turned9-`p1'
> }
> ** etc etc etc
>
>
>
> Or alternatively, the code could look like
>
>
> forval m=1/12 {
>
> forval y=1998/2012 {
>
> if `y'<2003 { //this has a problem because of the april thing but you get the idea
>     g drop`m'`y'=(((mdy(`m',01,`y')-birthday)/365.25)<7)
> }
>
> else if `y' >2002 & `y'<2005 {
> ** etc etc etc
> }
>
>
>
> While I'm not entirely opposed to doing one of these ways, I'm wondering
> if there is some obvious or less-obvious way to go through this in a more
> efficient way or that perhaps requires less opaque code.
>
> Any hints, tips, or tricks are very welcome and will be received with
> extreme gratitude, I've been banging my head against a wall on this for a
> while now and would love to hear new perspectives on the problem.
>
>
> Thanks greatly for your time, and for all the great help that I've gotten
> in the past by searching the archive.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/