
From: "Kieran McCaul" <Kieran.McCaul@uwa.edu.au>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: Finding controls within 5 years of age of a case
Date: Thu, 1 Oct 2009 07:00:18 +0800

... I can't help with the code for this, but I believe that it's called calliper matching, and this form of age matching can induce some bias in a study.

The age distribution in the population is not uniform. This means that, for a particular age, the proportion of the population within 5 years above that age will differ from the proportion within 5 years below it. For example, suppose a case is 45 years old and the proportion of the population aged 45 to 50 is greater than the proportion aged 40 to 45. If you calliper match a large number of cases aged in their 40s, the controls will tend, on average, to be older than the cases. If you match on 5-year age groups, this doesn't happen.

I haven't got the book with me at the moment, but I think that Rothman and Greenland discuss this in Modern Epidemiology.

______________________________________________
Kieran McCaul MPH PhD
WA Centre for Health & Ageing (M573)
University of Western Australia
Level 6, Ainslie House
48 Murray St
Perth 6000
Phone: (08) 9224-2701
Fax: (08) 9224 8009
email: Kieran.McCaul@uwa.edu.au
http://myprofile.cos.com/mccaul
http://www.researcherid.com/rid/B-8751-2008
______________________________________________
If you live to be one hundred, you've got it made. Very few people die past that age - George Burns

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Michael McCulloch
Sent: Wednesday, 30 September 2009 11:40 PM
To: statalist@hsphsun2.harvard.edu
Subject: st: Finding controls within 5 years of age of a case

Hello Statalist friends,

I have a list of cases and controls uniquely identified by an ID variable. I'd like to list all controls within 5 years of age of each case, and then mark those controls with a new variable identifying the match. This creates, in effect, a cluster that will be accounted for in analysis.
Is there an efficient way to create that matching variable?

Michael McCulloch
124 Pine St
San Anselmo, CA 94960
Tel. 415-407-1357
Fax 206-338-2391
mm@pinestreetfoundation.org

On Sep 30, 2009, at 8:27 AM, "Brian R. Landy" <landy@alumni.caltech.edu> wrote:

Hi,

I'm not really sure what your question is, but I'm guessing you find rolling: to be slow with a panel? I observed this a while back (and did report it to Stata, but have never seen notice that it was fixed). I found that -rolling- in conjunction with panels is far slower than the time implied by (# panels)*(time for rolling regression on just one panel). In my case a regression was taking over 1 hour on a 4 CPU box; this was for somewhere around 100 panels, 4 years of daily data, and a 2 year rolling regression.

My workaround was to use foreach to loop over the panels, saving and merging the results of each, somewhat like this:

    // prep data
    tsset id date
    gen end=date // for later merging
    tempfile stats
    levelsof id, local(ids)
    foreach id of local ids {
        keep if id==`id'
        quietly: rolling, window(`window') saving(`stats', replace) ///
            nodots: regress y x
        merge id end using "`stats'", sort update replace nokeep
        drop _merge
    }

This took my 1+ hour runtime down to just a few minutes.

Regards,
Brian

Quoting Degas Wright <degasw@decaturcapital.com>:

I have a longitudinal dataset that has 2000 stocks as xticker (id) and dependent variable, return (t+1), with 20 independent variables (t) over 88 periods (months). I am trying to run an xtreg regression over three periods and then use the coefficients from the regression to forecast the t+1 return. When I use the following command:

    .. rolling _b _se, window (3) clear: xtreg return, var1, var2,.var20, vce(cluster xticker)
    (running regress on estimation sample)
    -> xticker = 1
    Rolling replications (86)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    ..........
    -> xticker = 2
    Rolling replications (86)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    .........
It starts going through each of the 2000 stocks, listing xticker 1, xticker 2, etc. I have stopped it prior to the run being completed because it will take a long time to go through all 2000 stocks.

Is there another command that I should be using? For instance, I use the forvalues command to run the regression, xtreg, one period at a time for all of the periods, Period 1, Period 2, etc.

Thank you for your assistance.

Degas A. Wright, CFA
Chief Investment Officer
Decatur Capital Management, Inc.
250 East Ponce De Leon Avenue, Suite 325
Decatur, Georgia 30030
Voice: 404.270.9838
Fax: 404.270.9840
Website: www.decaturcapital.com

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
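
For the original question in this thread (listing every control within 5 years of age of each case), one possible approach is sketched below. It does not come from any of the posters. It assumes hypothetical variables id, age and case (1 = case, 0 = control), and it builds a small toy dataset so that it runs on its own. Every case is paired with every control using -cross-, and only the pairs inside the 5-year caliper are kept; case_id then serves as the variable identifying the match. The final -summarize- is a quick check for the age drift described in the reply above.

```stata
* Toy data for illustration only; variable names are assumptions.
clear
input id age case
1 45 1
2 62 1
3 41 0
4 50 0
5 58 0
6 70 0
end

* Save the controls under their own variable names.
preserve
keep if case == 0
keep id age
rename id control_id
rename age control_age
tempfile controls
save "`controls'"
restore

* Keep the cases, then form every case-control pair.
keep if case == 1
keep id age
rename id case_id
rename age case_age
cross using "`controls'"

* Retain pairs whose ages differ by at most 5 years (the caliper).
* Pairs with a missing age are dropped by this condition as well.
keep if abs(case_age - control_age) <= 5

* case_id now identifies the matched cluster for each listed control.
sort case_id control_id
list case_id case_age control_id control_age, sepby(case_id)

* A positive mean here would say that matched controls are older
* than their cases on average.
generate age_diff = control_age - case_age
summarize age_diff
```

The result has one row per case-control pair, so a control can appear under more than one case. -cross- creates (number of cases) x (number of controls) rows before the -keep-, which may be too much for large datasets. The block also replaces the data in memory, so on real data it would need to start from a saved copy.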

References:
  st: Finding controls within 5 years of age of a case
  From: Michael McCulloch <gmmt@sbcglobal.net>


© Copyright 1996–2015 StataCorp LP