# Re: st: Re: RE: CASE-CONTROL STUDY

From: David Airey
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Re: RE: CASE-CONTROL STUDY
Date: Sun, 15 Mar 2009 12:12:57 -0500

At that URL there are two files, optmatch2.ado and optmatch2.hlp.

You can save these files where Stata wants them, and then they will be available to you at the Stata command line. There won't be any menus.
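
A minimal sketch of how one might check this, assuming the two files were downloaded by hand. `sysdir` lists the directories Stata searches, and the PERSONAL directory it reports is a conventional place for user-written files. The copy step is left as a comment because the download location is not known here.

```stata
* List the directories Stata searches for ado-files. PERSONAL is a usual
* home for files installed by hand (the exact path differs by machine).
sysdir

* After copying optmatch2.ado and optmatch2.hlp into that directory,
* check whether Stata can find the command.
capture which optmatch2
if _rc {
    display as error "optmatch2 was not found on the ado-path"
}
else {
    which optmatch2
}
```

If the command is found, typing `help optmatch2` should display the help text reproduced below.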
You might also track down the author. He has offered email assistance with this program.

-Dave

```
----------------------------------------------------------------------------------------
help for optmatch2
----------------------------------------------------------------------------------------

Optimal Matching

Syntax

    optmatch2 casecontrol varlist [if] [in] [, options ]

    options               description
    ----------------------------------------------------------------------------------
    Main
      minc(#)             Minimum control:case ratio.
      maxc(#)             Maximum number of controls per set.
      nc(#)               Total number of controls to include in match.
      gen(newvar)         A new variable to contain the number of the case-control
                            set each subject belongs to.
      caliper(#)          Limit on acceptable matching.
      measure(string)     Type of dissimilarity measure to use.
      epsilon(#)          Stability constant.
      repeat              If requested number of controls cannot be matched, produce
                            match with as many controls as possible.
    ----------------------------------------------------------------------------------
```

Description

The command optmatch2 performs optimal matching using the network flow methodology outlined in Rosenbaum (1989). The variable casecontrol contains 1 for cases and 0 for controls. The variable(s) on which matching is to be performed are given by varlist. If there is more than one variable in varlist, there are a number of ways of calculating a distance between a case and a control: see the option measure below for more information.

Options

Main

minc(#) Minimum control:case ratio. May be less than 1: e.g. 0.5 means the same control can be mapped to 2 cases. Default value is 1.

maxc(#) Maximum number of controls per case-control set. Must be an integer >= 1: default value is 1.

nc(#) Total number of controls to be used in the match. Defaults to all the controls in the dataset. Can be set to any integer less than or equal to this: requesting more controls than exist in the dataset will cause optmatch2 to fail with an error message.

gen(newvar) If given, this will create a new variable containing an identifier for the case-control set this individual belongs to. If it is not given, a variable called set is created, unless it already exists, in which case optmatch2 will fail with an error message.

caliper(#) This sets the maximum allowable discrepancy between a case and a control within a matched set. By default, no caliper is set and every control can, in theory, be matched to any case.

measure(string) This is only of importance if there is more than one variable in varlist. In this case, it determines the metric to use when converting differences in several variables to one overall dissimilarity measure. The standard measures that Stata can use are outlined in measure_option. Of these, optmatch2 can use L(#), Lpower(#) and Linfinity and their various aliases, with the default being L2. In addition, it can accept a value mahal to use the Mahalanobis distance.

epsilon(#) Default value is 0.000001. Technically, the optimal matching method only works if all discrepancies between cases and controls are greater than zero. This value is added to all discrepancies to ensure that this is the case. The value of epsilon can affect the matching if minc < 1: see Hansen and Klopfer (2006) for a discussion of this.

repeat It may be impossible for optmatch2 to find a matching that matches the requested number of controls (nc). This may be a logical impossibility (there are not that many controls in the data) or an empirical one (if you use caliper to define the maximum allowable discrepancy in a match, it may not be possible to match all controls to a case). If you give the repeat option, it will report how many controls it can match, then perform the matching with that number of controls. Otherwise, it will simply report the maximum number of controls it could match.

Remarks

The command optmatch2 produces matched sets, that is groups consisting of one or more cases and one or more controls, with the dissimilarities between subjects in a set being as small as possible. By default, it produces matched pairs (1 case and 1 control), but this can be changed using the options minc, maxc and nc. For example, minc(1) maxc(1) will produce the default 1 to 1 matching, whilst minc(3) maxc(3) will produce sets which all consist of 1 case and 3 controls.
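
As a rough illustration of the 1 case to 3 controls design just described, here is a sketch that is not part of the help file. It uses only the options documented above. The variable names group_a, age, sex and matchset are hypothetical, and nc(900) assumes 300 cases with 3 controls each, as in the question quoted later in this thread. It has not been run against optmatch2 itself.

```stata
* Hypothetical variables: group_a (1 = case, 0 = control), age, sex.
* Ask for sets of exactly 1 case and 3 controls, matched on age and sex,
* using 900 controls in total, and store the set identifier in matchset.
optmatch2 group_a age sex, minc(3) maxc(3) nc(900) gen(matchset)

* Count the subjects that received a set identifier (this assumes that
* unmatched subjects are left with a missing value, which the help text
* does not state).
count if !missing(matchset)
```

Adding caliper(#) or repeat, as documented in the options above, would change what happens when not every case can be given 3 acceptable controls.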

More complex matchings can be achieved by using values of minc less than 1. For example, minc(0.5) maxc(2) will produce sets consisting of either 1 control and 2 cases, 1 control and 1 case or 2 controls and 1 case.

References

Ben B. Hansen and Stephanie Olsen Klopfer (2006) "Optimal Full Matching and Related Designs via Network Flows", Journal of Computational and Graphical Statistics 15(3): 609-627.

Paul R. Rosenbaum (1989) "Optimal Matching for Observational Studies", JASA 84(408): 1024-1032.

Author

Mark Lunt, ARC Epidemiology Unit

The University of Manchester

Please email mark.lunt@manchester.ac.uk if you encounter problems with this program

On Mar 15, 2009, at 11:52 AM, Ishay Barat wrote:

Dear Kieran and David

There can be lots of arguments about why one designs a backward study and not a forward one. In my case, I am responsible for a lot of patients going through my department, and from time to time I need to carry out quality control of our patient management. It would have been nice to have a quarter of a million $ and 3 years to carry out a study, but that's not reality.

Sorry.

As my objective is geriatric patients, and my data include the general internal medicine ward clientele, I would like to reduce the noise that younger and far healthier patients introduce into my data.

By matching on some crucial parameters like age, sex, medication and disease, I may get answers to my questions.

As to my anagram. It is just for fun and nothing else.

As to the reference to http://personalpages.manchester.ac.uk/staff/mark.lunt/optmatch.html

I installed the files, but cannot find the command in the menus.

*¸..· ´¨)) -:¦:-        *
¸.·´ .
(( -:¦:- * Ishay *  -:¦:-
´·..          ..·´
((¸¸.·´* -:¦:-

_________________________________________________________-

Matching is an element of the design of a study, planned before the data is collected, and should be done for efficiency, not control. If you already have the data, you gain nothing by matching. You have a sample size of 2,500. If you match these data in the way you have indicated, you will end up with a matched sample size of 1,200. Why would you want to discard over half of your data?

You should analyse the data as they are and control for age, sex, etc. in the analysis.
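
One way this advice might look in Stata is sketched below. It is an illustration only: the thread names no outcome variable, so outcome, group_a, age and female are hypothetical names, and the data are simulated purely so that the example runs on its own.

```stata
* Simulated stand-in data: 2,500 patients, the first 300 flagged as the
* group of interest. All variable names and values are hypothetical.
clear
set seed 20090315
set obs 2500
generate byte group_a = (_n <= 300)
generate double age = 40 + 50*runiform()
generate byte female = (runiform() < 0.5)
generate byte outcome = (runiform() < invlogit(-3 + 0.03*age + 0.5*group_a))

* Logistic regression on all patients, adjusting for age and sex in the
* model instead of discarding unmatched patients.
logistic outcome group_a age female
```

With real data, only the final `logistic` line would be needed, with further covariates such as medication and disease added to the variable list.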

______________________________________________
Kieran McCaul MPH PhD
WA Centre for Health & Ageing (M573)
University of Western Australia
Level 6, Ainslie House
48 Murray St
Perth 6000
Phone: (08) 9224-2140
Fax: (08) 9224 8009
email: kamccaul@meddent.uwa.edu.au
http://myprofile.cos.com/mccaul
http://www.researcherid.com/rid/B-8751-2008
______________________________________________
Epidemiology is so beautiful and provides such an important perspective on human life and death, but an incredible amount of rubbish is published. Richard Peto (2007)

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Ishay Barat
Sent: Sunday, 15 March 2009 1:28 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: CASE-CONTROL STUDY

HELLO

I've got a data set containing about 2500 patients, of whom 300 are of interest to me (Group A).

I would like to extract a sample of 900 patients (Group B) out of the data set that match Group A in age, sex and some other parameters: a classical case-control study with 3 controls for each case.

Does anybody have a clue what the syntax looks like?

*¸..· ´¨)) -:¦:-        *
¸.·´ .
(( -:¦:- * Ishay *  -:¦:-
´·..          ..·´
((¸¸.·´* -:¦:-

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/