Statalist


Re: st: Matching in STATA


From   Henry <jakanyada@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Matching in STATA
Date   Sun, 22 Jun 2008 21:07:32 +0100

Many thanks for your suggestions, which will help me a great deal.
Svend, I am using longitudinal primary care data extracted by Dec 2002.
A case-control study design has been used to identify an association
between an outcome (event) and an exposure.  I would like to control
for potential confounders (age and sex) and am therefore matching on
these variables.
I am interested in a 2x2 table to obtain the odds ratios. This should
give me the concordant and discordant pairs once the matching is done.
Conditional logistic regression will definitely be the way forward for
our analysis.
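A rough sketch of that analysis step, assuming 1:1 matched data in long
form. The data are a made-up toy example, and the names pairid, case,
exposed, case_exposed and control_exposed are hypothetical:

```stata
* Toy data (hypothetical): one row per subject, two rows per matched pair
clear
input pairid case exposed
1 1 1
1 0 0
2 1 1
2 0 1
3 1 0
3 0 0
4 1 1
4 0 0
5 1 0
5 0 1
6 1 1
6 0 0
end

* Conditional logistic regression, one stratum per matched pair
clogit case exposed, group(pairid) or

* One row per pair gives the 2x2 table of concordant and discordant pairs
reshape wide exposed, i(pairid) j(case)
rename exposed1 case_exposed
rename exposed0 control_exposed
mcc case_exposed control_exposed
```

With 1:1 pairs, only the discordant pairs carry information about the
odds ratio, so the two commands should agree on the point estimate.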
My problem can be solved by what listers suggested: using dummy
variables and then matching on age groups, excluding already matched
cases.
I guess this is also possible when we extend to 1:n matching
(though I am not sure until I try it out).
Another important variable to introduce will be a pair ID for matched pairs.
I had looked at what Maarten suggested but wasn't sure how to
implement the packages, and I was also reluctant to use -sttocc- given
the nature of my data, but with these suggestions I can have a second look.
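
For the record, one possible way to match within an age range rather
than on exact age is -joinby- followed by a filter on the age
difference. This is only a sketch under assumptions: the first case and
its three controls are taken from the example table in my original
post, the second case and the remaining controls are invented, the
variable names (case_id, control_id, setid) are hypothetical, and the
random greedy assignment of controls is just one simple choice, not an
optimal matching.

```stata
* Cases: the first row mirrors the example in the thread, the second is made up
clear
input str4 case_id case_age sex
"00b7" 35 1
"00c2" 40 2
end
tempfile cases
save `cases'

* Controls: the first three mirror the example in the thread, the rest are made up
clear
input str4 control_id control_age sex
"00YP" 35 1
"0XC1" 33 1
"0001" 36 1
"0A11" 41 2
"0B22" 50 2
end

* Form every case-control combination within the same sex
joinby sex using `cases'

* Keep candidate pairs whose ages differ by at most 2 years
keep if abs(case_age - control_age) <= 2

* Use each control for one case only, chosen at random
set seed 12345
gen double u = runiform()
bysort control_id (u): keep if _n == 1

* Keep up to 3 controls per case (use _n == 1 for 1:1 matching)
bysort case_id (u): keep if _n <= 3

* Matched-set identifier for use in -clogit, group()-
egen long setid = group(case_id)
sort setid control_id
list setid case_id case_age sex control_id control_age, sepby(setid)
```

Because controls are assigned at random before the per-case limit is
applied, some cases can end up with fewer controls than requested, or
with none; those would need another round with a wider age window.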

*************************************************************************************************************************************************
Dear Henry,
I guess I would generate a new dummy variable (in both data sets)
coding the age groups you want to merge by, and then merge on this
new age-group variable.
Kind regards,
Andrea
*************************************************************
There are now quite a lot of packages available in this area, see:
-findit match treatment- (and try some other searches with -findit-)
-- Maarten
*************************************************************
I get the impression that data have already been collected, and that
the purpose of matching is to facilitate analysis (at the cost of
dropping some of the control observations). Actually, matching
complicates rather than facilitates analysis in case-control studies;
at least you need to use conditional logistic regression (or -mcc-) to
analyse correctly. So, if my impression is right, the recommendation
is to analyse with -logistic- (or -cc-) including the potential
confounders of interest, without matching and without removing any of
the control observations. A variable like age could be grouped, e.g.,
in five-year groups.
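
A minimal sketch of this unmatched analysis, run on simulated data
since the real data are not shown. All variable names and the
simulation parameters are hypothetical. The i. factor-variable notation
needs Stata 11 or later; with older releases the xi: prefix does the
same job.

```stata
* Simulated data (hypothetical), only to make the example runnable
clear
set seed 2008
set obs 400
gen byte sex = 1 + (runiform() < 0.5)
gen age = 30 + floor(30*runiform())
gen byte exposed = runiform() < 0.3
gen byte case = runiform() < invlogit(-1.5 + 0.7*exposed + 0.02*(age - 45))

* Five-year age groups labelled by their lower bound: 30, 35, 40, ...
gen agegrp = 5*floor(age/5)
label variable agegrp "Age group (lower bound of 5-year band)"

* Unmatched analysis: all controls kept, confounders entered as covariates
logistic case exposed i.agegrp i.sex
```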
Anyway, if you want or need to match, the usual way is to categorize
a variable in, e.g., five-year groups: 30-34, 35-39, etc. This is
handier, and it also facilitates reporting the results (you can
stratify by age group).
Hope this helps
Svend
**************************************************************
On Fri, Jun 20, 2008 at 3:48 PM, Salah Mahmud <salah.mahmud@gmail.com> wrote:
> For completeness, also see    [ST] sttocc -- Convert survival-time
> data to case-control data
>
> sttocc automates the process of sampling matched controls. It is
> intended to generate nested case-control data from cohort data, but
> it should not be difficult to "fool" it into sampling from
> cross-sectional data.
>
> You still need to create the grouped age variable as per above posts.
>
> In my experience, you will require several rounds of matching with
> increasingly permissive age grouping to find matches to all your cases
> unless you have lots of data and only 1 or 2 matching variables. This
> could be implemented within a for loop where each successive loop
> drops and then creates an age grouping variable that is slightly
> cruder than its predecessor.
>
> For instance,
> round   age group variable
> 1       agegroup = age ("exact" matching)
> 2       agegroup = age collapsed into 2-year intervals
> 3       agegroup = age collapsed into 3-year intervals
> and so on.
>
> Of course, you will need to exclude any matched cases (and perhaps
> controls) before merging the unmatched cases to the remaining
> controls.
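
The rounds described above could be sketched roughly as follows. This
is one possible implementation on simulated data, not Salah's own code:
it does 1:1 matching by pairing unmatched cases and controls in random
order within each sex and age-group stratum, and all variable names
(id, setid, matchround and the temporary helpers) are hypothetical.

```stata
* Simulated data (hypothetical): 40 cases and 260 potential controls
clear
set seed 42
set obs 300
gen long id = _n
gen byte case = _n <= 40
gen byte sex = 1 + (runiform() < 0.5)
gen age = 30 + floor(40*runiform())
gen long setid = .        // matched-pair identifier, filled in round by round
gen byte matchround = .   // round in which the subject was matched

forvalues w = 1/3 {
    * Age grouping gets cruder each round: 1-year (exact), 2-year, 3-year
    capture drop agegroup
    gen agegroup = floor(age/`w')

    * Only subjects still unmatched take part in this round
    gen byte todo = missing(setid)
    gen double u = runiform()

    * Number cases and controls separately, in random order, within stratum
    bysort todo sex agegroup case (u): gen long k = _n

    * The k-th case and the k-th control of a stratum form a pair if both exist
    bysort todo sex agegroup k (case): gen byte hit = todo & (_N == 2)

    * Label the pair with the id of its case (sorted last within the pair)
    bysort todo sex agegroup k (case): replace setid = id[_N] if hit
    replace matchround = `w' if hit

    drop todo u k hit
}

* Missing matchround means the subject was never matched
tabulate matchround case, missing
```

The resulting setid can then be passed to -clogit- as the group()
variable, after dropping subjects with missing setid.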
>
> On Fri, Jun 20, 2008 at 6:25 AM, Svend Juul <SJ@soci.au.dk> wrote:
>>
>> Henry wrote:
>>
>> I would like to carry out some matching for a case-control study using
>> STATA but it's proving to be a bit challenging for me. I have checked
>> the archives, but a query close to mine on Statalist in 2004 was not
>> answered. Could there be a way of matching cases to controls within a
>> range of values, say for age, so that a 40yr old case-patient can be
>> matched to either a 38 or 39 or 40 or 41 or 42yr old control-patient?
>> I have used the -merge- command to merge two datasets by sex and age
>> of patients, but it only works for a 40yr old case matching a 40yr old
>> control. For this case I am still interested in a 1:1 matching, but
>> what if I extend this to a 1:n match?  I want to have something of
>> this sort:
>>
>> case-patient  case-age  sex  control-patient  control-age
>>        00b7        35    1             00YP           35
>>        00b7        35    1             0XC1           33
>>        00b7        35    1             0001           36
>>
>> ==================================================================
>>
>> I get the impression that data have already been collected, and that
>> the purpose of matching is to facilitate analysis (at the cost of
>> dropping some of the control observations). Actually, matching
>> complicates rather than facilitates analysis in case-control studies;
>> at least you need to use conditional logistic regression (or -mcc-) to
>> analyse correctly. So, if my impression is right, the recommendation
>> is to analyse with -logistic- (or -cc-) including the potential
>> confounders of interest, without matching and without removing any of
>> the control observations. A variable like age could be grouped, e.g.,
>> in five-year groups.
>>
>> Anyway, if you want or need to match, the usual way is to categorize
>> a variable in, e.g., five-year groups: 30-34, 35-39, etc. This is
>> handier, and it also facilitates reporting the results (you can
>> stratify by age group).
>>
>> Hope this helps
>> Svend
>>
>>
>> __________________________________________
>>
>> Svend Juul
>> Institut for Folkesundhed, Afdeling for Epidemiologi
>> (Institute of Public Health, Department of Epidemiology)
>> Vennelyst Boulevard 6
>> DK-8000  Aarhus C, Denmark
>> Phone:  +45 8942 6090
>> Home:   +45 8693 7796
>> Email:  sj@soci.au.dk
>> __________________________________________
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/support/faqs/res/findit.html
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>


