# RE: st: Match two samples in stata

 Thu, 26 Jul 2012 13:38:16 -0700

I don't understand why you need a comparison group.
If all you want to do is predict which companies issue bonds, you just need
to run the regression.  You'll want to control for size and industry in that
model.
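
A minimal sketch of such a model, using simulated data. The variable names (issued, size, industry) and the data-generating numbers are made up for illustration and are not from the original poster's data:

```stata
* Simulated firms; all names and parameter values here are hypothetical
clear
set seed 12345
set obs 500
generate industry = ceil(5 * runiform())     // industry code 1..5
generate size     = rnormal(10, 2)           // e.g. log of total assets
generate issued   = runiform() < invlogit(-6 + 0.5*size + 0.2*(industry == 3))

* Logit of bond issuance on size, with industry entered as indicator variables
logit issued size i.industry

* Predicted probability of issuing a bond for each firm
predict p_issue, pr
```

Here size and industry enter as controls on the full sample of issuers and non-issuers, so no matching step is needed before estimation.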

Let's assume for a moment that you're just regressing whether a bond was
issued on size and industry.  If you constrain the sample of non-issuers to
have the same distribution of those variables as the sample of issuers, how
does that help you?  You won't be able to determine whether companies in
certain industries were more likely to issue bonds or whether large
companies were more likely to issue bonds because you've already constrained
the distributions of those two variables to be equal across your two outcome
categories.  Why would you want to do that?

-Sarah

Hi Sarah,

Many thanks for your great help.

Actually, I am trying to define a comparison group for a particular sample
of firms that issue bonds. In particular, I want to match non-issuers (comparison
sample) with issuers (test sample) based on size and industry to find the
probability of issuing bonds. Simply, I want to run a logit model to find
the probability of issuing bonds, but need to define a comparison group
before running the regression. If this is clear, please let me know your
suggestion and which command would be useful to define a comparison sample.
I do not need to use PSM.

Best,
Eln

Eln,
I think you need to define what you mean by "matching" in this context.  As
you've noted, you've received some answers about propensity score matching
in particular.  That's one strategy for matching samples.  It's not the only
one, though.
The problem seems to be that you haven't made it clear what your end goal
is.  Are you trying to define a comparison group for a particular sample?
How to do the matching really depends on what you're going to do with the
information afterward.  From some of your previous posts it sounds like
whether a company issued a bond is the outcome you're interested in
studying.  If that's the case you don't want to match the bond issuers with
non-bond issuers, you want to model the process.  Probably by running a
logit (or probit) model as suggested in previous threads.  In that case you
would be controlling for size and industry but there's no matching involved.

If you define your research question and what strategy you want to use to
answer it this list may be able to help you implement that strategy.  But so
far you haven't done that, and you can't really expect people to magically
know what kind of matching you want to do or what your data needs to look
like to meet your goals.

I could certainly tell you how to write code that would take your sample A
and select from sample B the first firm of the same size and industry.
That's unlikely to actually be useful for the vast majority of research
questions and if whether a company issues bonds is the outcome you want to
investigate then it's almost certainly the wrong strategy completely.
Nonetheless, it can be easily enough done.  You'd still have to deal with
questions like:  what do you do when a firm in sample A doesn't have an
exact match in sample B?  That complication is one of the reasons strategies
like propensity score matching tend to be popular.
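
For illustration only, here is one way that "first exact match" selection might look, run as a do-file. Everything in it is an assumption: the toy data, the variable names (firm_id, issuer, industry, size_class, match_id), and the idea of matching on a size class rather than on raw size, since exact matches on a continuous size measure are rare (a class could be built with something like xtile).

```stata
* Toy data: issuer == 1 is sample A, issuer == 0 is sample B (all hypothetical)
clear
input firm_id issuer industry size_class
1   1 1 2
2   1 2 3
3   1 3 1
101 0 1 2
102 0 1 2
103 0 2 3
104 0 2 1
end

* Save the non-issuers separately, renaming their id so it survives the join
preserve
keep if issuer == 0
rename firm_id match_id
keep match_id industry size_class
tempfile nonissuers
save `nonissuers'
restore

* Keep the issuers and form every issuer/non-issuer pair with the same
* industry and size class; issuers with no exact match are retained
keep if issuer == 1
joinby industry size_class using `nonissuers', unmatched(master)

* Keep the first candidate (lowest match_id) for each issuer
bysort firm_id (match_id): keep if _n == 1
list firm_id industry size_class match_id _merge
```

In this toy example firm 3 has no non-issuer in its industry and size class, so match_id is left missing for it, which is exactly the complication mentioned above. The sketch also matches with replacement, so the same non-issuer could be picked for several issuers.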

If you're unclear about what your question is and what kind of analysis you
want to do you will probably benefit a great deal from figuring that out
before you tackle the question of how to write the relevant Stata code.

-Sarah

Hi Ronnie,

The provided answers were for Propensity Score Matching, but now I am
looking at very simple matching. I would appreciate it if you could forward any
previous answer related to this, in case I overlooked it.

Best

Eln,

Others have provided useful answers to your questions about matching yet
somehow I feel that you have ignored these answers. Did you for example look
at the references Adam shared?

Caliendo, M., & Kopeinig, S. (2008). Some practical guidance for the
implementation of propensity score matching. Journal of Economic Surveys,
22, 31-72.

Stuart, E. A. (2010). Matching methods for causal inference: A review and a
look forward. Statistical Science, 25, 1-21.

Ronnie

> Hi all,
>
> I have two samples and want to match them on size and industry. Is
> there any way to do it in Stata?
>
> Best,
> Eln
