Re: st: Matching samples in Stata

From	Paula Arce <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: Matching samples in Stata
Date	Thu, 11 Oct 2012 18:40:17 +0100 (BST)

Hi David,

I finally got round to matching my sample. I matched the two samples on family education level and gender:

mahapick ed_level_fam sex, idvar( "ID") genfile(D:\matched) nummatches(4) full treated(course)

where course is 1 for medicine and 0 for other, since in my analyses I want to compare medicine students with the others. I created a file 'matched' because I intend to import the relevant variables into it, so that I can run the analyses on that file alone.

Ideally I want to only keep the first match.

However, when I check for duplicates using 

duplicates list ID

I find that many of the matched respondents are the same for different medicine students. 

Can you suggest what I am doing wrong, and any way around this, please?
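
David's earlier reply, quoted below, suggests one possible reading of these duplicates: mahapick "just gets you the matching" and does not itself select unique matches, so the same comparison student can be the closest match of several medicine students. The following is a minimal sketch of keeping only the first match and then counting how often a match is reused. It runs on a toy stand-in for the generated file; the variable names treated_id and matchnum are hypothetical, so check the real names with -describe- after loading the actual file.

```stata
* Toy stand-in for the generated file: one row per (treated, match) pair.
* treated_id and matchnum are hypothetical names, not mahapick's own.
clear
input treated_id ID matchnum
1 101 1
1 102 2
2 101 1
2 103 2
3 104 1
3 101 2
end

* Keep only the closest match for each treated student.
keep if matchnum == 1

* The same ID can still be the closest match of several treated students.
duplicates report ID
duplicates tag ID, generate(n_extra)
list treated_id ID n_extra, sepby(ID)
```

Here n_extra counts the other rows that share the same ID, so any value above zero marks a comparison student who serves as first match for more than one medicine student.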

Thanks,
Paula

----- Original Message -----
From: David Kantor <[email protected]>
To: [email protected]
Sent: Wednesday, 3 October 2012, 16:29
Subject: Re: st: Matching samples in Stata

Hello Paula,

At 07:29 AM 10/3/2012, you wrote:
> Thanks David,
> 
> mahapick is very user-friendly; what's the main difference between mahapick and psmatch2? or are they pretty much equivalent?
> [...]

I actually have never used psmatch2 or psmatch, though I have tried to read through one or the other on some occasions (and borrowed a bit).
I don't really know much about what it does, but my impression is that, in comparison to mahapick, it...
a, has several different options and constraints for the distance measure, in addition to Mahalanobis;
b, can do a selection of unique matches using a randomized selection order;
c, can perform various analyses on the resulting matching -- whereas mahapick just gets you the matching.

I believe that if you specify psmatch2 with a mahalanobis distance, you should get the same distance measure as you would in mahapick.

In my own usage of mahapick, I had sometimes done a randomized selection, but in a subsequent separate procedure (that I have not made into a publishable program).
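
For illustration only, and not the unpublished procedure mentioned above: a minimal sketch of one way a randomized selection of unique matches could be done as a separate step. Treated units are processed in random order and each takes its closest candidate that has not already been used. The toy data and the variable names treat_id, ctrl_id, and rank (1 = closest) are assumptions made for this example.

```stata
* Toy candidate list: one row per (treated, candidate) pair.
* All names are hypothetical; rank 1 is the closest candidate.
clear
input treat_id ctrl_id rank
1 101 1
1 102 2
2 101 1
2 103 2
3 102 1
3 101 2
end

set seed 12345

* Draw one random number per treated unit and copy it to all its rows.
bysort treat_id: generate double u = runiform() if _n == 1
bysort treat_id (u): replace u = u[1]

* turn = position of each treated unit in the random processing order.
egen long turn = group(u treat_id)

generate byte picked = 0
generate byte taken = 0

summarize turn, meanonly
forvalues k = 1/`r(max)' {
    * Closest candidate for this unit that is still unused.
    summarize rank if turn == `k' & !taken, meanonly
    if r(N) > 0 {
        local best = r(min)
        quietly replace picked = 1 if turn == `k' & rank == `best' & !taken
        * Mark the chosen control as used for all later units.
        summarize ctrl_id if turn == `k' & picked, meanonly
        quietly replace taken = 1 if ctrl_id == r(mean)
    }
}

* Result: at most one row per treated unit, and no control used twice.
keep if picked
list treat_id ctrl_id rank
```

A treated unit whose candidates have all been taken ends up with no match, which is the usual cost of insisting on unique matches from a short candidate list.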

Thanks for saying that mahapick is user-friendly. I often worry that there are too many options to keep track of -- including one that is a vestige of its first incarnation, which I would not advise using.

It may be helpful to know that the mahapick suite has several other programs for just obtaining the distance measure:
        mahascore: generates the distance between every observation and one specific point or observation;
        mahascores: generates the distance between every pair of observations (or possibly a limited set of pairs);
        mahascore2: computes the (single) distance between two specified points or the centroids of specified populations.
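
For reference, the standard definition of the Mahalanobis distance between two points x and y, where S is the covariance matrix of the matching variables, is written out below. This is the textbook formula, not a statement of what each program reports: whether a given program returns this distance or its square, and how S is estimated, should be checked in the program's help file.

```latex
d(x, y) = \sqrt{(x - y)^{\top} S^{-1} (x - y)}
```

When S is the identity matrix this reduces to the ordinary Euclidean distance.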

HTH
--David

*
*   For searches and help try:
*  http://www.stata.com/help.cgi?search
*  http://www.stata.com/support/faqs/resources/statalist-faq/
*  http://www.ats.ucla.edu/stat/stata/