Re: st: Statistical Matching

From	Austin Nichols <[email protected]>
To	[email protected]
Subject	Re: st: Statistical Matching
Date	Tue, 19 Jul 2011 12:38:43 -0400

Gillette, Ryan (Volunteer) <[email protected]>:
You can still use propensity scores, defining dataset 1 as T=0 and
dataset 2 as T=1 and e.g. running a logit of T on X in the appended
datasets.  Without more detail, it is hard to offer specific advice.
No user-written software is required, though much is available to
download.  You can define a multivariate distance metric and take the
minimum-distance observations as matches.  Or you can do exact
matching: sort each dataset appropriately, resample with replacement
the appropriate number of times to achieve identical marginal
distributions, and then do an unmatched -merge- (a one-to-one merge by
observation number).  This is especially easy if you have weights in
each dataset that sum to the same population total.  N.B. -sort- can
be used to match on one continuous variable by rank within categories
of discrete variables.
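
The propensity-score setup described above might look roughly like the
following. This is only a sketch: the file names data1.dta and
data2.dta and the covariates x1, x2, x3 are hypothetical placeholders,
and the choice of covariates is left open in the original message.

```stata
* Hypothetical names: data1.dta, data2.dta, covariates x1 x2 x3.
use data1, clear

* generate(T) marks the source: 0 = master (data1), 1 = using (data2)
append using data2, generate(T)

* Model membership in dataset 2 as a function of the matching variables
logit T x1 x2 x3

* Predicted probability of T==1, usable as a one-dimensional match score
predict double pscore, pr
```

Observations from the two datasets with similar values of pscore are
then candidate matches.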
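
The rank-within-category matching mentioned in the N.B. could be
sketched as below. All names are assumptions made for illustration:
g1 and g2 stand for the discrete variables, x for the continuous
variable, and id1 and id2 for the identifiers in each dataset. The
resampling or reweighting step needed to equalize cell sizes is not
shown.

```stata
* Hypothetical names: data1.dta, data2.dta, cells g1 g2,
* continuous variable x, identifiers id1 (data1) and id2 (data2).
use data2, clear
keep id2 g1 g2 x
rename x x_2

* Rank observations on the continuous variable within each cell
bysort g1 g2 (x_2 id2): generate long rank = _n
tempfile ranked2
save `ranked2'

use data1, clear
bysort g1 g2 (x id1): generate long rank = _n

* Pair the k-th observation in each cell of data1 with the k-th in data2
merge 1:1 g1 g2 rank using `ranked2'
```

Where the cells differ in size between the two datasets, the surplus
observations come back with _merge equal to 1 or 2 rather than 3.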
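
For the minimum-distance approach, one possible sketch follows. It
assumes, beyond what the message says, that candidate pairs are first
restricted to a common discrete cell g so the pairwise comparison stays
manageable; with datasets of this size, forming all pairs without some
such blocking would likely exhaust memory. The names g, x1, x2, id1,
id2 and the weights 4 and 1 are hypothetical, and x1 and x2 are assumed
to be on comparable scales already.

```stata
* Hypothetical names and weights throughout.
local w1 4    // larger weight: closer agreement wanted on x1
local w2 1    // smaller weight on x2

use data2, clear
keep id2 g x1 x2
rename x1 z1
rename x2 z2
tempfile d2
save `d2'

use data1, clear
keep id1 g x1 x2

* Form all pairs of observations that share the same cell g
joinby g using `d2'

* Weighted squared Euclidean distance between the paired observations
generate double dist = `w1'*(x1 - z1)^2 + `w2'*(x2 - z2)^2

* Keep the closest data2 observation for each data1 observation
bysort id1 (dist id2): keep if _n == 1
```

The result has one row per id1 (among those whose cell also appears in
data2) with the id2 of its nearest neighbor; ties on distance are
broken by the lower id2.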

On Tue, Jul 19, 2011 at 12:26 PM, Gillette, Ryan (Volunteer)
<[email protected]> wrote:
> Hello,
>
> I am trying to match comparable observations between two large datasets (300,000 to 3 million observations, depending on which ones I decide to use). I am not trying to calculate a treatment effect, but rather to identify the id number or observation number of an observation's closest match. I am matching across a few variables, some of which I want to weight more than others in terms of required precision. I don't think I will be able to use a propensity score, as it doesn't seem appropriate for my task.
>
> Does anyone know a program in Stata that can do these things? I have used -nnmatch- before, but with such a large dataset I worry it could take days to process. Is there a way to speed it up? Any ideas would be much appreciated!
>
> Thanks,
>
> Ryan
