


st: RE: Linking a patient file with an event file

From   "Sarah Edgington" <>
To   <>
Subject   st: RE: Linking a patient file with an event file
Date   Thu, 28 Feb 2013 10:57:05 -0800

Unless there's something I'm misunderstanding about your file structure, you
should be able to just merge the two files as is with an m:1 (many-to-one)
merge. This won't work if you have multiple lines of data per person in your
patient file, but nothing you've said suggests that's true.

I think you ought to be able to do something along the lines of:

use eventfile
merge m:1 patientid using patientfile

You'll probably then want to drop any of the records from the using file
only (i.e. those for patients that do not have emergency room visits).
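
To make the merge and the follow-up drop concrete, here is a minimal sketch on made-up data. The variable names (patientid, age, visitday), the values, and the use of a tempfile in place of your real patient file are all assumptions for illustration; substitute your own file and variable names. The isid line is just a check that the patient file really has one row per patient, which is what m:1 requires.

```stata
* Toy stand-in for the patient file (names and values are hypothetical)
clear
input patientid age
1 34
2 51
3 47
end
isid patientid                 // errors out if any patientid is duplicated
tempfile patientfile
save `patientfile'

* Toy stand-in for the event file: one row per emergency room visit
clear
input patientid visitday
1 10
1 25
3 12
end

* Many visit rows in memory, one row per patient in the using file
merge m:1 patientid using `patientfile'
tabulate _merge                // 1 = event file only, 2 = patient file only, 3 = matched
drop if _merge == 2            // patients with no emergency room visit
```

In this toy example patient 2 has no visits, so that record comes from the using file only and is the one dropped. If you would rather not bring those records in at all, merge also accepts a keep() option, e.g. keep(match master).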

Unless you have some compelling reason to want your event file in long
format (and it doesn't sound like you do) then you should just merge the
data in its current form.
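
If you still want a visit counter that restarts at 1 for each patient (to count visits, say), it can be built without reshaping. The sketch below uses made-up data and hypothetical variable names (patientid, servicedate, edvisit, edtotal). As far as I can tell, the code you tried marks the first row of each patient and then takes a running sum over the whole dataset, which numbers the patients rather than the visits.

```stata
* Toy event file in long format (names and values are hypothetical)
clear
input patientid servicedate
1 19400
1 19350
3 19380
1 19410
end
format servicedate %td

* Under by:, _n restarts at 1 within each patientid; visits are ordered by date
bysort patientid (servicedate): gen edvisit = _n

* Under by:, _N is the number of visits for that patient
by patientid: gen edtotal = _N

list, sepby(patientid)
```

Here patient 1 gets edvisit 1, 2, 3 in date order with edtotal 3, and patient 3 gets edvisit 1 with edtotal 1.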


-----Original Message-----
[] On Behalf Of Health Services
Sent: Thursday, February 28, 2013 9:16 AM
Subject: st: Linking a patient file with an event file

I want to link/merge several files, but my data are complicated and I'm
unsure how to proceed. I'm using Stata 11.2 for Windows.

My patient file contains sociodemographic information for ~60,000 patients.
My event file contains information about ~20,000 visits to emergency rooms.
There is a patient ID variable in each file. Many patients in my patient
file will not be in my event file (because they didn't visit the emergency
room). My event file has a patient ID, service date, and various other
variables related to the visit.

I need to merge the files so I can analyze incidence and run regression
models. I've read lots about merging files, but haven't seen anything that
addresses my issue yet.

I assume I will need to reshape my event file from long to wide so that
there is one row per patient, but everything I read about reshaping assumes
that the multiple observations per patient are identified somehow. So one
thing I *think* I need to do is generate a new variable that counts ED
visits for each patient. I thought I might be able to do something like
this: (from

by id2013: gen edcount = 1 if _n==1
replace edcount = sum(edcount)

But that gives me a sequential number that doesn't restart at 1 for each new
patient ID.

Is reshaping the right approach, and if so, how do I best create the
variable I need?

Thanks in advance for your time,


