



st: How do you select and describe a single variable of interest from a merged dataset but avoiding duplication (due to the merge)?


From   Gwinyai Masukume <parturitions@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: How do you select and describe a single variable of interest from a merged dataset but avoiding duplication (due to the merge)?
Date   Thu, 4 Apr 2013 18:39:47 +0200

Dear Stata list,

I have a single dataset obtained by merging two related datasets
exported from a relational database. The first dataset contains
patients and the second contains their hospital visits. Because a
single patient can have multiple hospital visits, the merged dataset
has many observations for a single patient.

In the merged dataset, I would like to analyze, say, patient age
(assuming it is fixed for each patient regardless of the number of
visits). Since a patient has the same age on every visit record, a
command like "sum Age" counts each patient's age once per visit and
so reports too many observations (duplication).

Each patient has a unique ID (identification number).
How do I issue a command that counts only one age per unique patient
ID and then summarizes this information?
I have tried using the duplicates command to drop the other hospital
visits and keep one visit per patient, then taking patient age from
that, to avoid the duplication mentioned above.
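
To make the goal concrete, here is a minimal sketch of the kind of
result being asked for. It is only an illustration under stated
assumptions: the variable names ID and Age, the flag name firstvisit,
and the toy data are all hypothetical, and Age is assumed to be
constant within each ID. It shows two possible routes, flagging one
observation per patient with egen's tag() function, and temporarily
dropping the extra visits with duplicates drop.

```stata
* Toy merged data: one row per hospital visit.
* ID and Age are assumed variable names; the values are made up.
clear
input ID Age
1 34
1 34
1 34
2 51
3 27
3 27
end

* Route 1: flag exactly one observation per patient ID,
* then summarize Age over the flagged observations only.
egen byte firstvisit = tag(ID)
summarize Age if firstvisit

* Route 2: temporarily keep one observation per ID.
* preserve/restore brings the full visit-level data back afterwards.
preserve
duplicates drop ID, force
summarize Age
restore
```

Route 1 leaves the visit-level data untouched, which may be
preferable if visit-level variables are analyzed in the same session.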

Thanks for your consideration

Kind regards,
Gwinyai


