Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org, is already up and running.


Re: st: How do you select and describe a single variable of interest from a merged dataset but avoiding duplication (due to the merge)?


From   Nick Cox <njcoxstata@gmail.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: How do you select and describe a single variable of interest from a merged dataset but avoiding duplication (due to the merge)?
Date   Thu, 4 Apr 2013 17:48:15 +0100

Look at the -tag()- function of the -egen- command. (It was mentioned
earlier today in another thread; posters should read Statalist as well
as write to it!)

Here is a dopey example:

. sysuse auto, clear
(1978 Automobile Data)

. egen tag = tag(rep78)

. l rep78 if tag

     +-------+
     | rep78 |
     |-------|
  1. |     3 |
  5. |     4 |
 12. |     2 |
 20. |     5 |
 40. |     1 |
     +-------+
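
Applied to the question quoted below, the same idea might look like the following sketch. The variable names id and age are assumptions (the original post does not give the exact names), and onepatient is just an illustrative name for the tag variable. By default -tag()- does not tag observations with missing values of its argument, which is why the listing above shows only the five nonmissing values of rep78.

```stata
* Assumed names: id (unique patient identifier), age (patient age).
* Flag exactly one observation for each distinct value of id.
egen byte onepatient = tag(id)

* Summarize age over the tagged observations only,
* so that each patient contributes a single value.
summarize age if onepatient

* Optional check that age really is constant within each patient;
* this stops with an error if any patient has differing ages.
bysort id (age): assert age[1] == age[_N]
```

Nothing is dropped here, so the visit-level observations remain available for other analyses.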

Nick
njcoxstata@gmail.com

On 4 April 2013 17:39, Gwinyai Masukume <parturitions@gmail.com> wrote:

> I have a single dataset obtained by merging two datasets (these 2
> datasets are related – obtained from a relational database).
> e.g. 1st dataset was of patients and the second dataset was of their
> hospital visits – a single patient can have multiple hospital visits.
> So the merged dataset has many entries for a single patient.
>
> In my merged data set, I would like to analyze say patient age
> (assuming it’s fixed for that patient regardless of the number of
> visits). Since a single patient has the same age for their different
> hospital visits, a command like “sum Age” will give too many
> observations for age (duplication).
>
> Each patient has a unique ID (identification number).
> How do I issue a command to only count 1 age for each unique patient
> ID and then summarize this information?
> I have tried using the duplicates command to drop the other hospital
> visits and keep one visit per patient, then take patient age from that
> visit to avoid the duplication mentioned above.
