Re: st: Merging longitudinal data set

From	Maarten Buis <[email protected]>
To	[email protected]
Subject	Re: st: Merging longitudinal data set
Date	Fri, 16 Jul 2010 22:16:47 +0000 (GMT)

--- On Fri, 16/7/10, Andreas Jensen wrote:
> What I'm troubled about is that there are people from wave
> 1 who had dropped out when wave 2 was conducted (their ID
> does not exist in the wave 2 data file), and additional
> people have been added in wave 2 who aren't present in
> wave 1 (their ID does not exist in the wave 1 data file).
> 
> I have sorted each data file according to the ID variable
> and then executed a merge 1:1 on the ID with wave 1 as
> master. I get the following output.
> 
>     Result                  # of obs.
>     -----------------------------------------
>     not matched             28,046
>         from master         12,373  (_merge==1)
>         from using          15,673  (_merge==2)
> 
>     matched                 18,742  (_merge==3)
>     -----------------------------------------
> 
> So assuming that my command is correct, is it then true
> that there are 18,742 individuals in both waves, 12,373
> individuals who dropped out after wave 1, and 15,673
> individuals who were added in wave 2?

That interpretation is correct, provided the ids were
matched exactly. One thing that might be going on is a
precision problem. By default Stata stores new variables
as floats, i.e. with about 7 digits of accuracy: a float
holds every integer exactly only up to 16,777,216.
I think that is a good default: the typical variable in a
dataset is a measurement of some sort, and we are often
happy if such measurements have 2 digits of accuracy, so 7
digits is more than enough. However, it can cause problems
with id variables: it is not uncommon that ids are
generated such that they contain more than 7 digits, and
in order for them to match between datasets they need to
be stored exactly. If an id is rounded when it is stored,
a person can fail to match across waves, or two different
people can end up with the same id. So when this is the
case, you need to make sure that you import your dataset
so that this variable is stored as a long (exact for any
id of up to 9 digits), a double (exact for integers of up
to 15 digits), or a string. See for example:
<http://www.ats.ucla.edu/stat/Stata/faq/longid.htm>
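
A minimal sketch of how one might check for this problem,
not a definitive recipe. The file names wave1.dta and
wave2.dta and the variable name id are assumptions on my
part; substitute your own. It shows what a float does to a
9-digit number, inspects how the id is stored in each wave,
and then repeats the merge.

```stata
* Hypothetical names: wave1.dta, wave2.dta, and the variable id
* are assumptions; replace them with your own.

* A float cannot hold every 9-digit integer exactly:
* 123456789 is rounded to the nearest representable value
display %12.0f float(123456789)    // shows 123456792

* Check the storage type of id in each wave, and check that
* id uniquely identifies the observations (isid stops with
* an error if it does not)
use wave1, clear
describe id
isid id

use wave2, clear
describe id
isid id

* Once id is stored exactly in both files, repeat the merge
* and look at the match counts
use wave1, clear
merge 1:1 id using wave2
tabulate _merge
```

If -describe- reports float for an id with more than 7
digits, go back to the raw data and import the id again as
long, double, or string. Recasting a float that has already
been rounded does not bring back the lost digits.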

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------
