Statalist



Re: question about merge / lookup (was Re: statlist: offlist follow up)


From   "Sebastian Kruk" <residuo.solow@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: question about merge / lookup (was Re: statlist: offlist follow up)
Date   Tue, 17 Jul 2007 16:33:28 -0300

My master database is a household survey with about 245 variables,
but I have to add population projections according to agegp, dpto and
e1. All the observations with the same agegp, dpto and e1 have the same
projection. For example, I have four observations where
dpto==agegp==e1==1; they differ in other variables, but all of them have
the same projection.

For instance, say I have 10 men. Half are 50 or more years old and the
other half are younger. One quarter are from dpto 1, two quarters from
dpto 2 and one quarter from dpto 3. One has 2 children, two have 3, five
have 10 children, etc. But the population projection is the same for all
50+ year old men from dpto 1.

dpto, agegp and e1 form a unique set of key variables in the using dataset.

I found a mistake in my using dataset, a.dta. It should be:

agegp dpto e1 proyeccion
0 1 1 9261
1 1 1 36894
5 1 1 47986
1 1 2 8863
5 1 2 35504
0 2 2 46194
1 2 1 47042
5 2 1 10401
1 2 2 1543
5 2 2 519
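
To illustrate what "unique set of key variables" means for the table above, here is a minimal Python sketch (not Stata; the helper name duplicate_keys is made up for this example). It counts how often each (agegp, dpto, e1) combination occurs and reports any that appear more than once. Within Stata, `isid agegp dpto e1` performs this kind of check.

```python
from collections import Counter

# Corrected a.dta rows as listed above: (agegp, dpto, e1, proyeccion)
a_rows = [
    (0, 1, 1, 9261),
    (1, 1, 1, 36894),
    (5, 1, 1, 47986),
    (1, 1, 2, 8863),
    (5, 1, 2, 35504),
    (0, 2, 2, 46194),
    (1, 2, 1, 47042),
    (5, 2, 1, 10401),
    (1, 2, 2, 1543),
    (5, 2, 2, 519),
]


def duplicate_keys(rows, key_len=3):
    """Return the key tuples (first key_len columns) that occur more than once."""
    counts = Counter(row[:key_len] for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# An empty list means agegp, dpto and e1 uniquely identify the rows.
print(duplicate_keys(a_rows))
```

If the list is not empty, the dataset cannot serve as the "one" side of a many-to-one merge until the repeated keys are resolved.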

b.dta:
agegp dpto e1 vaca0 vaca1
0 1 1 9261 6174
1 1 1 36894 24596
5 1 1 47986 31990
0 1 1 48976 32650
1 1 2 8863 5908
5 1 2 35504 23669
0 2 2 2133 455
1 2 1 2212 48971
5 2 1 108 1170
0 2 2 20 4304
1 2 2 238 566
5 2 2 2 72
0 1 1 9261 61
1 1 1 36894 245
5 1 1 47986 31
0 1 1 48976 3
1 1 2 8863 590
5 1 2 35504 236
0 2 2 213 455
1 2 1 221 48
5 2 1 108 11
0 2 2 2074 43
1 2 2 238 5
5 2 2 269 72

So when I merge, I get:

agegp	dpto	e1	vaca0	vaca1	proyeccion
0	1	1	48976	32650	9261
0	1	1	9261	61	9261
0	1	1	48976	3	9261
0	1	1	9261	6174	9261
0	2	2	20	4304	46194
0	2	2	213	455	46194
0	2	2	2133	455	46194
0	2	2	2074	43	46194
1	1	1	36894	245	36894
1	1	1	36894	24596	36894
1	1	2	8863	5908	8863
1	1	2	8863	590	8863
1	2	1	221	48	47042
1	2	1	2212	48971	47042
1	2	2	238	5	1543
1	2	2	238	566	1543
5	1	1	47986	31990	47986
5	1	1	47986	31	47986
5	1	2	35504	23669	35504
5	1	2	35504	236	35504
5	2	1	108	1170	10401
5	2	1	108	11	10401
5	2	2	269	72	519
5	2	2	2	72	519

So it is a many-to-one merge.
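
The many-to-one logic can be sketched outside Stata as a plain lookup: build a table from the using data keyed on (agegp, dpto, e1), then attach the matching proyeccion to every master row. The Python below is only an illustration with a few rows copied from the tables above; the function name merge_many_to_one is hypothetical. In current Stata syntax the same idea is written `merge m:1 agegp dpto e1 using a`.

```python
# a.dta (using): one row per key -> (agegp, dpto, e1, proyeccion)
a_rows = [
    (0, 1, 1, 9261),
    (1, 1, 1, 36894),
    (0, 2, 2, 46194),
]

# b.dta (master): many rows per key -> (agegp, dpto, e1, vaca0, vaca1)
b_rows = [
    (0, 1, 1, 9261, 6174),
    (0, 1, 1, 48976, 32650),
    (1, 1, 1, 36894, 24596),
    (0, 2, 2, 2133, 455),
]


def merge_many_to_one(master, using):
    """Append the using value to each master row, matching on the first 3 columns."""
    lookup = {}
    for *key, value in using:
        key = tuple(key)
        if key in lookup:
            # A repeated key would make the match ambiguous.
            raise ValueError(f"key {key} is not unique in the using data")
        lookup[key] = value
    # Master rows without a match get None in the new column.
    return [row + (lookup.get(row[:3]),) for row in master]


merged = merge_many_to_one(b_rows, a_rows)
for row in merged:
    print(row)
```

Every master row keeps its own vaca0 and vaca1, and rows that share a key receive the same proyeccion, which is the pattern in the merged listing above.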

Bye,

Sebastian.

2007/7/17, Michael Blasnik <michael.blasnik@verizon.net>:
...
I don't understand how this merge is supposed to work -- it looks like a
"many-to-many" merge because there is no unique set of key variables in either
dataset.  I thought the proyeccion was supposed to be some population info, but
then why does it have two entries for the values 0 2 2 for agegp dpto e1?

You need to come up with a way for Stata to determine which observations go
together or you will end up with multiple matches.

Michael Blasnik