Statalist



Re: st: problems with joinby or merge


From   Eva Poen <eva.poen@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: problems with joinby or merge
Date   Mon, 11 May 2009 17:02:22 +0100


Gaby,

Please report exactly what you typed in Stata for both -merge- and
-joinby-. It is difficult for us to help you if we have to guess what
you did. See the Statalist FAQ on this topic.

-merge- should be the answer to your problem. Suppose you have dataset
1 (individual) in memory, and want to merge using dataset 2. The
following should do the trick:

merge Governorate using dataset2, sort uniqusing nokeep

This may or may not be what you typed in Stata.
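One reason why you might get missing values is that the values of
Governorate in dataset 1 don't line up with those in dataset 2. Your
-summarize- output points that way: Governorate runs from 1 to 18 in
dataset 1 but from 11 to 35 in dataset 2. A -tab Governorate- in both
datasets will shed light on this. Also, investigate the _merge variable
after -merge- to see which observations were matched and which came
from only one of the two datasets.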
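
A minimal sketch of these checks, assuming the individual-level file is
saved as dataset1 (a placeholder name, as is dataset2):

    * compare the Governorate codes in the two files
    use dataset2, clear
    tab Governorate
    * stops with an error if Governorate is not unique in dataset2
    isid Governorate
    use dataset1, clear
    tab Governorate

    * merge without -nokeep- so that unmatched observations stay visible
    merge Governorate using dataset2, sort uniqusing
    * _merge: 1 = dataset1 only, 2 = dataset2 only, 3 = matched
    tab _merge
    tab Governorate _merge

In Stata 11 or later, where -merge- has a new syntax, the command with
-nokeep- would be written as

    merge m:1 Governorate using dataset2, keep(master match)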

Eva



2009/5/11 Ana Gabriela Guerrero Serdan <ag_guerreroserdan@yahoo.com>:
> Hi,
>
> I have created a new dataset from two datasets. Starting from the first one, which holds individual-level information, I add for each individual details about the schools in the area where they live. So the first dataset is at the individual level and looks like:
>
>  sum urban age sex year hhmem meducation disability daysinschool attending work Governorate violencetreated
>
>    Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>       urban |     68813    .5920974    .4914484          0          1
>         age |     68813    11.27095    3.438038          6         17
>         sex |     68812    1.491237    .4999268          1          2
>        year |     68813    2003.147    2.996411       2000       2006
>       hhmem |     68813    9.010521     3.53022          0         39
> -------------+--------------------------------------------------------
>  meducation |     61560    .5910169    .4916501          0          1
>  disability |      1694    .1924439    .3943362          0          1
> daysinschool |     17722    4.824399    .7478702          0          7
>   attending |     61943    .7056165    .4557688          0          1
>        work |     52897    .4894039    .4998924          0          1
> -------------+--------------------------------------------------------
>  Governorate |     68813    9.639022    5.122194          1         18
> violencetr~d |     68813    .3931815    .4884601          0          1
>
>
> The second data set is at the Governorate/area level and looks like:
>
>  sum Governorate  schoolsrehabilitated total_num_schools population
>
>    Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>  Governorate |        18    23.66667    7.940959         11         35
> schoolsreh~d |        18         131    200.2628         17        902
> total_num_~s |        18    805.8889    405.0946        269       1735
>  population |        18    754408.8    673468.2   237128.4    3262781
>
>
> So for each individual in the first dataset I want to add details about the Governorate/area level.
>
> But I get this:
>
>    Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>       urban |     20381    .5575291    .4966916          0          1
>         age |     20381    11.25377    3.437551          6         17
>         sex |     20381    1.491095     .499933          1          2
>        year |     20381    2003.154    2.996112       2000       2006
>       hhmem |     20381    9.327511    3.552114          0         28
> -------------+--------------------------------------------------------
>  meducation |     18255    .6012051    .4896638          0          1
>  disability |        77     .987013    .1139606          0          1
> daysinschool |      5574    4.862217     .680953          0          7
>   attending |     18045    .6682738    .4708463          0          1
>        work |     15730    .4930706    .4999679          0          1
> -------------+--------------------------------------------------------
>  Governorate |     20381    13.06555    1.420832         11         15
> violencetr~d |     20381    .3576861    .4793308          0          1
>
>
> I also tried merge using Governorate and I get this:
>
>
>    Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>       urban |     68813    .5920974    .4914484          0          1
>         age |     68813    11.27095    3.438038          6         17
>         sex |     68812    1.491237    .4999268          1          2
>        year |     68813    2003.147    2.996411       2000       2006
>       hhmem |     68813    9.010521     3.53022          0         39
> -------------+--------------------------------------------------------
>  meducation |     61560    .5910169    .4916501          0          1
>  disability |      1694    .1924439    .3943362          0          1
> daysinschool |     17722    4.824399    .7478702          0          7
>   attending |     61943    .7056165    .4557688          0          1
>        work |     52897    .4894039    .4998924          0          1
> -------------+--------------------------------------------------------
>  Governorate |     68826    9.642446    5.128155          1         35
> violencetr~d |     68813    .3931815    .4884601          0          1
> schoolsreh~d |     20394    59.68412     30.6863         17        902
>  population |     20394    725117.7    353806.9   237128.4    3262781
> total_num_~s |     20394     1066.16    317.4857        269       1735
>
>
>
> Lots of MISSING VALUES for several of the individuals.
>
>
> Can someone let me know what I'm doing wrong? How can I add the Governorate-level variables for each of the individuals?
>
>
> thanks in advance,
>
> Gaby
