The Stata listserver

st: RE: merging data


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: merging data
Date   Mon, 21 Oct 2002 18:10:59 +0100

Ngo,PT (pgr)

> In one data set, I have data indicating the size (s9a1q04)
> of each plot (s9a1plot) cultivated by a household.
>
> s9a1plot    s9a1q04
>  1           816
>  2           456
>  3           384
>  4           360
>  5           360
>  6           204
>  7           180
>
>
> In another data set, I have data which show which plots (5
> plots maximum, listed under the variables plot1 to plot5)
> have been planted in each rice season (riceseason, there
> are 7 categories in total).
>
> riceseason   plot1     plot2     plot3     plot4     plot5
>    1         1         3         4         5         6
>    3         1         4         5         6         .
>    6         2         3         7         .         .
>
> I would like to calculate the landsize planted for each rice season.
>
> I have merged my data, and I obtain the following:
>
> l  s9a1plot s9a1q04  riceseason  plot1- plot5 if house==1101
>
>  s9a1plot    s9a1q04  riceseason   plot1     plot2     plot3     plot4     plot5
>  1           816         1         1         3         4         5         6
>  2           456         3         1         4         5         6         .
>  3           384         6         2         3         7         .         .
>  4           360         6         2         3         7         .         .
>  5           360         6         2         3         7         .         .
>  6           204         6         2         3         7         .         .
>  7           180         6         2         3         7         .         .
>
> Now, I would like to allocate for each season the landsize
> of each plot, so something like this:
>
>  s9a1plot    s9a1q04  riceseason   plot1     plot2     plot3     plot4     plot5
>  1           816         1         816       384       360       360       204
>  2           456         3         816       360       360       204       .
>  3           384         6         456       384       180       .         .
> (The lines below are repetitions.)
>  4           360         6         2         3         7         .         .
>  5           360         6         2         3         7         .         .
>  6           204         6         2         3         7         .         .
>  7           180         6         2         3         7         .         .
>

I'd go back one stage. Your -merge- solves one problem
but creates another.

Reading in your second data set, which presumably also
includes a -house- variable,

riceseason   plot1     plot2     plot3     plot4     plot5
    1         1         3         4         5         6
    3         1         4         5         6         .
    6         2         3         7         .         .

first -reshape- to long,

. reshape long plot, i(house riceseason)

and clean up

. drop if plot == .
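
As a rough illustration of those two steps, here is a minimal sketch on the three example rows. It assumes the second data set carries house (the value 1101 is taken from the -list- command quoted above), and the name slot for the j() variable is purely hypothetical; without j(), -reshape- would call it _j.

```stata
* Sketch only: house 1101 and the variable name "slot" are assumptions.
clear
input house riceseason plot1 plot2 plot3 plot4 plot5
1101 1 1 3 4 5 6
1101 3 1 4 5 6 .
1101 6 2 3 7 . .
end

* One observation per house, season and plot slot (3 x 5 = 15 rows)
reshape long plot, i(house riceseason) j(slot)

* Remove the unused slots; same effect as -drop if plot == .-
drop if missing(plot)

* 5 + 4 + 3 = 12 planted plots remain
assert _N == 12
list, sepby(riceseason)
```

Each remaining observation is one plot planted in one season, which is the layout the later -merge- needs.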

Then prepare for -merge-

. sort house plot
. rename plot s9a1plot
. save data2
. clear

Now in your first data set

. sort house s9a1plot
. merge house s9a1plot using data2

you should get a long data set,
and you can -collapse- or

. bysort house riceseason : egen total = sum(s9a1q04)
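
The merge and the totals can be sketched end to end as follows. This is only an illustration under stated assumptions: house 1101 and the tempfile name seasons are made up for the example, the season data are typed in already in long form, and the -merge 1:m- syntax and the egen function total() are those of more recent Stata releases (Stata 11 and later for -merge 1:m-), not the syntax current when this was posted.

```stata
* Sketch only: house 1101 and the tempfile name are assumptions.
* Season data in long form, as left by -reshape long-, -drop- and -rename-
clear
input house riceseason s9a1plot
1101 1 1
1101 1 3
1101 1 4
1101 1 5
1101 1 6
1101 3 1
1101 3 4
1101 3 5
1101 3 6
1101 6 2
1101 6 3
1101 6 7
end
tempfile seasons
save `seasons'

* Plot sizes: one observation per house and plot
clear
input house s9a1plot s9a1q04
1101 1 816
1101 2 456
1101 3 384
1101 4 360
1101 5 360
1101 6 204
1101 7 180
end

* Each plot may match several seasons, hence 1:m; no prior sort is needed
merge 1:m house s9a1plot using `seasons'
keep if _merge == 3
drop _merge

* Option 1: keep one row per plot and add the season total to each row
bysort house riceseason : egen total = total(s9a1q04)
assert total == 2124 if riceseason == 1

* Option 2: reduce to one row per house and season
collapse (sum) landsize = s9a1q04, by(house riceseason)
assert landsize == 2124 if riceseason == 1
assert landsize == 1740 if riceseason == 3
assert landsize == 1020 if riceseason == 6
list
```

With -egen- the plot-level rows are kept, which helps if other plot variables are still needed; -collapse- is the shorter route when only the season totals matter.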



Nick
n.j.cox@durham.ac.uk



