The Stata listserver

st: RE: RE: merging data


From   "Ngo,PT (pgr)" <[email protected]>
To   <[email protected]>
Subject   st: RE: RE: merging data
Date   Mon, 21 Oct 2002 18:55:42 +0100

Thank you very much, Nick.  This is such a parsimonious solution!  Just as I received your email, I had worked out a solution of my own, though it is far from parsimonious (FYI, see below).

Thanks again.

Thi Minh

* Plot sizes: one observation per household and plot
use C:\Vlss\Vlss98\Data\Household\SCR09A12.DTA, clear
keep househol s9a1plot s9a1q04
rename s9a1q04 psize
* Copy the plot number into plot1-plot5 so that each can serve as a merge key
for newlist plot1 plot2 plot3 plot4 plot5 : gen X=s9a1plot
drop s9a1plot
sort househol plot1 plot2 plot3 plot4 plot5
save c:\vlss\land\data2.dta, replace

* For each of plot1-plot5, look up that plot's size and keep it as psizeplot1-psizeplot5
* (data1.dta is presumably the rice-season data set)
foreach x of varlist plot1 plot2 plot3 plot4 plot5 {
	use c:\vlss\land\data2.dta, clear
	sort househol `x'
	save c:\vlss\land\data2.dta, replace
	use c:\vlss\land\data1.dta, clear
	compress
	sort househol `x'
	merge househol `x' using c:\vlss\land\data2.dta
	drop _m
	rename psize psize`x'
	save c:\vlss\land\data1.dta, replace
	}

* Drop observations without a rice season, then sum the plot sizes within each observation
drop if riceseason==.
egen riceplottot=rsum(psizeplot1 psizeplot2 psizeplot3 psizeplot4 psizeplot5)
sort househol riceseason
save c:\vlss\land\data1.dta, replace

-----Original Message-----
From: Nick Cox [mailto:[email protected]]
Sent: 21 October 2002 18:11
To: [email protected]
Subject: st: RE: merging data


Ngo,PT (pgr)

> In one data set, I have data indicating the size (s9a1q04)
> of each plot (s9a1plot) cultivated by a household.
>
> s9a1plot    s9a1q04
>  1           816
>  2           456
>  3           384
>  4           360
>  5           360
>  6           204
>  7           180
>
>
> In another data set, I have data which show which plots (5
> plots maximum, listed under the variables plot1 to plot5)
> have been planted for each rice season (riceseason, there
> are 7 categories in total).
>
> riceseason   plot1     plot2     plot3     plot4     plot5
>    1         1         3         4         5         6
>    3         1         4         5         6         .
>    6         2         3         7         .         .
>
> I would like to calculate the landsize planted for each rice season.
>
> I have merged my data, and I obtain the following:
>
> l  s9a1plot s9a1q04  riceseason  plot1- plot5 if house==1101
>
>  s9a1plot    s9a1q04  riceseason   plot1     plot2     plot3     plot4     plot5
>  1           816         1         1         3         4         5         6
>  2           456         3         1         4         5         6         .
>  3           384         6         2         3         7         .         .
>  4           360         6         2         3         7         .         .
>  5           360         6         2         3         7         .         .
>  6           204         6         2         3         7         .         .
>  7           180         6         2         3         7         .         .
>
> Now, I would like to allocate for each season the landsize
> of each plot, so something like this:
>
>  s9a1plot    s9a1q04  riceseason   plot1     plot2     plot3     plot4     plot5
>  1           816         1         816       384       360       360       204
>  2           456         3         816       360       360       204         .
>  3           384         6         456       384       180         .         .
> (The lines below are repetitions.)
>  4           360         6         2         3         7         .         .
>  5           360         6         2         3         7         .         .
>  6           204         6         2         3         7         .         .
>  7           180         6         2         3         7         .         .
>

I'd go back one stage. Your -merge- solves one problem
but creates another.

Reading in your second data set, which presumably also
includes a -house- variable,

riceseason   plot1     plot2     plot3     plot4     plot5
    1         1         3         4         5         6
    3         1         4         5         6         .
    6         2         3         7         .         .

first -reshape- to long,

. reshape long plot, i(house riceseason)

and clean up

. drop if plot == .

Then prepare for -merge-

. sort house plot
. rename plot s9a1plot
. save data2
. clear

Now in your first data set

. sort house s9a1plot
. merge house s9a1plot using data2

you should get a long data set,
and you can -collapse- or

. bysort house riceseason : egen total = sum(s9a1q04)
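
For anyone who wants to try the steps above end to end, here is a minimal sketch on the example data for one household. It is an illustration under assumptions, not the exact commands from this thread: it assumes Stata 11 or later (for the -merge 1:m- syntax), types the example values in with -input-, and introduces the names slot, planted and landsize, none of which appear in the original data sets.

```stata
* Sketch only: toy data for household 1101, values copied from the example above.
* Assumes Stata 11 or later; slot, planted and landsize are hypothetical names.
clear
input house riceseason plot1 plot2 plot3 plot4 plot5
1101 1 1 3 4 5 6
1101 3 1 4 5 6 .
1101 6 2 3 7 . .
end

* One observation per household, season and planted plot
reshape long plot, i(house riceseason) j(slot)
drop if missing(plot)
rename plot s9a1plot
tempfile planted
save `planted'

* Plot sizes: one observation per household and plot
clear
input house s9a1plot s9a1q04
1101 1 816
1101 2 456
1101 3 384
1101 4 360
1101 5 360
1101 6 204
1101 7 180
end

* Attach each plot's size to every season in which that plot was planted
merge 1:m house s9a1plot using `planted', keep(match) nogenerate

* Total planted area per household and season
collapse (sum) landsize = s9a1q04, by(house riceseason)
list, noobs
```

On these example values the totals should come to 2124 for season 1, 1740 for season 3 and 1020 for season 6. Using -collapse- leaves one observation per household and season; the -egen- line above keeps the long data set and adds the total as a new variable instead.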



Nick
[email protected]

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*