


Re: st: RE: Re: Reshaping dataset


From   "Andrea Molinari" <anmolinari@gmail.com>
To   "statalist" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: RE: Re: Reshaping dataset
Date   Fri, 3 May 2013 11:58:52 +0000

Yes, you're totally right! Joinby was a much better option! 
Many thanks for your help! 
Cheers, 
Andrea
-----Original Message-----
From: "Sarah Edgington" <sedging@ucla.edu>
Sender: owner-statalist@hsphsun2.harvard.edu
Date: Thu, 2 May 2013 10:44:09 
To: <statalist@hsphsun2.harvard.edu>
Reply-To: statalist@hsphsun2.harvard.edu
Subject: RE: st: RE: Re: Reshaping dataset

Read the manual entry for merge carefully.
You most probably do NOT want to do a many-to-many merge (i.e. -merge m:m-).
Really unpredictable things can happen with that kind of merge, and that's
probably what's causing your results to differ.
If cuci5d is not unique in one of your two datasets, you'll need to think
very carefully about what the merge should actually look like.  Depending on
what your data looks like and what your desired outcome is, -joinby- might
help.
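
A minimal sketch of what the -joinby- route could look like for the first
merge in the do-file quoted below, assuming the goal is every pairing of
master and using observations that share a cuci5d value. File and variable
names are taken from that do-file; unmatched(master) is an assumption meant
to mirror its "drop if _merge==2" step. As background, -sort- breaks ties in
random order unless the stable option is given, and -merge m:m- pairs
observations by their order within each key group, so the pairing can change
from one run to the next.

```stata
* Check whether cuci5d is unique in each file (reports surplus copies)
use "cadenas.dta", clear
duplicates report cuci5d

use "datos pry.dta", clear
duplicates report cuci5d

* Form all pairwise combinations within cuci5d; keep unmatched master rows.
* Assumption: unmatched(master) stands in for "drop if _merge==2".
joinby cuci5d using "cadenas.dta", unmatched(master)

* -joinby- creates _merge only because unmatched() was specified
tabulate _merge
drop _merge

save "prycadenas.dta", replace
```

If -duplicates report- shows that cuci5d is in fact unique in "cadenas.dta",
-merge m:1 cuci5d using "cadenas.dta"- should produce the same matched rows
and will stop with an error if that uniqueness assumption is ever violated.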
-Sarah

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Andrea Molinari
Sent: Wednesday, May 01, 2013 8:03 PM
To: statalist
Subject: Re: st: RE: Re: Reshaping dataset

Yep, that was the problem with the command. Now that I was able to run the
whole set of commands, I get something quite weird, and I'm not really sure
about what step is causing it...

I get different results for "valrm" as I run and re-run the do-file...

I copy the syntax below. Does anyone know what might be happening?

Cheers!
Andrea

////////////////////////////////
clear

set mem 1g

use "cadenas.dta"

sort cuci5d

save "cadenas.dta", replace

clear

use "datos pry.dta"

sort cuci5d

merge m:m cuci5d using "cadenas.dta"

assert value==. if _merge==2

drop if _merge==2

drop _merge

save "prycadenas.dta", replace


clear

use "usoecon.dta"

sort usoecon

clear

use "prycadenas.dta"

merge m:1 usoecon using "usoecon.dta"

drop if _merge==1
drop if _merge==2

drop _merge

drop hs cuci5d flow usoecon

save "prycadenas.dta", replace


bysort year partner cadena subcadena flores: egen double svalue=sum(value)

bysort year partner cadena subcadena flores: keep if _n==1

drop value

reshape wide svalue, i(year cadena subcadena flores) j(partner)


gen double valmcs=svalue32+svalue76+svalue858

rename svalue0 valwld

gen valrm=valwld-valmcs

////////////////////////////////

On 1 May 2013 21:19, Sarah Edgington <sedging@ucla.edu> wrote:
> Andrea,
> What syntax did you use?  It sounds like you tried to do a 1:1 merge 
> when you needed a 1:m merge.
> If I'm reading your description right, you have 1 observation per SITC 
> in your master data and you want that to match to ALL the chains with 
> that SITC in your using data.  As long as both those things are true, 
> merge 1:m should get you what you need.
> -Sarah
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Andrea 
> Molinari
> Sent: Wednesday, May 01, 2013 4:05 PM
> To: statalist
> Subject: st: Re: Reshaping dataset
>
> Dear statalisters,
>
> It's me again trying to reshape a piece of my dataset.
>
> I need to assign values from one trade classification (SITC) to 
> another (chain), but with the complexity that there may be one SITC 
> that corresponds to more than one chain. I then need to sum (with
> -egen-) the values by SITC to group them into the chain classification.
>
> When I tried to use the -merge- command to do this, as the identifying 
> variable to use -merge- (SITC) "does not uniquely identify 
> observations in the using data" (sic), the system does not allow me to 
> merge the two datasets.
>
> Does anyone know of any other command that allows me to do this?
>
> Cheers!
> Andrea
>
> On 26 April 2013 13:24, Andrea Molinari <anmolinari@gmail.com> wrote:
>> Dear statalisters,
>>
>> I'm working with a dataset which groups many dimensions and I'm 
>> having a little trouble reshaping the data for the (rather basic) 
>> calculations I need to do.
>>
>> The dataset has the following columns:
>>
>> year flow partner value cadena usoecon subcadena cadenacompartida1
>> subcadenacompartida1 cadenacompartida2 subcadenacompartida2
>>
>> In order to regroup the data, summing "value" by year, flow, cadena, 
>> subcadena and usoecon, I need the following:
>>
>> - the values in cadenacompartida1 and cadenacompartida2 go under 
>> those in the column "cadena"
>>
>> - the values in subcadenacompartida1 and subcadenacompartida2 go
>> under those in the column "subcadena"
>>
>> To do so, I tried several options with -reshape long-, but I don't 
>> seem to get the right reshaping to get the data in the way I need to 
>> then calculate:
>>
>> bysort year flow cadena subcadena usoecon: egen double svalue=sum(value)
>>
>> Any ideas from those handling large datasets would be more than welcome!
>>
>> Cheers,
>> Andrea
>>
>> --
>> Andrea Molinari, PhD
>> Investigadora Asistente
>> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) 
>> Instituto Interdisciplinario de Economía Política de Buenos Aires (IIEP-BAIRES)
>> Córdoba 2122, 2do. piso
>> (http://iiep-baires.econ.uba.ar)
>> Tel: +54 11 4374-4448, int. 6362
>
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/



