Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down at the end of May, and its replacement, **statalist.org** is already up and running.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

From |
"Fabian Schönenberger" <sch.f@gmx.ch> |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: RE: merge |

Date |
Fri, 01 Jun 2012 19:03:19 +0200 |

Dear Nick I should be more precise. The using-dataset consists of three variables: cusip, year and capm_marketpremium. Cusip and year I need to identify each observation. The masterfile consists of cusip, year, the before-mentioned additional date-variables and many variables more. So, there are no other overlapping variables beyond those, who identify each observation. I am not sure if overlapping is an issue for my case. All I want is to "copy" capm_marketpremium from using-dataset to the masterfile for each observation identified by cusip and year. Are there other possible problems with merge I do not consider? Many thanks for your support! Fabian -------- Original-Nachricht -------- > Datum: Fri, 1 Jun 2012 09:42:54 -0700 > Von: Nick Sanders <sandersn@stanford.edu> > An: statalist@hsphsun2.harvard.edu > Betreff: Re: st: RE: merge > Hello Fabian, > > Depending on what you want to accomplish, you might consider (1) naming > the variables differently across data sets, or (2) merging on date as well > (if that's an important distinction for your data). But having such variables > can make merge outcomes look weird if you don't know what to look for. > > In general, for an idea of what Stata does in those situations, I'd take a > look at the section in the merge help on "Treatment of overlapping > variables", which I've copied below. > > Best, > Nick > > > Treatment of overlapping variables > > When performing merges of any type, the master and using datasets may > have variables in > common other than the key variables. We will call such variables > overlapping variables. > For instance, if the variables in the master and using datasets are > > master: id, region, sex, age, race > > using: id, sex, bp, race > > and id is the key variable, then the overlapping variables are sex and > race. > > By default, merge treats values from the master as inviolable. When > observations match, it > is the master's values of the overlapping variables that are recorded > in the merged result. > > If you specify the update option, however, then all missing values of > overlapping variables > in matched observations are replaced with values from the using data. > Because of this new > behavior, the merge codes change somewhat. Codes 1 and 2 keep their > old meaning. Code 3 > splits into codes 3, 4, and 5. Codes 3, 4, and 5 are filtered > according to the following > rules; the first applicable rule is used. > > 5 corresponds to matched observations where at least one > overlapping variable had > conflicting nonmissing values. > 4 corresponds to matched observations where at least one missing > value was updated, but > there were no conflicting nonmissing values. > 3 means observations matched, and there were neither updated > missing values nor > conflicting nonmissing values. > > If you specify both the update and replace options, then the _merge==5 > cases are updated > with values from the using data. > > > On Jun 1, 2012, at 9:07 AM, Fabian Schönenberger wrote: > > > Yes I do! There is a variable called datadate with dates like ddmmyyyy, > as well as one variable with dates like datemonthly with dates like > yyyymXX. > > The date-variable I use to uniquely identify the observations is called > year with yyyy. > > > > Should I drop the other ones? > > > > > > > > > > -------- Original-Nachricht -------- > >> Datum: Fri, 1 Jun 2012 08:19:49 -0700 > >> Von: Nick Sanders <sandersn@stanford.edu> > >> An: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> > >> Betreff: Re: st: RE: merge > > > >> Fabian, > >> > >> As a shot in the dark, do you have any other variables that are common > >> among the two data sets? For example, if data set 1 and 2 both contain > a > >> variable called "date" (in addition to the common variables on which > you are > >> merging), merging those two data sets will sometimes make your results > look > >> odd due to how Stata handles what to do with that variable. > >> > >> Best, > >> Nick > >> > >> > >> > >> On Jun 1, 2012, at 8:05 AM, Simon Falck <simon.falck@abe.kth.se> wrote: > >> > >>> Dear Fabian, > >>> > >>> It is difficult to give a specific reply when you do not tell us more > >> about your datasets and key variables. However, here are some general > inputs > >> on merging files in Stata that perhaps are useful for you. > >>> > >>> The -merge- command enables merging files with common id´s. > One-to-one > >> -merge 1:1- implies that the identifiers (key variables) are exactly > the > >> same in both files. If this is not your case then you should consider > >> Many-to-one -merge m:1- or One-to-many -merge 1:m-. It depends on how > your > >> datasets are structured and their content. The two latter options are > used when > >> you have a common id in both files and one file, either the master or > user, > >> differ in for example time period. If you have two datasets which > includes > >> id´s that are non-common in both directions you use the Many-to-many > >> -merge m:m- option. > >>> > >>> As understood the key-variables are important to inspect according to > >> the options described above. Don’t forget to inspect the format, for > >> example are both key variables in string format? Consider how a > key-variables > >> should be constructed and what a common attribute implies and the > directions. > >>> > >>> You get a great instruction by typing: -help merge- > >>> > >>> Good luck, > >>> Simon > >>> > >>> > >>> > >>> -----Original Message----- > >>> From: owner-statalist@hsphsun2.harvard.edu > >> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of "Fabian > Schönenberger" > >>> Sent: den 1 juni 2012 14:28 > >>> To: statalist@hsphsun2.harvard.edu > >>> Subject: st: merge > >>> > >>> Dear Statalist > >>> I try to merge two datasets. In both files each observation is > uniquely > >> identified by cusip and year. I sort both files with xtset cusip year. > >> Afterwards, I command: > >>> > >>> merge 1:1 cusip year using "C:\Users\User\Documents\Uni > >> SG\Doktorat\Data\Price Data\pricedatev5.dta", > keepusing(capm_marketpremium), keep(3) > >>> > >>> I am only interested in those observations of cusip-year-combinations > >> which are in my masterfile - therefore keep(3). > >>> > >>> However, either the matched observations nor the unmatched > observations > >> are correct, meaning that I get for each cusip-year observations the > wrong > >> capm_marketpremium and I do not get observations in my masterfile for > >> cusip-year observations although the using file has an observation. > >>> > >>> I also tried merge m:m but it did not work. What am I doing the wrong > >> way? > >>> > >>> Many thanks for suggestions. > >>> > >>> Fabian > >>> -- > >>> NEU: FreePhone 3-fach-Flat mit kostenlosem Smartphone! > > >> > >>> Jetzt informieren: http://mobile.1und1.de/?ac=OM.PW.PW003K20328T7073a > >>> * > >>> * For searches and help try: > >>> * http://www.stata.com/help.cgi?search > >>> * http://www.stata.com/support/statalist/faq > >>> * http://www.ats.ucla.edu/stat/stata/ > >>> > >>> * > >>> * For searches and help try: > >>> * http://www.stata.com/help.cgi?search > >>> * http://www.stata.com/support/statalist/faq > >>> * http://www.ats.ucla.edu/stat/stata/ > >> > >> * > >> * For searches and help try: > >> * http://www.stata.com/help.cgi?search > >> * http://www.stata.com/support/statalist/faq > >> * http://www.ats.ucla.edu/stat/stata/ > > > > -- > > NEU: FreePhone 3-fach-Flat mit kostenlosem Smartphone! > > > Jetzt informieren: http://mobile.1und1.de/?ac=OM.PW.PW003K20328T7073a > > * > > * For searches and help try: > > * http://www.stata.com/help.cgi?search > > * http://www.stata.com/support/statalist/faq > > * http://www.ats.ucla.edu/stat/stata/ > > > * > * For searches and help try: > * http://www.stata.com/help.cgi?search > * http://www.stata.com/support/statalist/faq > * http://www.ats.ucla.edu/stat/stata/ -- NEU: FreePhone 3-fach-Flat mit kostenlosem Smartphone! Jetzt informieren: http://mobile.1und1.de/?ac=OM.PW.PW003K20328T7073a * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**Re: st: RE: merge***From:*Nick Sanders <sandersn@stanford.edu>

**References**:**st: merge***From:*"Fabian Schönenberger" <sch.f@gmx.ch>

**st: RE: merge***From:*Simon Falck <simon.falck@abe.kth.se>

**Re: st: RE: merge***From:*Nick Sanders <sandersn@stanford.edu>

**Re: st: RE: merge***From:*"Fabian Schönenberger" <sch.f@gmx.ch>

**Re: st: RE: merge***From:*Nick Sanders <sandersn@stanford.edu>

- Prev by Date:
**Re: st: Comparing multiple means with survey data--revisited** - Next by Date:
**st: RE: plotting time series** - Previous by thread:
**Re: st: RE: merge** - Next by thread:
**Re: st: RE: merge** - Index(es):