Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: "Fabian Schönenberger" <[email protected]>

To: [email protected]

Subject: Re: st: RE: merge

Date: Fri, 01 Jun 2012 19:03:19 +0200

```
Dear Nick
I should be more precise.
The using dataset consists of three variables: cusip, year, and capm_marketpremium. I need cusip and year to identify each observation. The master file consists of cusip, year, the aforementioned additional date variables, and many more variables. So there are no overlapping variables beyond those that identify each observation. I am not sure whether overlapping is an issue in my case. All I want is to "copy" capm_marketpremium from the using dataset to the master file for each observation identified by cusip and year.
Are there other possible problems with merge that I am not considering?
Many thanks for your support!
Fabian
-------- Original Message --------
> Date: Fri, 1 Jun 2012 09:42:54 -0700
> From: Nick Sanders <[email protected]>
> To: [email protected]
> Subject: Re: st: RE: merge
> Hello Fabian,
>
> Depending on what you want to accomplish, you might consider (1) naming
> the variables differently across data sets, or (2) merging on date as well
> (if that's an important distinction for your data). But having such variables
> can make merge outcomes look weird if you don't know what to look for.
>
> In general, for an idea of what Stata does in those situations, I'd take a
> look at the section in the merge help on "Treatment of overlapping
> variables", which I've copied below.
>
> Best,
> Nick
>
>
> Treatment of overlapping variables
>
> When performing merges of any type, the master and using datasets may
> have variables in common other than the key variables. We will call such
> variables overlapping variables. For instance, if the variables in the
> master and using datasets are
>
> master: id, region, sex, age, race
>
> using: id, sex, bp, race
>
> and id is the key variable, then the overlapping variables are sex and
> race.
>
> By default, merge treats values from the master as inviolable. When
> observations match, it is the master's values of the overlapping
> variables that are recorded in the merged result.
>
> If you specify the update option, however, then all missing values of
> overlapping variables in matched observations are replaced with values
> from the using data. Because of this new behavior, the merge codes change
> somewhat. Codes 1 and 2 keep their old meaning. Code 3 splits into codes
> 3, 4, and 5. Codes 3, 4, and 5 are filtered according to the following
> rules; the first applicable rule is used.
>
> 5 corresponds to matched observations where at least one overlapping
>   variable had conflicting nonmissing values.
> 4 corresponds to matched observations where at least one missing value
>   was updated, but there were no conflicting nonmissing values.
> 3 means observations matched, and there were neither updated missing
>   values nor conflicting nonmissing values.
>
> If you specify both the update and replace options, then the _merge==5
> cases are updated with values from the using data.
>
>
> On Jun 1, 2012, at 9:07 AM, Fabian Schönenberger wrote:
>
> > Yes, I do! There is a variable called datadate with dates like ddmmyyyy,
> > as well as a variable called datemonthly with dates like yyyymXX.
> > The date variable I use to uniquely identify the observations is called
> > year, with values like yyyy.
> >
> > Should I drop the other ones?
> >
> >
> >
> >
> > -------- Original Message --------
> >> Date: Fri, 1 Jun 2012 08:19:49 -0700
> >> From: Nick Sanders <[email protected]>
> >> To: "[email protected]" <[email protected]>
> >> Subject: Re: st: RE: merge
> >
> >> Fabian,
> >>
> >> As a shot in the dark, do you have any other variables that are common
> >> to the two data sets? For example, if data sets 1 and 2 both contain a
> >> variable called "date" (in addition to the common variables on which
> >> you are merging), merging those two data sets will sometimes make your
> >> results look odd because of how Stata handles that variable.
> >>
> >> Best,
> >> Nick
> >>
> >>
> >>
> >> On Jun 1, 2012, at 8:05 AM, Simon Falck <[email protected]> wrote:
> >>
> >>> Dear Fabian,
> >>>
> >>> It is difficult to give a specific reply when you do not tell us more
> >>> about your datasets and key variables. However, here are some general
> >>> remarks on merging files in Stata that may be useful to you.
> >>>
> >>> The -merge- command combines files on common key variables (IDs).
> >>> One-to-one, -merge 1:1-, requires that the key variables uniquely
> >>> identify the observations in both files. If that is not your case, you
> >>> should consider many-to-one, -merge m:1-, or one-to-many, -merge 1:m-,
> >>> depending on how your datasets are structured. These two are used when
> >>> the key uniquely identifies observations in only one file, either the
> >>> using (m:1) or the master (1:m), for example because the other file has
> >>> several time periods per ID. If the key is not unique in either file,
> >>> there is many-to-many, -merge m:m-, but it pairs observations within
> >>> each key group by their order, which is rarely what is wanted; -joinby-
> >>> is usually the better tool there.
> >>>
> >>> As you can see, it is important to inspect the key variables with the
> >>> options above in mind. Don't forget to check the storage type: for
> >>> example, are both key variables strings? Consider how the key variables
> >>> should be constructed, and what a common attribute implies in each
> >>> direction.
> >>>
> >>> You get good instructions by typing: -help merge-
> >>>
> >>> Good luck,
> >>> Simon
> >>>
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: [email protected] [mailto:[email protected]] On Behalf Of "Fabian Schönenberger"
> >>> Sent: 1 June 2012 14:28
> >>> To: [email protected]
> >>> Subject: st: merge
> >>>
> >>> Dear Statalist
> >>> I am trying to merge two datasets. In both files each observation is
> >>> uniquely identified by cusip and year. I sort both files with xtset
> >>> cusip year. Afterwards, I run:
> >>>
> >>> merge 1:1 cusip year using "C:\Users\User\Documents\Uni SG\Doktorat\Data\Price Data\pricedatev5.dta", keepusing(capm_marketpremium), keep(3)
> >>>
> >>> I am only interested in those cusip-year combinations that are in my
> >>> master file, hence keep(3).
> >>>
> >>> However, neither the matched nor the unmatched observations are
> >>> correct: each cusip-year observation gets the wrong capm_marketpremium,
> >>> and some cusip-year observations in my master file get no value even
> >>> though the using file has an observation for them.
> >>>
> >>> I also tried merge m:m, but it did not work. What am I doing wrong?
> >>>
> >>> Many thanks for suggestions.
> >>>
> >>> Fabian
> >>> *
> >>> * For searches and help try:
> >>> * http://www.stata.com/help.cgi?search
> >>> * http://www.stata.com/support/statalist/faq
> >>> * http://www.ats.ucla.edu/stat/stata/
```
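
The workflow discussed in this thread (check that cusip and year uniquely identify observations in both files, then copy capm_marketpremium into the master) might be sketched as below. This is only an illustration under stated assumptions: the toy values, the `sales` variable, and the temporary file are invented for the example and are not Fabian's data. It also uses `keep(master match)` rather than `keep(3)`, on the assumption that every master observation should survive the merge whether or not it finds a match.

```stata
clear all

* Toy using file (invented values): key = cusip + year, plus the variable to copy.
input str8 cusip year capm_marketpremium
"A" 2010 .05
"A" 2011 .06
"B" 2010 .04
end
isid cusip year              // errors out if cusip-year is not a unique key
tempfile prices
save `prices'

* Toy master file (invented values); sales is a hypothetical extra variable.
clear
input str8 cusip year sales
"A" 2010 100
"A" 2011 110
"C" 2010 90
end
isid cusip year              // same uniqueness check on the master side
describe cusip year          // key storage types must agree across files

* Copy capm_marketpremium into the master.
* keep(master match) keeps _merge==1 and _merge==3, so C-2010 stays
* (with a missing premium) and the using-only B-2010 is dropped.
merge 1:1 cusip year using `prices', ///
    keepusing(capm_marketpremium) keep(master match)

* Inspect the outcome before dropping _merge.
tabulate _merge
sort cusip year
list
```

With real files, the two `input` blocks would be replaced by `use` and the actual using-file path. If `isid` fails on either file, the key is not unique there and a 1:1 merge is not the right tool until the duplicates are understood.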

**Follow-Ups**:**Re: st: RE: merge***From:*Nick Sanders <[email protected]>

**References**:**st: merge***From:*"Fabian Schönenberger" <[email protected]>

**st: RE: merge***From:*Simon Falck <[email protected]>

**Re: st: RE: merge***From:*Nick Sanders <[email protected]>

**Re: st: RE: merge***From:*"Fabian Schönenberger" <[email protected]>

**Re: st: RE: merge***From:*Nick Sanders <[email protected]>
