Stata 15 help for joinby

[D] joinby -- Form all pairwise combinations within groups


joinby [varlist] using filename [, options]

    options               Description
    -------------------------------------------------------------------------
    Options
    When observations match:
      update              replace missing data in memory with values from
                            filename
      replace             replace all data in memory with values from
                            filename

    When observations do not match:
      unmatched(none)     ignore all; the default
      unmatched(both)     include from both datasets
      unmatched(master)   include from data in memory
      unmatched(using)    include from data in filename

      _merge(varname)     varname marks source of resulting observation;
                            default is _merge
      nolabel             do not copy value-label definitions from filename
    -------------------------------------------------------------------------
    varlist may not contain strLs.


Data > Combine datasets > Form all pairwise combinations within groups


joinby joins, within groups formed by varlist, observations of the dataset in memory with filename, a Stata-format dataset. By join we mean to form all pairwise combinations. filename is required to be sorted by varlist. If filename is specified without an extension, .dta is assumed.

If varlist is not specified, joinby takes as varlist the set of variables common to the dataset in memory and in filename.

Observations unique to one or the other dataset are ignored unless unmatched() specifies differently. Whether you load one dataset and join the other or vice versa makes no difference in the number of resulting observations.

If there are common variables between the two datasets, however, the combined dataset will contain the values from the master data for those observations. This behavior can be modified with the update and replace options.
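
The pairwise-combination behavior described above can be sketched with two small datasets. The data, the variable names child and parent, and the tempfile name kids are hypothetical and chosen only for illustration; this is a minimal sketch meant to be run from a do-file, not an excerpt from the official examples.

```stata
* Hypothetical using data: two children in family 1, one in family 2
clear
input family_id str8 child
1 "Ann"
1 "Bob"
2 "Cal"
end
sort family_id              // the using file is sorted by the join variable
tempfile kids
save `kids'

* Hypothetical master data: two parents in family 1, one in family 2
clear
input family_id str8 parent
1 "Dana"
1 "Eli"
2 "Fay"
end

* Form every parent-child pair within each family_id
joinby family_id using `kids'
sort family_id parent child
list, sepby(family_id)

* Family 1 contributes 2 x 2 = 4 pairs; family 2 contributes 1 x 1 = 1
assert _N == 5
```

Loading the children first and joining the parents instead would give the same five observations, in line with the statement that the order of the two datasets does not change the number of results.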


        +---------+
    ----+ Options +----------------------------------------------------------

update varies the action that joinby takes when an observation is matched. By default, values from the master data are retained when the same variables are found in both datasets. If update is specified, however, the values from the using dataset are retained where the master dataset contains missing.

replace, allowed with update only, specifies that nonmissing values in the master dataset be replaced with corresponding values from the using dataset. A nonmissing value, however, will never be replaced with a missing value.
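
As a rough illustration of how update and replace treat a variable found in both datasets, the sketch below joins the same pair of toy datasets three times. The variable income, its values, and the tempfile names inc and master are assumptions made for this example; the comments restate what the option descriptions above say should happen.

```stata
* Hypothetical using data: one income value per family
clear
input family_id income
1 500
2 600
end
sort family_id
tempfile inc
save `inc'

* Hypothetical master data: income is missing for family 1
clear
input family_id income
1 .
2 250
end
tempfile master
save `master'

* Default: master values are retained, so family 1 stays missing
joinby family_id using `inc'
list

* update: only the missing master value is filled in from the using data
use `master', clear
joinby family_id using `inc', update
list

* update replace: nonmissing master values are overwritten as well
use `master', clear
joinby family_id using `inc', update replace
list
```

Reloading the master data before each join keeps the three runs independent, so each listing reflects a single option combination.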

unmatched(none|both|master|using) specifies whether observations unique to one of the datasets are to be kept, with the variables from the other dataset set to missing. Valid values are

            none     ignore all unmatched observations (default)
            both     include unmatched observations from the master and
                       using data
            master   include unmatched observations from the master data
            using    include unmatched observations from the using data

_merge(varname) specifies the name of the variable that will mark the source of the resulting observation. The default name is _merge(_merge). To preserve compatibility with earlier versions of joinby, _merge is generated only if unmatched is specified.
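
A minimal sketch of unmatched() together with _merge() follows. The toy data and the marker name source are hypothetical; the point is only that the marker variable appears because unmatched() is specified, and that unmatched observations carry missing values for the other dataset's variables.

```stata
* Hypothetical using data: children in families 1 and 3
clear
input family_id str8 child
1 "Ann"
3 "Gus"
end
sort family_id
tempfile kids
save `kids'

* Hypothetical master data: parents in families 1 and 2
clear
input family_id str8 parent
1 "Dana"
2 "Fay"
end

* Keep unmatched observations from both sides and name the marker "source"
joinby family_id using `kids', unmatched(both) _merge(source)
sort family_id
list

* One matched pair (family 1) plus one unmatched observation from each side
assert _N == 3

* Show how many observations came from each source
tabulate source
```

With unmatched(master) or unmatched(using) in place of unmatched(both), only the unmatched observations from the named side would be kept.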

nolabel prevents Stata from copying the value-label definitions from the dataset on disk into the dataset in memory. Even if you do not specify this option, label definitions from the disk dataset do not replace label definitions already in memory.


    Setup
        . webuse child
        . describe
        . list
        . webuse parent
        . describe
        . list, sep(0)
        . sort family_id

    Join information on parents from data in memory with information on children from data at
        . joinby family_id using

    Describe the resulting dataset
        . describe

    List the resulting data
        . list, sepby(family_id) abbrev(12)
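
The same join can also be sketched against a local copy of the child data instead of a file named on the command line. This variant is an assumption made for illustration: it saves the child dataset to a temporary file (the tempfile name child is hypothetical) and is meant to be run from a do-file.

```stata
* Save a local copy of the child data, sorted by the join variable
webuse child, clear
sort family_id
tempfile child
save `child'

* Load the parent data and form all parent-child pairs within family_id
webuse parent, clear
joinby family_id using `child'

* Inspect the result
describe
list, sepby(family_id) abbrev(12)
```

Because unmatched() is not specified here, families that appear in only one of the two datasets are dropped from the result.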

© Copyright 1996–2018 StataCorp LLC