
st: New version of -keyby- on SSC

From   "Newson, Roger B" <>
To   statalist <>
Subject   st: New version of -keyby- on SSC
Date   Mon, 20 Apr 2009 10:28:51 +0100

Thanks to Kit Baum, a new version of the -keyby- package is now available for download from SSC. In Stata, use the -ssc- command to do this, or -adoupdate- if you already have an earlier version of -keyby-.

The -keyby- package is described on my website as shown below. The new version has a second module, -keybygen-, which sorts a dataset by a varlist that need not uniquely identify the observations, and generates a new variable containing, in each observation, the sequential order of that observation within its by-group. This variable is appended to the end of the existing varlist to form a primary key, which uniquely identifies the observations and by which the dataset is sorted. The -keyby- package is therefore a "clean" version of -sort-. It has a companion package, -addinby-, also downloadable from SSC, which is a "clean" version of -merge-. Together, the two packages can be used to enforce the relational database model, in which a dataset is a mathematical function whose domain is the set of existing value combinations of its primary key variables, and whose range is the set of all possible value combinations of its non-key variables.
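For readers who think more easily in code, the behaviour described above for -keybygen- can be sketched outside Stata. The following is only an illustration in Python of the idea (a stable sort by the varlist, followed by a sequence number within each by-group), not the package's implementation. The function name keybygen_sketch, the variable names patient, fev1 and seq, and the example data are all hypothetical.

```python
from itertools import groupby

def keybygen_sketch(rows, by, newvar):
    """Illustrative only: stable-sort rows by the `by` variables, then
    number the rows 1, 2, ... within each by-group in a new variable."""
    key = lambda row: tuple(row[v] for v in by)
    # sorted() is stable, so the existing order within each by-group is kept.
    ordered = sorted(rows, key=key)
    out = []
    for _, group in groupby(ordered, key=key):
        for seq, row in enumerate(group, start=1):
            out.append({**row, newvar: seq})
    return out

# Hypothetical data: "patient" alone does not identify the observations.
visits = [
    {"patient": 2, "fev1": 3.1},
    {"patient": 1, "fev1": 2.4},
    {"patient": 2, "fev1": 3.3},
    {"patient": 1, "fev1": 2.6},
]

keyed = keybygen_sketch(visits, by=["patient"], newvar="seq")

# (patient, seq) is now a primary key, and the dataset can be read as a
# function from key values to values of the non-key variables.
primary_key = [(r["patient"], r["seq"]) for r in keyed]
as_function = {(r["patient"], r["seq"]): r["fev1"] for r in keyed}
print(primary_key)  # [(1, 1), (1, 2), (2, 1), (2, 2)]
```

The dictionary as_function is the "dataset as a mathematical function" view: each existing combination of key values maps to exactly one combination of non-key values.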

Best wishes


Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Web page:
Departmental Web page:

Opinions expressed are those of the author, not of the institution.

package keyby from

      keyby: Key the dataset by a variable list

      keyby sorts the dataset currently in memory by the variables in a
      varlist, checking that the variables in the varlist uniquely
      identify the observations.  This makes the variables in the
      varlist a primary key for the dataset in memory.  If the user does
      not specify otherwise, then keyby also reorders the variables in
      the varlist to the start of the variable order in the dataset, and
      checks that all values of these variables are nonmissing.
      keybygen sorts the dataset currently in memory by the variables in
      a varlist, preserving the existing order of observations within
      each by-group, and then generates a new variable, containing the
      sequential order of each observation within its by-group, to form
      a primary key with the existing variables in the varlist.  keyby
      and keybygen can be useful if the user combines multiple datasets
      using merge, which may cause a dataset in memory to become
      Author: Roger Newson
      Distribution-Date: 19april2009
      Stata-Version: 10
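The default checks described for keyby in the package description above can be sketched in the same spirit. This is a minimal Python illustration of the stated behaviour (check for nonmissing key values, check that the key uniquely identifies the observations, sort, and move the key variables to the front), not the package's code. The name keyby_sketch, the use of None to stand for a missing value, and the example data are assumptions made for the illustration.

```python
def keyby_sketch(rows, key_vars):
    """Illustrative only: key a list of dict rows by `key_vars`.
    Mirrors the default behaviour described above; None stands in
    for a missing value (an assumption of this sketch)."""
    keys = [tuple(row[v] for v in key_vars) for row in rows]
    if any(value is None for key in keys for value in key):
        raise ValueError("missing values in key variables")
    if len(set(keys)) != len(keys):
        raise ValueError("key variables do not uniquely identify the observations")
    ordered = sorted(rows, key=lambda row: tuple(row[v] for v in key_vars))
    # Rebuild each row with the key variables first, then the rest.
    return [
        {**{v: row[v] for v in key_vars},
         **{name: value for name, value in row.items() if name not in key_vars}}
        for row in ordered
    ]

# Hypothetical data: (patient, visit) identifies each observation.
rows = [
    {"fev1": 3.1, "visit": 1, "patient": 2},
    {"fev1": 2.4, "visit": 1, "patient": 1},
    {"fev1": 2.6, "visit": 2, "patient": 1},
]

keyed = keyby_sketch(rows, ["patient", "visit"])
print(list(keyed[0]))  # ['patient', 'visit', 'fev1']
```

Calling keyby_sketch(rows, ["patient"]) on the same data raises ValueError, because patient alone does not uniquely identify the observations; that failure is the point of a "clean" sort.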

