Direct standardization (STB-21: sbe11)
----------------------

    ^dstndiz^ casevar popvar stratavars [if exp] [in exp], ^by^(groupvars)
            [^ba^se(#|"string") ^us^ing(filename) ^sav^ing(filename) ^pr^int
            ^f^ormat("%fmt") ^l^evel(#)]


Description
-----------

^dstndiz^ generates a summary measure of occurrence that can be used to
compare prevalence, incidence, or mortality rates between populations that
may differ with respect to certain characteristics (e.g., age, gender, race).
These underlying differences may affect the crude prevalence, mortality, or
incidence rates.


Options
-------

^by(groupvars)^ is not optional; it specifies the variables identifying the
    study populations. If ^base()^ is also specified, there must be only one
    variable in the ^by()^ group.

^using()^ or ^base()^ may be used to specify the standard population. Either
    option may be specified alone, or both may be omitted, but the two may
    not be specified together.

^using(filename)^ specifies the name of a file containing the standard
    population. The file must contain the ^popvar^ and the ^stratavars^.
    If ^using()^ is not specified, the standard population distribution
    will be obtained from the data.

^base(#|"string")^ specifies the value of ^groupvar^ that identifies the
    standard population. If neither ^base()^ nor ^using()^ is specified, the
    default is to use the entire data set to determine the standard
    population.

^saving(filename)^ saves the computed standard population distribution in a
    Stata data set that can be used in further analyses.

^print^ outputs a tabular summary of the standard population distribution
    before outputting the study populations specified in the ^by()^ option.

^format("%fmt")^ specifies the format in which to display the final summary
    table. The default is ^%10.0g^.

^level(#)^ specifies the confidence level, in percent, for the confidence
    intervals of the adjusted rates. The default is the current value of
    Stata's ^$S_level^ macro (initially set to 95).

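The option rules above can be illustrated with a few calls. This is only a
sketch assembled from the syntax diagram: the variable names (deaths, pop,
age_cat, nation) follow the mortality data used in the Examples section, and
the file name stdpop.dta is hypothetical.

```stata
* Hypothetical calls; variable names and the file name stdpop.dta are
* assumptions made for illustration only.

* Neither base() nor using(): the standard is taken from the entire data set.
* print shows the standard distribution first; saving() keeps it for later use.
dstndiz deaths pop age_cat, by(nation) print saving(stdpop)

* base(): one study population supplies the standard distribution.
* This requires exactly one variable in by().
dstndiz deaths pop age_cat, by(nation) base("Sweden")

* using(): the standard comes from a file that contains popvar and the
* stratavars; 90% intervals and a fixed display format are requested.
dstndiz deaths pop age_cat, by(nation) using(stdpop.dta) level(90) format("%9.6f")
```

Note that ^base()^ and ^using()^ never appear in the same call, since the two
options may not be combined.
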
Description
-----------

A frequently recurring problem in epidemiology and other fields is the
comparison of rates for some characteristic across different populations.
These populations often differ with respect to factors associated with the
characteristic under study; thus, the direct comparison of overall rates may
be quite misleading.

The direct method of adjusting for differences among populations involves
computing the overall rates that would result if, instead of having different
distributions, all populations had the same standard distribution. The
standardized rate is defined as a weighted average of the stratum-specific
rates, with the weights taken from the standard distribution. Direct
standardization may be applied only when the stratum-specific rates for a
given population are available.


Examples
--------

It will be easiest to understand this command if we start with a simple
example. Suppose we have data (Rothman 1986, 42) on mortality rates for
Sweden and Panama for the year 1962.

    . use mortality
    (1962 Mortality, Sweden & Panama)

    . de
    Contains data from mortality.dta
      Obs:     6 (max=  5117)        1962 Mortality, Sweden & Panama
     Vars:     4 (max=    99)
    Width:    15 (max=   200)
      1. nation    str6   %9s               Nation
      2. age_cat   byte   %8.0g   age_lbl   Age Category
      3. pop       float  %9.0g             Population in Age Category
      4. deaths    float  %9.0g             Deaths in Age Category
    Sorted by:

    . list
            nation    age_cat        pop     deaths
      1.    Sweden     0 - 29    3145000       3523
      2.    Sweden    30 - 59    3057000      10928
      3.    Sweden        60+    1294000      59104
      4.    Panama     0 - 29     741000       3904
      5.    Panama    30 - 59     275000       1421
      6.    Panama        60+      59000       2456

When the total number of cases in the population is divided by the
population, we obtain the crude rate:

    . collapse pop deaths, sum(pop deaths) by(nation)

    . list
            nation        pop     deaths
      1.    Panama    1075000       7781
      2.    Sweden    7496000      73555

    . gen crude = deaths/pop

    . list
            nation        pop     deaths       crude
      1.    Panama    1075000       7781    .0072381
      2.    Sweden    7496000      73555    .0098126

It is striking that the crude mortality rate in Sweden is higher than that
in Panama. The original data set suggests one possible explanation: the
Swedish population is older than the Panamanian population. This makes it
difficult to compare the mortality rates directly.

Direct standardization gives us a means of removing the distortion caused by
the differing age distributions. The adjusted rate is defined as the weighted
sum of the stratum-specific rates, where the weights are given by the
standard distribution. Suppose we wish to standardize these mortality rates
to the following age distribution:

    . use 1962
    (Std. Pop. Distribution)

    . list
           age_cat    pop
      1.    0 - 29    .35
      2.   30 - 59    .35
      3.       60+     .3

    . sort age_cat

    . save 1962, replace

With the original (uncollapsed) mortality data back in memory, we type

    . dstndiz deaths pop age_cat, by(nation) using(1962.dta)

    ----------------------------------------------------------
    -> nation= Panama
                      -----Unadjusted-----       Std.
                                  Pop. Stratum   Pop.
     Stratum     Pop.    Cases   Dist. Rate[s] Dst[P]     s*P
    ----------------------------------------------------------
      0 - 29   741000     3904   0.689  0.0053  0.350  0.0018
     30 - 59   275000     1421   0.256  0.0052  0.350  0.0018
         60+    59000     2456   0.055  0.0416  0.300  0.0125
    ----------------------------------------------------------
    Totals:   1075000     7781      Adjusted Cases:   17351.2
                                        Crude Rate:   0.00724
                                     Adjusted Rate:   0.01614
                        95% Conf. Interval: [0.01614 0.01614]

    ----------------------------------------------------------
    -> nation= Sweden
                      -----Unadjusted-----       Std.
                                  Pop. Stratum   Pop.
     Stratum     Pop.    Cases   Dist. Rate[s] Dst[P]     s*P
    ----------------------------------------------------------
      0 - 29  3145000     3523   0.420  0.0011  0.350  0.0004
     30 - 59  3057000    10928   0.408  0.0036  0.350  0.0013
         60+  1294000    59104   0.173  0.0457  0.300  0.0137
    ----------------------------------------------------------
    Totals:   7496000    73555      Adjusted Cases:  115032.5
                                        Crude Rate:   0.00981
                                     Adjusted Rate:   0.01535
                        95% Conf. Interval: [0.01535 0.01535]

    Summary of Study Populations:
      nation         N      Crude   Adj_Rate     Confidence Interval
      Panama   1075000   0.007238   0.016141   [ 0.016139  0.016143]
      Sweden   7496000   0.009813   0.015346   [ 0.015346  0.015346]


Note
----

Due to improvements to ^dstndiz^ after STB-21 went to press, the format of
the screen output may differ slightly from the examples in insert sbe11.


Authors
-------

Tim McGuire and Joel Harrison, Stata Technical Bulletin


Also see
--------

STB:  sbe11 (STB-21)
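
As a cross-check on the Sweden and Panama example, the adjusted rates can be
recomputed by hand as the sum over strata of s*P. The following is a minimal
sketch, not a description of what ^dstndiz^ does internally. It uses current
Stata syntax, which differs from the older ^collapse^ syntax shown in the
example, and it stores age_cat as a string for simplicity. The variable names
rate, P, sP, and adj_rate are made up for illustration.

```stata
* Re-enter the 1962 mortality data from the example (Rothman 1986, 42).
* Assumption: age_cat is entered as a string rather than a labeled byte.
clear
input str6 nation str7 age_cat double(pop deaths)
"Sweden" "0 - 29"  3145000   3523
"Sweden" "30 - 59" 3057000  10928
"Sweden" "60+"     1294000  59104
"Panama" "0 - 29"   741000   3904
"Panama" "30 - 59"  275000   1421
"Panama" "60+"       59000   2456
end

* Stratum-specific rate s and standard population weight P (.35, .35, .30)
generate double rate = deaths/pop
generate double P    = cond(age_cat == "60+", .30, .35)

* Contribution of each stratum to the adjusted rate
generate double sP = rate*P

* Sum within nation: pop and deaths give the crude rate,
* and the sum of s*P is the adjusted rate
collapse (sum) pop deaths adj_rate = sP, by(nation)
generate double crude = deaths/pop

format crude adj_rate %9.6f
list nation pop crude adj_rate
```

The listed crude and adj_rate values should agree with the Crude and Adj_Rate
columns of the summary table in the example (0.007238 and 0.016141 for
Panama, 0.009813 and 0.015346 for Sweden).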