Cusum plot for binary variables                                 [STB-12: sqv7]
-------------------------------

    ^cusumb^ yvar xvar [^if^] [^in^] [^, yf^it^(^fitvar^) nog^raph ^noca^lc
        ^gen(^newvar^)^ graph_options ]

^cusumb^ produces a plot of the cumulative sum (cusum) of a binary (0/1)
variable yvar against a (usually) continuous variable xvar. The cusum is the
running (partial) sum of the proportion of ones in the sample minus yvar.

^cusumb^ sorts the data in the order of xvar. Ties in xvar are broken at
random, but always in identical fashion for a given dataset.

Interpretation: A U- or inverted U-shaped cusum indicates respectively a
negative or a positive trend of yvar with xvar. A sinusoidal shape is
evidence of a non-monotonic (for example, quadratic) trend of yvar with xvar.

^cusumb^ displays the maximum absolute cusum for monotonic and non-monotonic
trends of yvar on xvar. These are nonparametric tests of departure from
randomness of yvar with respect to xvar. Approximate P values for the tests
are given.


Options
-------

^yfit()^ calculates a cusum against fitvar, that is, the partial sums of the
    `residuals' fitvar minus yvar. Typically, fitvar is the predicted
    probability of a one obtained from a logistic regression analysis.

^nograph^ suppresses the plot.

^nocalc^ suppresses calculation of the cusum test statistics.

^gen()^ saves the cusum in newvar.

graph_options are any of the standard Stata graph options.


Example
-------

For the automobile dataset ^auto.dta^, we might wish to investigate the
relationship between ^foreign^ (0 = domestic, 1 = foreign) and car ^weight^:

    . ^use auto^
    . ^cusumb foreign weight^

The resulting plot is U-shaped, suggesting a negative monotonic relationship.
This trend is confirmed by a highly significant linear cusum statistic. In
English, the proportion of foreign cars diminishes with increasing weight.
Stated crudely, the domestic cars are heavier than the foreign ones.
We could have discovered that by typing ^tabulate foreign, summarize(weight)^
but such an approach does not give the full picture of the relationship. The
quadratic cusum (cusumQ) is not significant, so we do not suspect any
tendency for the very heavy cars to be foreign rather than domestic.

Note that this example is artificial, as we would not really try to model the
probability of a car being foreign given its weight!


Saved results
-------------

^cusumb^ saves in the system macros $S_1, ..., $S_8:

    ^$S_1^   Number of observations
    ^$S_2^   Proportion of ones in yvar
    ^$S_3^   Maximum linear cusum
    ^$S_4^   Normal deviate for $S_3
    ^$S_5^   P value for $S_3
    ^$S_6^   Maximum quadratic cusum
    ^$S_7^   Normal deviate for $S_6
    ^$S_8^   P value for $S_6


Author
------

Patrick Royston, Royal Postgraduate Medical School, London.
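

The cusum defined in the description (the running sum of the proportion of
ones minus yvar, taken in order of xvar) can be illustrated outside Stata.
The following is a minimal Python sketch of that arithmetic only, not the
code of ^cusumb^. The function name cusum_binary and the toy data are
hypothetical, ties keep their input order here rather than being broken at
random, and the test statistics and P values that ^cusumb^ reports are not
reproduced.

```python
def cusum_binary(y, x):
    """Running sum of (proportion of ones - y), taken in increasing order of x."""
    # Stable sort on x: tied values keep their input order in this sketch,
    # whereas cusumb breaks ties at random (reproducibly for a given dataset).
    order = sorted(range(len(y)), key=lambda i: x[i])
    p = sum(y) / len(y)              # proportion of ones in the sample
    total, cusum = 0.0, []
    for i in order:
        total += p - y[i]            # add this observation's deviation
        cusum.append(total)
    return cusum


# Made-up toy data: the ones sit at low x, i.e. a negative trend of y with x.
y = [1, 1, 1, 0, 0, 0]
x = [10, 20, 30, 40, 50, 60]

cs = cusum_binary(y, x)              # [-0.5, -1.0, -1.5, -1.0, -0.5, 0.0]
max_abs = max(abs(v) for v in cs)    # 1.5, the largest absolute excursion
```

The toy sequence falls and then climbs back to zero, the U shape that the
Interpretation paragraph associates with a negative trend. The cusum always
ends at zero because the deviations from the sample proportion sum to zero.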