The Stata News, Vol 37 No 3

In the spotlight: nptrend

New in Stata 17, the nptrend command has three additional tests for trend: the Cochran–Armitage test, the Jonckheere–Terpstra test, and the linear-by-linear trend test. Also new in Stata 17 is an option for computing exact p-values.

Say you have two variables, x and y, and you want to see whether there is a trend between them, that is, whether larger values of x are associated with larger values of y.

For trend tests, x typically defines groups that are ordered. For example, x might define groups of subjects given different drug doses, say, 10, 20, 30, or 40 mg, in a clinical trial.

The y variable is typically a response. Responses could be 0 or 1, for example, whether or not a drug gave relief for a migraine headache. Or responses could be ordered categories, such as the degree of relief: none, a little, some, a lot, or complete. Or responses could be continuous values, such as the time when relief began.

Let's run an example using nptrend when responses are 0 or 1. Here we have the variable dose containing the dose of the drug given to a subject. The variable relief is 0/1, with 0 indicating no relief of the migraine and 1 indicating partial or total relief.

. tabulate dose relief, row nokey

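           |  Relief of migraine
  Mycureit |     after 2 hours
dose in mg |         0          1 |     Total
-----------+----------------------+----------
        10 |        80        120 |       200
           |     40.00      60.00 |    100.00
-----------+----------------------+----------
        20 |        92        108 |       200
           |     46.00      54.00 |    100.00
-----------+----------------------+----------
        30 |        83        117 |       200
           |     41.50      58.50 |    100.00
-----------+----------------------+----------
        40 |        63        137 |       200
           |     31.50      68.50 |    100.00
-----------+----------------------+----------
     Total |       318        482 |       800
           |     39.75      60.25 |    100.00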

For 0/1 responses, the Cochran–Armitage statistic tests whether there is a linear trend in response probabilities by group.

. nptrend relief, group(dose) carmitage

Cochran–Armitage test for trend

   Number of observations =      800
         Number of groups =        4
Number of response levels =        2

             |               Mean
             |    Group  response    Number
       Group |    score     score    of obs
-------------+-----------------------------
dose         |
          10 |       10        .6       200
          20 |       20       .54       200
          30 |       30      .585       200
          40 |       40      .685       200

  Statistic = .003
  Std. err. = .0015476
          z = 1.939
 Prob > |z| = 0.0526

Test of departure from trend:
    chi2(2) = 5.45
Prob > chi2 = 0.0656

In addition to the Cochran–Armitage test for linear trend, nptrend reports a test of departure from linear trend, which checks for nonlinear association between relief and dose.
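The reported numbers can be cross-checked by hand from the tabulated counts. The sketch below uses the textbook closed forms for the Cochran–Armitage slope, its standard error under the null hypothesis, and the decomposition of the Pearson chi-squared statistic. These formulas are an assumption made for illustration, not a description of how nptrend computes its results internally, and the scalar names prefixed with ca_ are hypothetical.

```stata
* Cross-check of the Cochran–Armitage output from the tabulated data.
* Assumption: textbook closed forms, not necessarily nptrend's internals.
scalar ca_n    = 200                    // subjects per dose group
scalar ca_pbar = 482/800                // overall proportion with relief
scalar ca_xbar = (10 + 20 + 30 + 40)/4  // mean dose score (equal group sizes)

* Sum of squares of the scores and cross product with the proportions
scalar ca_sxx = ca_n*((10-ca_xbar)^2 + (20-ca_xbar)^2 + ///
                      (30-ca_xbar)^2 + (40-ca_xbar)^2)
scalar ca_sxy = ca_n*((10-ca_xbar)*(.6-ca_pbar)   + ///
                      (20-ca_xbar)*(.54-ca_pbar)  + ///
                      (30-ca_xbar)*(.585-ca_pbar) + ///
                      (40-ca_xbar)*(.685-ca_pbar))

scalar ca_b  = ca_sxy/ca_sxx                     // slope of relief on dose
scalar ca_se = sqrt(ca_pbar*(1-ca_pbar)/ca_sxx)  // standard error under the null
scalar ca_z  = ca_b/ca_se
scalar ca_p  = 2*normal(-abs(ca_z))              // two-sided normal p-value

* Pearson chi2(3) for equal proportions; departure from trend is chi2(3) - z^2
scalar ca_chi2 = ca_n*((.6-ca_pbar)^2 + (.54-ca_pbar)^2 + ///
                       (.585-ca_pbar)^2 + (.685-ca_pbar)^2)/(ca_pbar*(1-ca_pbar))
scalar ca_dep  = ca_chi2 - ca_z^2
scalar ca_pdep = chi2tail(2, ca_dep)

display "Statistic = " %5.3f ca_b "  Std. err. = " %9.7f ca_se "  z = " %5.3f ca_z
display "Prob > |z| = " %6.4f ca_p
display "Departure: chi2(2) = " %4.2f ca_dep "  Prob > chi2 = " %6.4f ca_pdep
```

Run as a do-file, this should agree with the nptrend output above up to rounding: a slope of .003 per mg, a standard error of .0015476, z = 1.939, and a departure statistic of 5.45 on 2 degrees of freedom.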

The p-value for the Cochran–Armitage statistic is calculated using a normal approximation to a permutation test. nptrend optionally computes exact permutation p-values. Here a Monte Carlo procedure with 100,000 random permutations is used.

. nptrend relief, group(dose) carmitage exact(montecarlo, reps(100000) dots(1000) rseed(1234))

Permutations (100,000): ..........10,000..........20,000..........30,000........
  ..40,000..........50,000..........60,000..........70,000..........80,000......
  ....90,000..........100,000 done

Cochran–Armitage test for trend

   Number of observations =      800
         Number of groups =        4
Number of response levels =        2

             |               Mean
             |    Group  response    Number
       Group |    score     score    of obs
-------------+-----------------------------
dose         |
          10 |       10        .6       200
          20 |       20       .54       200
          30 |       30      .585       200
          40 |       40      .685       200

  Statistic = .003
  Std. err. = .0015476
          z = 1.939
 Prob > |z| = 0.0526
 Exact prob = 0.0592 (100,000 Monte Carlo permutations)

Test of departure from trend:
    chi2(2) = 5.45
Prob > chi2 = 0.0656

The Monte Carlo estimate of the exact p-value is 0.0592, which is larger than the normal-approximation p-value of 0.0526. When an accurate p-value matters, as it does here with a result so close to 0.05, it is a good idea to compute an exact p-value using many Monte Carlo permutations.
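To see why many permutations help, it is useful to gauge the simulation error in the estimate. The sketch below applies the usual binomial standard error to the reported estimate. It assumes the 100,000 permutations are independent draws, and the scalar names prefixed with mc_ are hypothetical.

```stata
* Rough binomial Monte Carlo error for the estimated exact p-value.
* Assumption: each of the 100,000 permutations is an independent draw.
scalar mc_p    = 0.0592
scalar mc_reps = 100000
scalar mc_se   = sqrt(mc_p*(1 - mc_p)/mc_reps)
scalar mc_lo   = mc_p - 1.96*mc_se
scalar mc_hi   = mc_p + 1.96*mc_se

display "Monte Carlo std. err. = " %8.6f mc_se
display "Approx. 95% interval  = " %6.4f mc_lo " to " %6.4f mc_hi
```

Under this rough calculation, the standard error is about 0.0007 and the interval runs from about 0.058 to 0.061, so the gap from the normal-approximation value of 0.0526 is larger than simulation noise alone would explain. With only 1,000 permutations the same formula gives a standard error near 0.0075, which would not separate the two values.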

The Cochran–Armitage statistic tests for a linear trend in response probabilities for 0/1 responses. If the response is not 0/1, then nptrend with the option linear can be used to perform the linear-by-linear trend test, which is a permutation version of a Pearson correlation. It again tests for a linear trend.
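As a sketch of how this might look for the migraine study, suppose the degree of relief were recorded in a five-level variable. The variable relief5 below is hypothetical and is not part of the dataset shown earlier; only the linear option and the group() option come from the text.

```stata
* Hypothetical: relief5 is an ordered response coded 1 (none) to 5 (complete).
* It is not a variable in the dataset shown above.
nptrend relief5, group(dose) linear
```

The exact() suboptions shown earlier for the Cochran–Armitage test should carry over in the same way if a permutation p-value is wanted.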

If you want to test for a general trend, not just a linear trend, you can use the Jonckheere–Terpstra test. For any two distinct groups, say, group \(j\) and group \(j'\), the Jonckheere–Terpstra test looks at all possible pairs of responses \(y_{jk}, y_{j'k'}\), where \(k\) and \(k'\) run over all observations in groups \(j\) and \(j'\), respectively. The numbers of concordant and discordant pairs are counted, and the test statistic is the difference in the numbers of concordant and discordant pairs, summed across all pairs of distinct groups. No assumptions whatsoever are made about the functional form of the trend. So the Jonckheere–Terpstra test is a good choice when you want to test for trend but have no idea what the trend might be.
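In symbols, and assuming the groups are indexed in their trend order so that \(j < j'\) means group \(j\) comes first, the statistic described above can be written as follows. Counting tied pairs as zero is an assumption of this sketch rather than something stated in the text.

\[
J \;=\; \sum_{j<j'} \, \sum_{k} \sum_{k'} \operatorname{sgn}\!\left(y_{j'k'} - y_{jk}\right),
\qquad
\operatorname{sgn}(u) =
\begin{cases}
+1 & \text{if } u > 0 \\
\phantom{+}0 & \text{if } u = 0 \\
-1 & \text{if } u < 0
\end{cases}
\]

A pair contributes \(+1\) when the response in the later group is larger (concordant) and \(-1\) when it is smaller (discordant), so \(J\) is the number of concordant pairs minus the number of discordant pairs.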

Here's an example. There are three groups of sunglasses, each with a different amount of visible-light transmission. The response is ocular exposure to ultraviolet radiation. The data are as follows:

        Transmission of
Group   visible light     Ocular exposure to ultraviolet radiation
    1   < 25%             1.4  1.4  1.4  1.6  2.3  2.5
    2   25 to 35%         0.9  1.0  1.1  1.1  1.2  1.2  1.5  1.9  2.2  2.6  2.6
                          2.6  2.8  2.8  3.2  3.5  4.3  5.1
    3   > 35%             0.8  1.7  1.7  1.7  3.4  7.1  8.9  13.5
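To follow along, the data above can be typed in directly. This is a minimal sketch: the variable names group and exposure match the nptrend command used in this example, while the value-label name translbl is hypothetical.

```stata
* Enter the sunglasses data by hand (values copied from the table above).
clear
input byte group float exposure
1 1.4
1 1.4
1 1.4
1 1.6
1 2.3
1 2.5
2 0.9
2 1.0
2 1.1
2 1.1
2 1.2
2 1.2
2 1.5
2 1.9
2 2.2
2 2.6
2 2.6
2 2.6
2 2.8
2 2.8
2 3.2
2 3.5
2 4.3
2 5.1
3 0.8
3 1.7
3 1.7
3 1.7
3 3.4
3 7.1
3 8.9
3 13.5
end

* Value labels for the three transmission groups
label define translbl 1 "< 25%" 2 "25% to 35%" 3 "> 35%"
label values group translbl
```

This gives 32 observations, with 6, 18, and 8 in the three groups.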

Here's the output from nptrend when computing the Jonckheere–Terpstra test:

. nptrend exposure, group(group) jterpstra

Jonckheere–Terpstra test for trend

   Number of observations =       32
         Number of groups =        3
Number of response levels =       23

             |               Mean
             |    Group  response    Number
       Group |    score     score    of obs
-------------+-----------------------------
group        |
       < 25% |        1  1.766667         6
  25% to 35% |        2  2.311111        18
       > 35% |        3      4.85         8

 Statistic = 82
 Std. err. = 54.80056
         z = 1.496
Prob > |z| = 0.1346
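The statistic of 82 can be reproduced by counting pairs directly. The loop below is a brute-force sketch, assuming exposure and group are in memory as in the command above and that tied pairs count as neither concordant nor discordant. The scalar name jt_stat is hypothetical.

```stata
* Brute-force count of concordant minus discordant pairs across groups.
* Assumption: ties contribute zero; exposure and group are in memory.
scalar jt_stat = 0
forvalues i = 1/`=_N' {
    * Pairs with a later group and a larger response are concordant
    quietly count if group > group[`i'] & exposure > exposure[`i']
    scalar jt_stat = jt_stat + r(N)
    * Pairs with a later group and a smaller response are discordant
    quietly count if group > group[`i'] & exposure < exposure[`i']
    scalar jt_stat = jt_stat - r(N)
}
display "Concordant minus discordant pairs = " jt_stat
```

By group pair, the differences are 22 (groups 1 and 2), 24 (groups 1 and 3), and 36 (groups 2 and 3), which sum to 82. Dividing by the reported standard error gives 82/54.80056 = 1.496, the z statistic shown.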

For such a small sample size, you would likely want to run nptrend again with the exact option to get an exact p-value.
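A sketch of both ways to request it follows. The exact option is the one named in the text, and the Monte Carlo form reuses the suboptions from the migraine example. The run time of full enumeration depends on the number of possible permutations, so the Monte Carlo form may be the practical choice.

```stata
* Exact p-value for the Jonckheere–Terpstra test, as suggested in the text.
nptrend exposure, group(group) jterpstra exact

* Monte Carlo alternative, with the same suboptions as the migraine example
nptrend exposure, group(group) jterpstra ///
    exact(montecarlo, reps(100000) rseed(1234))
```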

nptrend also computes Cuzick's test for trend, which was available in earlier versions of nptrend.
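For the sunglasses data, the call might look like the sketch below, assuming the test is requested with an option named cuzick in the same way as the other three tests.

```stata
* Cuzick's test on the sunglasses data; the option name cuzick is assumed
nptrend exposure, group(group) cuzick
```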

— by Bill Sribney
Principal Statistician and Software Developer
