Statalist



Re: st: Test for trend in surveys


From   Steven Samuels <sjhsamuels@earthlink.net>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Test for trend in surveys
Date   Thu, 2 Oct 2008 11:37:19 -0400

Ángel,

You asked for a simple non-regression command for a trend in proportions with complex surveys. As you can see, there is none. All in all, I would go with Phil's original suggestion of using -svy: logit-.

In Stata, the Cochran-Armitage test is the test for trend in -tabodds- and -mhodds- (where it is called the "score test"). It is, in fact, the likelihood score test for a non-zero coefficient in a logistic model, but it has good properties whenever the dose-response curve is increasing or decreasing, even if it is not logistic (Gart & Tarone, 1983). The following code shows that the trend statistics of the Cochran-Armitage test, -svy: reg-, and -svy: logit- can be similar when applied to independent data.

When you have done your test for trend, you are not finished. If the trend is not significant, you may still have a strong association (think "U" or "inverted U" dose-response). If the trend is significant, other functions of the X variable may give better prediction.

(Ref: JJ Gart & RE Tarone, The Relation between Score Tests and Approximate UMPU Tests in Exponential Models Common in Biometry, Biometrics, Vol. 39, No. 3 (Sep., 1983), pp. 781-786).


********************CODE STARTS******************
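use http://www.stata-press.com/data/r10/bdesop, clear
expand freq               // one observation per subject
tab tobacco case, row     // proportion of cases at each tobacco level
tabodds case tobacco      // reports the score test for trend
svyset _n                 // each observation is its own PSU; triggers robust standard errors
svy: reg case tobacco
svy: logit case tobacco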
********************CODE ENDS**********************
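
To follow up on the point that a trend test is not the end of the analysis, here is a minimal sketch of two checks for a non-linear ("U"-shaped) dose-response on the same data. It assumes Stata 11 or later, because it uses factor-variable notation (c. and i. prefixes) that Stata 10 does not have. The choice of a quadratic term is only one illustrative option, not the only reasonable one.

********************CODE STARTS******************
use http://www.stata-press.com/data/r10/bdesop, clear
expand freq
svyset _n
* linear plus quadratic term in the tobacco score
svy: logit case c.tobacco##c.tobacco
* Wald test of the quadratic term (departure from a linear trend)
test c.tobacco#c.tobacco
* tobacco as unordered categories: overall test of association
svy: logit case i.tobacco
testparm i.tobacco
********************CODE ENDS**********************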


"I believe strata and clusters are not important because the formula for the
standard error of this nonparametric test (see Stata Reference Manual K-Q page
338) should not be affected by these specifications."

The standard error formula for a procedure that is not survey-enabled may say nothing about clustering or stratification, but with clustered, stratified, or weighted data that formula will be wrong. Consider a sample of 4 clusters, with 100 people subsampled in each and all weights equal, and the following data: 100/100, 0/100, 0/100, 100/100. Here the sample proportion is P = 200/400 = 0.5. What is the standard error? Is it 0.025, the square root of the standard formula PQ/400? If you run -prop-, the answer is SE = 0.02503 (Stata multiplies 0.025 by (400/399)^.5). If you properly -svyset- the data and identify the clusters, -svy: prop- will report a standard error of about 0.29. The "PQ/n" formula is totally misleading, because clustering has reduced the effective sample size to around four.
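
The four-cluster example can be reproduced with a few lines. This is a sketch: the variable names cluster and y are made up for illustration, and the data are built so that clusters 1 and 4 are all successes and clusters 2 and 3 are all failures.

********************CODE STARTS******************
clear
set obs 4
generate cluster = _n
* clusters 1 and 4: 100/100; clusters 2 and 3: 0/100
generate y = inlist(cluster, 1, 4)
expand 100
* ignores the design: standard error based on PQ/n
proportion y
* identifies the clusters as primary sampling units
svyset cluster
svy: proportion y
********************CODE ENDS**********************

The design-based standard error depends only on the variation among the four cluster proportions (1, 0, 0, 1), which is why it is so much larger than the PQ/n figure.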


-Steve
On Oct 2, 2008, at 3:07 AM, Ángel Rodríguez Laso wrote:


Thanks all for your answers.

When I wrote 'of type Pearson chi-squared' I didn't mean that
it was specifically chi-squared, but that it was of the type that
could be obtained as an option when performing a plain frequency
analysis, without having to carry out regressions.

Steve's proposal makes me a little bit nervous: I was taught that
using O.L.S. regression for a binary response is inadequate, but I
suppose there are exceptions.

Angel Rodriguez-Laso

2008/10/2 Steven Samuels <sjhsamuels@earthlink.net>:


There is, to my knowledge, no such thing as a test for trend of type Pearson
chi-squared. I suspect that Ángel is referring to the Cochran-Armitage
one-degree-of-freedom chi-square test for trend (A. Agresti, 2002,
Categorical Data Analysis, 2nd Ed. Wiley Books, Section 5.3.5).

Let Y be the 0-1 binary outcome variable and X be the variable which
contains category scores. One survey-enabled approach is Phil's suggestion:
use -svy: logit-.

However -svy: reg- will produce a result closer to that of the
Cochran-Armitage test. Why? The Cochran-Armitage test statistic is formally
equivalent to an O.L.S. regression of Y on X, with a standard error for
beta which substitutes the total variance for the residual variance. The
statistic is (beta/se)^2. The total variance is equal to P(1-P), where P is
the overall sample proportion. In other words, the standard error is
computed under the null hypothesis of equal proportions.
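
A minimal sketch of that calculation on unweighted, independent data, using the bdesop example data that appears elsewhere in this thread. The scalar names (ca_b, ca_P, ca_Sxx, ca_chi2) are made up for illustration. The result can be compared with the trend test reported by -tabodds-; I would expect the two to be close, but have not assumed they agree to the last digit.

********************CODE STARTS******************
use http://www.stata-press.com/data/r10/bdesop, clear
expand freq
* O.L.S. slope of Y on X
quietly regress case tobacco
scalar ca_b = _b[tobacco]
* total variance of Y under the null: P(1-P)
quietly summarize case
scalar ca_P = r(mean)
* sum of squared deviations of X about its mean
quietly summarize tobacco
scalar ca_Sxx = r(Var)*(r(N)-1)
* (beta/se)^2 with se^2 = P(1-P)/Sxx
scalar ca_chi2 = ca_b^2*ca_Sxx/(ca_P*(1-ca_P))
display "chi2(1) = " ca_chi2 "   p = " chi2tail(1, ca_chi2)
tabodds case tobacco
********************CODE ENDS**********************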

The -svy: reg- command will estimate the same regression coefficient, but
with a standard error that is robust to heterogeneity in proportions. In
both survey-enabled commands, t = (b/se) has a t distribution with degrees
of freedom (d.f.) based on the survey design; t^2 has an F(1, d.f.)
distribution.


-Steve



On Sep 30, 2008, at 6:39 AM, Philip Ryan wrote:

Well, the z statistic testing the coefficient on the exposure variable is as
valid and as useful a summary (test) statistic as the chi-square statistic
produced by a test of trend in tables. If you prefer chi-squares, you could
just square the z statistic to get the chi-square on 1 df. And if you prefer
likelihood ratio chi-squares to the Wald z (or Wald chi-square) then the
modelling approach can deliver that also.
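
As a rough illustration on independent (non-survey) data, using the bdesop example data from elsewhere in this thread: the Wald chi-square is the squared z, and for a model with a single covariate the model chi-square that -logit- stores in e(chi2) is the likelihood-ratio statistic. This sketch deliberately leaves out the -svy:- prefix; as far as I know, likelihood-ratio tests are not offered after survey estimation.

********************CODE STARTS******************
use http://www.stata-press.com/data/r10/bdesop, clear
expand freq
logit case tobacco
* Wald chi-square on 1 df: the square of the reported z
display "Wald chi2(1) = " (_b[tobacco]/_se[tobacco])^2
* likelihood-ratio chi-square for the one-covariate model
display "LR chi2(1)   = " e(chi2)
********************CODE ENDS**********************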

Phil

Quoting Ángel Rodríguez Laso <angelrlaso@gmail.com>:

Thanks to Philip and Neil for their advice.

Philip's proposal is absolutely compatible with survey data, but I was
interested in a summary statistic of the type of Pearson chi-squared.

In this respect, Neil puts forward a test (nptrend) that would be
perfect if it allowed complex survey specifications. I believe strata
and clusters are not important because the formula for the standard
error of this nonparametric test (see Stata Reference Manual K-Q page
338) should not be affected by these specifications. But nptrend does
not accept weights as an option, which I think makes it unsuitable for
complex survey analyses.

Angel Rodriguez Laso

2008/9/29 Philip Ryan <philip.ryan@adelaide.edu.au>:

For a 2 x k table [with a k-category "exposure" variable] just set up a
logistic dose-response model:

svyset <whatever>
svy: logistic <binary outcome var> <exposure var>

and check the coefficient of <exposure var>, along with its confidence
interval and P-value.

If you prefer a risk metric rather than odds, then use svy: glm..... with
appropriate link and error specifications.
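
A sketch of what those specifications might look like, using the bdesop example data from elsewhere in this thread purely to show the syntax (it is case-control data, so the risk estimates themselves are not interpretable). The identity link gives a risk difference per unit of exposure and the log link a risk ratio; either model can fail to converge when fitted probabilities approach 0 or 1.

********************CODE STARTS******************
use http://www.stata-press.com/data/r10/bdesop, clear
expand freq
svyset _n
* risk difference per unit of exposure: identity link
svy: glm case tobacco, family(binomial) link(identity)
* risk ratio per unit of exposure: log link
svy: glm case tobacco, family(binomial) link(log)
********************CODE ENDS**********************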

Phil


Quoting Ángel Rodríguez Laso <angelrlaso@gmail.com>:

Dear Statalisters,

Is there a way to carry out a test for trend in a two-way table in
survey analysis in Stata?

Many thanks.

Angel Rodriguez Laso
*
*--
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



