Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

Re: st: Comparing Chi2/L2 in different samples using bootstrap

From   Steven Samuels <>
Subject   Re: st: Comparing Chi2/L2 in different samples using bootstrap
Date   Mon, 6 Dec 2010 09:22:33 -0500


Contrary to your belief, it is very likely that the data sets can be pooled: just incorporate survey year into the stratum definition. Write a command like "egen new_stratum = group(year stratum)". Then -svyset- the combined sample with the new stratum variable, but with PSUs and weights from the individual years.
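
A minimal sketch of those pooling steps, assuming the three yearly files have already been appended into one dataset that contains a year variable. The names stratum and zip are assumptions (zip is used as the PSU identifier only because the original models cluster on it); weight1, vote5, and x y z come from the original post. Note that group() is an -egen- function.

```stata
* Sketch only: assumes the 1991, 1999, and 2007 files are already appended
* and that year, stratum, and zip exist (hypothetical names; zip is assumed
* to be the PSU identifier because the original models cluster on it).

* One stratum code per year-by-stratum combination
egen new_stratum = group(year stratum)

* Declare the pooled design: yearly PSUs and weights, combined strata
svyset zip [pweight = weight1], strata(new_stratum)

* Check the number of strata and PSUs per stratum before estimating
svydescribe

* Design-based fit on the pooled data (x y z stand in for the real predictors)
svy: mlogit vote5 x y z i.year, baseoutcome(1) rrr
```

Running -svydescribe- first is worthwhile because a stratum with a single PSU leaves standard errors missing unless it is handled, for example with the singleunit() option of -svyset-.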

Differing yearly design effects and sample sizes have no bearing on the validity of this approach. There might be some difficulty if the types and sizes of PSUs changed greatly between years. Also, you will have to take special steps if the surveys were rotating panels.


Steven J. Samuels
18 Cantine's Island
Saugerties NY 12477
Voice: 845-246-0774
Fax:    206-202-4783

On Dec 6, 2010, at 8:44 AM, Dmitriy Poznyak wrote:

Hello all,

I am estimating three identical multinomial models with bootstrap for different years of survey data, for instance, 1991, 1999 and 2007. Aside from comparing predicted probabilities, which I assume shouldn't pose any problem, I need to compare Chi2/L2 coefficients for the different variables in the model. The rationale for doing this is that the fit of the individual predictors (e.g. socio-demographic variables) declines over time. Here's where the question arises. Clearly, samples in different years have different sizes, and perhaps different design effects, and so on.

In order to possibly address these issues I ran the bootstrapped models with the same number of iterations in each case:

    bootstrap, reps(2000) force: mlogit vote5 x y z ... [pweight=weight1], base(1) cl(zip) rrr

Next, I test the effect of each predictor: test x; test z, etc. Again, the models' specification is identical for all years; what differs is the sample size and design.

Given the bootstrap method being used, is it possible to compare Chi2/L2 and perhaps pseudo R2 coefficients across the different samples in this case? If not, what should my strategy be? Note that pooling the datasets is not feasible for several reasons, such as weighting.

Thanks for your suggestions,
*   For searches and help try:
