Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: problems declaring convergence with weighted data?
From: Stas Kolenikov <[email protected]>
To: [email protected]
Subject: st: problems declaring convergence with weighted data?
Date: Wed, 25 May 2011 23:21:54 -0500
Dear Statalisters (and Stata Corp),
I am working with complex survey data, and I am somewhat surprised that
some -ml- estimators run into numerical difficulties when used with
weighted data. Consider this example:
webuse nhanes2, clear
* this one converges without any issues
mlogit region age i.sizplace i.hlthstat##i.race ///
    i.sex##c.bpsys##c.bpdias heartatk
* this one takes forever to converge, so I limited it to 50 iterations
mlogit region age i.sizplace i.hlthstat##i.race ///
    i.sex##c.bpsys##c.bpdias heartatk [pw=finalwgt], iter(50)
* OK, the pseudo-likelihood is a huge number because of the weights,
* so the convergence criteria have to be rescaled by the mean weight
sum finalwgt
local mw = r(mean)
* this one converges, but numeric problems are reported
mlogit region age i.sizplace i.hlthstat##i.race ///
    i.sex##c.bpsys##c.bpdias heartatk [pw=finalwgt], nrtol(`=1e-5*`mw'')
* this one, finally, converges
mlogit region age i.sizplace i.hlthstat##i.race ///
    i.sex##c.bpsys##c.bpdias heartatk [pw=finalwgt], nrtol(`=1e-3*`mw'')
* this one converges, too, but probably to an inferior solution
mlogit region age i.sizplace i.hlthstat##i.race ///
    i.sex##c.bpsys##c.bpdias heartatk [pw=finalwgt], nonrtol
I am running this as -mlogit ... [pw=weight]- rather than -svy: mlogit
...- so that I can see the iteration history and obtain the important
-e(ll)- statistic. In my actual application, -mlogit ... [pw=weight]-
converged, while -svy: mlogit ...- did not, so I also tried a different
scaling of -nrtol()-, setting it to something like abs(e(ll))*1e-5.
-svy: mlogit ...- does not report -e(ll)-.
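Another way to attack the same scaling problem, which I have not verified on this example, would be to rescale the weights rather than the tolerance. The lines below are only a sketch: the variable name wnorm is made up for illustration, and I am assuming that the default -nrtol()- is adequate once the weights average to 1. Multiplying probability weights by a constant should leave the point estimates and the robust standard errors unchanged, but -e(ll)- will be divided by the mean weight, so it is no longer comparable with the runs above.

```stata
webuse nhanes2, clear
* store the mean weight in r(mean) without printing the summary table
summarize finalwgt, meanonly
* wnorm is a hypothetical name: weights rescaled to have mean 1,
* so that they sum to the sample size rather than the population size
generate double wnorm = finalwgt / r(mean)
* same model as above, with the default convergence tolerances
mlogit region age i.sizplace i.hlthstat##i.race ///
    i.sex##c.bpsys##c.bpdias heartatk [pw=wnorm]
```

If this works as hoped, the iteration log should look much more like that of the unweighted run, since the pseudo-likelihood is back to the order of -e(N)-.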
I would expect the estimators, and -svy- especially, to recognize that
the pseudo-likelihood will be of the order of -e(N_pop)- rather than
-e(N)-, and hence to scale the convergence criteria accordingly. Does
this make sense? Since -svy- is aware that the command it runs is a
likelihood-based one (as evidenced by the suppressed -e(ll)-
statistic), it could redefine -nrtol()-, or whichever option prevents
the maximizer from declaring convergence.
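To spell out the scaling argument: the short derivation below assumes that -nrtol()- is compared against the scaled gradient g*inv(H)*g', as the -maximize- documentation describes it, and it writes the weights as w_i = c * \tilde{w}_i, where c is the mean weight and the \tilde{w}_i average to 1. The symbols c and \tilde{w}_i are my notation, not Stata's, and the sign convention for H may differ from what -ml- uses internally.

```latex
\begin{align*}
\ell_w(\theta) &= \sum_i w_i\,\ell_i(\theta)
                = c \sum_i \tilde{w}_i\,\ell_i(\theta)
                = c\,\ell_{\tilde{w}}(\theta), \\
g_w(\theta) &= \frac{\partial \ell_w}{\partial \theta}
             = c\,g_{\tilde{w}}(\theta), \qquad
H_w(\theta) = \frac{\partial^2 \ell_w}{\partial \theta\,\partial \theta'}
             = c\,H_{\tilde{w}}(\theta), \\
g_w'\,(-H_w)^{-1}\,g_w
  &= c^2\,g_{\tilde{w}}'\,\bigl(-c\,H_{\tilde{w}}\bigr)^{-1} g_{\tilde{w}}
   = c\;g_{\tilde{w}}'\,(-H_{\tilde{w}})^{-1}\,g_{\tilde{w}} .
\end{align*}
```

So under these assumptions the scaled gradient grows linearly in c, and a tolerance of 1e-5 times the mean weight on the original weights plays the same role as 1e-5 on weights normalized to mean 1.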
--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.