Re: st: problems declaring convergence with weighted data?

From	Steven Samuels <[email protected]>
To	[email protected]
Subject	Re: st: problems declaring convergence with weighted data?
Date	Thu, 26 May 2011 12:02:20 -0400

Very interesting, Stas.

A couple of observations:

1. A quick scan shows no difference in the output of the last two models.
2. I rescaled the weights to sum to the sample size (and reset the mw macro). Only the last two weighted models converged, as they did for you, and they produced the same parameters and standard errors. The only virtue of the rescaling was log-pseudolikelihood values that were readable (e.g., "-12825.611" instead of "-1.452e+08").
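
The rescaling described in item 2 could be done along the following lines. This is only a sketch under stated assumptions, not necessarily the exact commands that were run: the variable name wgt_n is hypothetical, and the mean weight is taken over all observations in the dataset rather than over the estimation sample.

```stata
webuse nhanes2, clear

* Rescale the sampling weights to have mean 1, so that they sum to the
* number of observations (wgt_n is a hypothetical variable name).
summarize finalwgt, meanonly
generate double wgt_n = finalwgt / r(mean)

* Check: the sum of the rescaled weights and the number of observations.
summarize wgt_n, meanonly
display r(sum) "   " r(N)

* The mean of the rescaled weights is 1, so the mw macro is reset to 1.
local mw = 1

* Refit one of the weighted models with the rescaled weights.
mlogit region age i.sizplace i.hlthstat##i.race ///
    i.sex##c.bpsys##c.bpdias heartatk [pw=wgt_n], nrtol(`=1e-3*`mw'')
```

With mw reset to 1, the -nrtol()- value here is simply 1e-3; the point estimates and standard errors should not depend on the scale of the weights.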

Steve

On May 26, 2011, at 12:21 AM, Stas Kolenikov wrote:

Dear Statalisters (and Stata Corp),

I am working with complex survey data, and am somewhat surprised that
running some -ml- estimators with weighted data faces numerical
difficulties. Consider this example:

webuse nhanes2, clear
* this one converges without any issues
mlogit region age i.sizplace i.hlthstat##i.race ///
    i.sex##c.bpsys##c.bpdias heartatk
* this one takes forever to converge, so I limited it to 50 iterations
mlogit region age i.sizplace i.hlthstat##i.race ///
    i.sex##c.bpsys##c.bpdias heartatk [pw=finalwgt], iter(50)
* OK, the pseudo-likelihood is a huge number because of weights, so
* the convergence criteria have to be rescaled
sum finalwgt
local mw = r(mean)
* this one converges, but numeric problems are reported
mlogit region age i.sizplace i.hlthstat##i.race ///
    i.sex##c.bpsys##c.bpdias heartatk [pw=finalwgt], nrtol(`=1e-5*`mw'')
* this one, finally, converges
mlogit region age i.sizplace i.hlthstat##i.race ///
    i.sex##c.bpsys##c.bpdias heartatk [pw=finalwgt], nrtol(`=1e-3*`mw'')
* this one converges, too, but probably to an inferior solution
mlogit region age i.sizplace i.hlthstat##i.race ///
    i.sex##c.bpsys##c.bpdias heartatk [pw=finalwgt], nonrtol

I am running this as -mlogit ... [pw=weight]- rather than -svy:
mlogit ...- so as to see the iteration history, as well as to obtain the
important -e(ll)- statistic. In my actual application, -mlogit ...
[pw=weight]- converged, while -svy: mlogit ...- did not, so I also
tried a different scaling of -nrtol()-, setting it to something
like abs(e(ll))*1e-5. -svy: mlogit ...- does not report e(ll).

I would expect that the estimators, especially -svy-, would recognize
that the pseudo-likelihood will be of the order -e(N_pop)- rather than
-e(N)-, and hence the convergence criteria would be scaled
accordingly. Does this make sense? Since -svy- is aware that the
command it runs is a likelihood-based one (as evidenced by the suppressed
-e(ll)- statistic), it would probably want to redefine the -nrtol()-,
or whichever option prevents the maximizer from declaring convergence.
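
The scaling argument can be written out as a short derivation. It is a sketch that assumes the relevant check is the scaled-gradient criterion that the Stata documentation gives for -nrtol()-, namely g H^{-1} g' compared against the tolerance (up to the sign convention used for H). The symbols \ell_i, g, H, and the constant c are notation introduced here, with g taken as a row vector.

```latex
\begin{align*}
\ell_w(\beta) &= \sum_i w_i\,\ell_i(\beta), \qquad
g_w = \frac{\partial \ell_w}{\partial \beta'}, \qquad
H_w = \frac{\partial^2 \ell_w}{\partial \beta\,\partial \beta'} \\[4pt]
\tilde w_i = c\,w_i \;\Longrightarrow\;
\ell_{\tilde w} &= c\,\ell_w, \qquad
g_{\tilde w} = c\,g_w, \qquad
H_{\tilde w} = c\,H_w \\[4pt]
g_{\tilde w}\,H_{\tilde w}^{-1}\,g_{\tilde w}'
  &= (c\,g_w)\,(c\,H_w)^{-1}\,(c\,g_w)'
   = c\;g_w\,H_w^{-1}\,g_w'
\end{align*}
```

Under this reading, the maximizer of the pseudo-likelihood is the same for any c > 0, but the quantity compared against the tolerance grows linearly in c. Multiplying the tolerance by the mean weight, as in nrtol(`=1e-5*`mw''), is then equivalent to dividing the weights by their mean and leaving the tolerance at 1e-5.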

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



