

From: Sergiy Radyakin <serjradyakin@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: tricks to speed up -xtmelogit-
Date: Tue, 21 Dec 2010 15:28:23 -0500

Hi, Jeph,

very interesting problem. Are the 150 variables related? E.g., are these 150 a single group of dummies? Or are they all independent: height/age/gender?

With 6 million observations there is some chance you will have some duplicates, which may let you reduce your sample a bit (just adjust the weights).

Given the rareness of your outcome, taking a simple subsample may yield just a few positives in the subsample. May I also suggest taking all positives and a random subsample of negatives, estimating the candidate starting values on that, and then running the full sample from them?

Finally, this command is not in the MP report, but have you investigated how it performs as N(CPU) grows?

Best regards,
Sergiy

On Tue, Dec 21, 2010 at 2:15 PM, Jeph Herrin <stata@spandrel.net> wrote:
> All,
>
> I am trying to estimate a series of models using 6 million observations;
> the observations are nested within 3000 groups, and the dichotomous
> outcome is somewhat rare, occurring in about 0.5% of observations.
> There are about 150 independent variables, and so my basic model looks
> like this:
>
> . xtmelogit Y x1-x150 || group:
>
> This took approximately 3 weeks to converge on a high end machine
> (3.2GHz, Intel Core i7, 24GB RAM). I saved the estimation result
>
> . est save main
>
> but now would like to estimate some related models of the form
>
> . xtmelogit Y x1-x150 z1 z2 || group:
>
> and would like to think I can shave some considerable time off the
> estimation using the prior information available. I tried
>
> . est use main
> . matrix b = e(b)
> . xtmelogit Y x1-x150 z1 z2 || group:, from(b) refineopts(iterate(0))
>
> but this gave me an error that the likelihood was flat and nothing
> proceed. So I've thought of some other approaches, but am not sure what
> I expect to be most efficient, and would prefer not to spend weeks
> figuring it out.
>
> One idea was to use a sample, estimate the big model, and then use
> that as a starting point:
>
> . est use main
> . matrix b = e(b)
> . gen byte sample = (uniform()*1000)<1
> . xtmelogit Y x1-x150 z1 z2 if sample || group:, from(b)
> . matrix b = e(b)
> . xtmelogit Y x1-x150 z1 z2 || group:, from(b) refineopts(iterate(0))
>
> Another was to first use Laplace iteration, and start with that result:
>
> . est use main
> . matrix b = e(b)
> . xtmelogit Y x1-x150 z1 z2 if sample || group:, from(b) laplace
> . matrix b = e(b)
> . xtmelogit Y x1-x150 z1 z2 || group:, from(b) refineopts(iterate(0))
>
> I'd appreciate any insight into which of these approaches might shave
> a meaningful amount of time off of getting the final estimates, or if
> there is another that I could try.
>
> thanks,
> Jeph
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
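
For readers who want to try Sergiy's suggestion of keeping all positives plus a random subsample of negatives, here is a minimal sketch, not a tested recipe. It reuses the variable names from Jeph's post (Y, x1-x150, z1, z2, group). The indicator name insample, the 5% sampling rate, and the seed are arbitrary assumptions made for illustration. Because negatives are undersampled, the subsample fit will not reproduce the full-sample intercept, so its estimates are meant only as starting values.

```stata
* Hypothetical sketch: insample, the 5% rate, and the seed are assumptions,
* not values taken from the thread.

* 1. Check whether exact duplicates exist (Sergiy's first suggestion)
duplicates report group Y x1-x150 z1 z2

* 2. Keep every positive and a random 5% of the negatives
*    (runiform() is the newer name for uniform(), used in the original post)
set seed 20101221
generate byte insample = (Y == 1) | (runiform() < 0.05)

* 3. Fit on the reduced sample to get candidate starting values
xtmelogit Y x1-x150 z1 z2 if insample || group:
matrix b = e(b)

* 4. Refit on the full sample, starting from those values
xtmelogit Y x1-x150 z1 z2 || group:, from(b)
```

Whether this saves time overall depends on how close the subsample estimates land to the full-sample solution, which the thread does not establish.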

**Follow-Ups**:
- **Re: st: tricks to speed up -xtmelogit-** *From:* Jeph Herrin <stata@spandrel.net>
- **Re: st: tricks to speed up -xtmelogit-** *From:* Stas Kolenikov <skolenik@gmail.com>

**References**:
- **st: tricks to speed up -xtmelogit-** *From:* Jeph Herrin <stata@spandrel.net>
