Stata The Stata listserver

Re: st: request for help - multi-level modelling with a big dataset using xtlogit


From   Chris Bojke <cb23@york.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: request for help - multi-level modelling with a big dataset using xtlogit
Date   Fri, 19 Jul 2002 14:22:24 +0100

Dear Bernadette

Stata 7 SE can have a matsize up to 11,000 which is closer to your
needs.  If you can't find anyone nearer to home with the bigger version,
e-mail the data to me and I can have a quick go for you if you like.

Alternatively (and I'm not sure whether this will have an effect on
matsize needed) try using fixed effects or altering the number of
points of support (currently 8) for the Gaussian quadrature.
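
A minimal sketch of those two alternatives, assuming the variable names
from the original post below (emerg, gestat, age, xdelmid, provid).
Quadrature applies only to the random-effects estimator, not to the
population-averaged (pa) model in the original command, and quad() is
the option name as best recalled for Stata 7, so check -help xtlogit-
before relying on it. Whether either form lowers the matsize
requirement is untested here.

```stata
* Sketch only: variable names are taken from the original post.

* Conditional fixed-effects logit, grouping on hospital (provid)
xi: xtlogit emerg i.gestat i.age i.xdelmid, i(provid) fe

* Random-effects logit with 8 quadrature points
* (quad() is assumed to be the Stata 7 option name)
xi: xtlogit emerg i.gestat i.age i.xdelmid, i(provid) re quad(8)
```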

Chris




"Alves, Bernadette" wrote:
> 
> I'm a student looking for help with my MSc dissertation looking at factors
> associated with delivery by caesarean section. It's an analysis of a
> database of about half a million records of women who gave birth in
> hospital.   I am using logistic regression and because my data are naturally
> grouped, I'm using a multi-level approach to take account of the correlation
> between women in the same hospital.  I am therefore using xtlogit (rather
> than logit).   I find that I cannot run xtlogit with my entire 500,000
> records - Stata comes back with an error saying that it needs to be able to
> set matsize to approximately 18,000.  Unfortunately the matsize limit for
> Stata 7.0 is 800.
> 
> I then took a 4% sample (approximately 20,000 records) which is the largest
> that Stata can cope with at a matsize of 800.  But, and here's the weird
> thing that I need help with.... The parameter estimates are very dependent
> on the sample I take. Sometimes I get a p-value of 0.05, for other samples I
> get a p-value of 0.7.  Here's an example of what I do to test whether
> xdelmid is a predictor of emergency caesarean section.
> 
>         sample 4  /* this gives me the 4% sample */
> 
>         xi: xtlogit emerg i.gestat i.age i.xdelmid, pa corr(exch) robust i(provid)
> 
>         testparm _Ixdel*  /* this does a Wald test on xdelmid */
> 
> Taking 10 different 4% samples, I find my estimates differ considerably and
> my p-values range from 0.04 to 0.71.
> 
> Why can't Stata cope with the full dataset and why are the parameter
> estimates so sensitive to the sample taken?
> 
> I would be extremely grateful if someone could help me with this.
> 
> Bernadette
> 


