st: Rare Events Regression


From   Robert Davidson <[email protected]>
To   [email protected]
Subject   st: Rare Events Regression
Date   Tue, 4 Mar 2014 09:46:33 -0500

Hello,

I am trying to estimate a model with about 20,000 observations but
only about 100 events (1s).  I have read about rare events models and
tried to implement two methods to deal with this issue, but I am
having some trouble with each of them.

Method 1: take a random sample of the population of 0s and estimate a
hazard model using stcox (it is a panel dataset).  I apply pweights
based on the true probability of an event.  I repeat the process many
times (say, 100) to ensure the results are not driven by a fluky draw,
and then average the coefficients across the estimations.  The
problem I have here is with the following code, which I use to
generate and save the standard errors of the coefficients in each
iteration:

matrix se=e(V)
matrix se2=vecdiag(cholesky(diag(vecdiag(se))))
matrix se2= vec(se2)

and I sometimes get an error that the matrix is not positive definite.
It is not because of missing variables/observations, so I am not sure
what is causing it.  Is there another way I can store the standard
errors inside a loop?  Or 'fix' the matrix to make it positive
definite?
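
For reference, a minimal sketch of one way to store standard errors in a loop without any Cholesky step. It reads each coefficient and standard error through _b[] and _se[] and collects them with postfile. One possible cause of the error (an assumption, not something confirmed in the post) is that a covariate is dropped in some random draws, which leaves a zero on the diagonal of e(V) so that cholesky() rejects the matrix. The dataset (sysuse cancer), the variable names, the 50% sampling fraction, and the weight of 2 are illustrative stand-ins, not the poster's data or design.

```stata
* Sketch only: sysuse cancer stands in for the real panel, and the 50%
* sampling fraction and weight of 2 are illustrative assumptions.
sysuse cancer, clear
set seed 12345

tempname memhold
tempfile results
postfile `memhold' rep b_age se_age using `results'

forvalues r = 1/100 {
    preserve
    * Keep every event and a random half of the non-events.
    generate double u = runiform()
    keep if died == 1 | u < 0.5
    * Weight each kept non-event by the inverse of its sampling fraction.
    generate double w = cond(died == 1, 1, 2)
    quietly stset studytime [pweight = w], failure(died)
    quietly stcox age drug
    * _b[] and _se[] read the coefficient and its standard error directly.
    post `memhold' (`r') (_b[age]) (_se[age])
    restore
}
postclose `memhold'

* One row per replication; summarize to get the averages.
use `results', clear
summarize b_age se_age
```

If the whole vector of standard errors is wanted as a matrix, taking element-wise square roots in Mata should avoid the positive-definiteness requirement, for example: mata: st_matrix("se2", sqrt(diagonal(st_matrix("e(V)")))). A zero variance then simply yields a zero standard error rather than an error.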


Method 2: use firthlogit to estimate a penalized maximum likelihood
regression.  This appears to deal with the bias created by having so
few events in the sample.  The problem I have here is that I cannot
figure out how to cluster the standard errors by group (firm) with
this model, and my observations are not independent of one another.
Does anyone know how to do this?
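
One commonly used workaround when an estimation command offers no clustered variance option is a cluster bootstrap, which resamples whole firms rather than single observations. The sketch below assumes that firthlogit (user-written, available from SSC) can be run under the bootstrap prefix; the simulated data and the names y, x1, x2, and firm are hypothetical placeholders, and 200 replications is an arbitrary choice.

```stata
* Sketch only: simulated data with hypothetical variable names.
* firthlogit is user-written; install once with: ssc install firthlogit
clear
set seed 2014
set obs 2000
generate firm = ceil(_n/20)          // 100 firms, 20 observations each
generate x1 = rnormal()
generate x2 = rnormal()
* Rare binary outcome: a low baseline probability of an event.
generate y = runiform() < invlogit(-4 + 0.8*x1)

* Resample firms with replacement and refit the model in each replicate.
bootstrap _b, reps(200) cluster(firm) seed(12345): firthlogit y x1 x2
```

The reported standard errors then reflect within-firm dependence through the resampling scheme. With very few events, some replicates may contain hardly any 1s, so it is worth checking how many replications completed successfully.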


I have also tried using relogit (seems inferior to firthlogit though)
and have looked at exlogistic and gevfit as well, though gevfit does
not seem appropriate for this test.


Thank you,
Robert Davidson