Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: st: Modelling extremely rare events (binary)

From	DE SOUZA Eric <[email protected]>
To	"[email protected]" <[email protected]>
Subject	RE: st: Modelling extremely rare events (binary)
Date	Tue, 14 Jun 2011 11:37:13 +0200

He has a whole page devoted to it:
http://gking.harvard.edu/category/research-interests/methods/rare-events


Eric de Souza
College of Europe
Brugge (Bruges), Belgium
http://www.coleurope.eu


-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Abhimanyu Arora
Sent: 14 June 2011 11:29
To: [email protected]
Subject: Re: st: Modelling extremely rare events (binary)

Hi
Perhaps you could have a look at Gary King's -relogit-?
Best
Abhimanyu

On Tue, Jun 14, 2011 at 10:05 AM, Markus Eberhardt <[email protected]> wrote:
> Hello everybody
>
> I have an empirical problem where for a very large dataset (panel, 
> around 20,000 panel members with over 60,000 observations) I have two 
> binary outcome variables A and B. The occurrence of either is 
> extremely rare: only about 1.5% and 0.1% of observations for A and B 
> respectively. I am for the time being treating this as a pooled panel, 
> so not accounting for any fixed effects at the panel member level. My 
> empirical model is made up of continuous and binary variables. In the 
> logit and probit I am estimating A and B separately, for biprobit 
> jointly, for mlogit I have four categories (0, A occurs, B occurs, 
> both occur). Ideally the analysis does account for the jointness of 
> the decision as in the biprobit and mlogit approaches.
>
> Here are my questions:
> (1) DOES THIS AT ALL MAKE SENSE? Having estimated logit, probit, 
> bivariate probit and multinomial logit I am concerned about the 
> viability of what I am doing to this data: given the minute share of 
> actual events occurring (1s, rather than 0s) is it at all possible 
> that a logit-type model could tell me anything meaningful? So far I am 
> getting interpretable empirical results, but it was put to me that 
> these were entirely unreliable (or even spurious) given the extreme 
> rarity of the event. Note that there are strong priors (from the 
> descriptive analysis) that a certain characteristic (binary) drives 
> the outcomes, so I imagine that a fixed effect and/or an interaction 
> of this binary characteristic with other (continuous) RHS variables 
> may provide an intuitive 'fit', but I am unsure whether this is 
> empirically satisfied.
> (2) USEFUL DIAGNOSTICS? My diagnostics for the model(s) are hampered 
> by the fact that it's difficult to get a handle on what constitutes a 
> substantial deviation of the predicted from the observed outcomes.
> Apart from -fitstat- type diagnostics, are there any other things I 
> could do to choose between rival models and/or to convince myself that 
> what I'm doing is at all meaningful in this challenging empirical 
> case?
> (3) ALTERNATIVE EMPIRICAL MODELS? Are there any other empirical 
> specifications that are better suited to fit this data? I tried to 
> search for extremely rare events such as earthquakes, but couldn't get 
> much out of it.
> (4) PANEL ELEMENT? Possibly a bridge too far, but would there be any 
> option to get the panel element of the data to have a bearing on the 
> empirics?
>
> Thanks a lot in advance.
> markus
>
> Markus Eberhardt
> ESRC Post-doctoral Research Fellow, Centre for the Study of African 
> Economies, Department of Economics, University of Oxford
> Stipendiary Lecturer, St Catherine's College, Oxford
>
> web: http://sites.google.com/site/medevecon/home
> email: [email protected]
> twitter: http://twitter.com/sjoh2052
> mail: Centre for the Study of African Economies, Department of 
> Economics, Manor Rd, Oxford OX1 3UQ, England
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
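
For a sense of scale, the shares quoted in the original post imply roughly 900 events for A and 60 for B. The sketch below only reproduces that arithmetic and one possible coding of the four mlogit categories. It is written in Python rather than Stata purely for illustration; treating "over 60,000" as exactly 60,000 is an assumption, and the names `expected_events` and `joint_category` are hypothetical, not taken from the thread.

```python
# Figures quoted in the original post. "Over 60,000 observations" is
# treated as exactly 60,000 here, which is an assumption.
N_OBS = 60_000
EVENT_SHARE = {"A": 0.015, "B": 0.001}


def expected_events(n_obs, share):
    """Approximate number of 1s implied by a sample size and an event share."""
    return round(n_obs * share)


def joint_category(a, b):
    """Hypothetical coding of the four mlogit categories:
    0 = neither, 1 = A only, 2 = B only, 3 = both."""
    return int(a) + 2 * int(b)


# Implied event counts for each outcome.
events = {name: expected_events(N_OBS, p) for name, p in EVENT_SHARE.items()}
print(events)
```

With only about 60 positive cases for B, and necessarily no more than that in the "both" category, the joint models have very few events to work with in their rarest cells.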
