
From: "Shehzad Ali" <sia500@york.ac.uk>

To: <statalist@hsphsun2.harvard.edu>

Subject: st: RE: RE: probit questions

Date: Wed, 25 Jun 2008 17:53:27 +0100

This is an interesting discussion on Wald chi2 statistics. I just thought I would share a related point. I had a similar problem when I was using the -cluster- option (for cluster sampling) in my probit model. The data have 25 clusters and about the same number of variables, with 900 observations in total. I wonder whether the Wald statistic issue also arises when the number of clusters and the number of variables are almost the same?

Cheers,
Shehzad

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Verkuilen, Jay
Sent: 25 June 2008 17:35
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: probit questions

Sun, Yan (IFPRI) wrote:

>I have a couple of questions about the probit model. My dependent variable is a 0/1 binary choice (1=invest in technology, 0=no investment) for user groups; the independent variables are the user groups' characteristics (around 20).

>1) Which model is the correct one: probit or logit? What is the Stata command for checking this?

Unless you have very large samples (which you don't), they are nearly indistinguishable. In general there is reason to prefer logit to probit when you have potentially extreme probabilities. The logistic distribution is very much like a t with 10 df in shape. The classic example of being able to tell the difference appears in chess ranking. The Elo system is, essentially, based on logistic regression. It was originally based on probit, but in practice it turned out that the probit didn't make enough extreme predictions.

>2) I have few observations (170 in total, but only around 60 are valid across all independent variables), and sometimes the regression does not report the "Wald chi2" statistic. What is the reason for this?

>3) I got a note right after the regression, which says "8 failures and 7 successes completely determined". What does this mean?

Simply put, you have too many independent variables for your sample. It sounds like you may have some missing data as well, since the number of valid observations is much smaller than the total number of observations. The failure of the standard errors and Wald statistics is one sign. The perfect predictions are another. You need to deal with the missing data (-findit ice-), and even then you have WAY too many independent variables for 170 observations. Very roughly speaking, you should have 10 observations per variable, and probably more for binary data, which don't carry that much information per observation. Either get more data or get rid of variables.

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
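
The rule of thumb quoted above (very roughly 10 observations per variable) can be checked with simple arithmetic. The following is only an illustrative Python sketch, not something from the original posts: the function names are hypothetical, the threshold of 10 is the heuristic mentioned in the thread rather than a formal test, and the figures are the approximate ones quoted in the messages.

```python
# Rough check of the "about 10 observations per variable" heuristic.
# The threshold is a rule of thumb from the thread, not a formal test.


def obs_per_variable(n_obs: int, n_vars: int) -> float:
    """Average number of usable observations per independent variable."""
    return n_obs / n_vars


def max_variables(n_obs: int, min_obs_per_var: int = 10) -> int:
    """Largest number of variables the heuristic would allow for n_obs."""
    return n_obs // min_obs_per_var


# Approximate figures quoted in the thread: about 60 valid observations
# (out of 170) and around 20 independent variables.
valid_obs, total_obs, n_vars = 60, 170, 20

print(obs_per_variable(valid_obs, n_vars))  # 3.0 observations per variable
print(max_variables(valid_obs))             # 6 variables with 60 valid observations
print(max_variables(total_obs))             # 17 variables even if all 170 were usable
```

On these numbers the model has about 3 observations per variable, well short of the heuristic, and even with no missing data 170 observations would support fewer than the roughly 20 variables used.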

**References**:

- **st: probit questions** *From:* "Sun, Yan (IFPRI)" <Y.SUN@CGIAR.ORG>
- **st: RE: probit questions** *From:* "Verkuilen, Jay" <JVerkuilen@gc.cuny.edu>


© Copyright 1996–2016 StataCorp LP