 Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: RE: prob by using binreg or logit

 From Steve Samuels To statalist@hsphsun2.harvard.edu Subject st: RE: prob by using binreg or logit Date Fri, 14 Jun 2013 11:48:06 -0400

```But you shouldn't use -binreg- with survey data. Instead, you should
use one of:

• svy: logit  or svy: logistic
• svy: glm with link(log) and family(binomial) options
• svy: regress, with 0-1 indicators, for risk differences

If you don't use these, then standard errors, p-values, and CIs will be
incorrect.

I use -margins- to compare grouped and predicted results on the probability scale.
This is one way of deciding which method to use. Another is to look at ROCs (below):

After -svy: logit- or -logistic-, you can test goodness of fit with the contributed
program -svylogitgof- ("findit")

If you intend to predict outcomes after -svy: logit-, you'll need receiver operating characteristics (ROCs).
To get these most easily, use plain, non-survey logit with frequency weights that agree with the
probability weights to the nearest integer:

. gen new_wt = round(old_weight,1)
. logit....    [fw = new_wt]
. lroc
. lsens

Steve

On Jun 13, 2013, at 4:06 AM, tshmak wrote:

Dear Carsten,

There certainly are differences. rr stands for risk ratio. or stands for odds ratio. I assume you know the difference between "odds" and "risk". -binreg- with the rr option assumes that the log of the risk (or probability) is a linear function of the covariates. -logit-, or binreg with the -or- option, assumes that the log of the odds is a linear function of the covariates. That should be enough to lead to differences.

HTH,
Tim

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of carsten hinrichsen
Sent: 12 June 2013 23:14
To: statalist@hsphsun2.harvard.edu
Subject: st: prob by using binreg or logit

Dear statalisters,

I am working with survey data and want to find the probability of participation by analyzing the binary variable of participation (yes/no).

As far as I know, I could use binreg with the rr option
or
I could use logit and use the odds to calculate the probability.

I've tried both and get slightly different results. I've been looking through the Stata help but can't figure out what the difference is between these two methods.
So I'm wondering: are there different assumptions behind these two methods that I should take into consideration?
And should I prefer one of the methods to the other?

Any help is appreciated.

Kind regards
Carsten Hinrichsen
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

```
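The three survey-aware alternatives to -binreg- listed in Steve's reply, followed by -margins-, can be sketched roughly as below. This is only an illustration under assumptions: the example dataset (`nhanes2d`, loaded with `webuse`), its design variables (`psuid`, `finalwgt`, `stratid`), and the model variables (`highbp`, `age`, `female`) are not from the thread and stand in for the poster's own survey data and participation variable.

```stata
* Illustrative data only: nhanes2d and its variables are NOT from the thread.
webuse nhanes2d, clear

* Declare the survey design (PSU, probability weight, strata).
svyset psuid [pweight = finalwgt], strata(stratid)

* 1. Odds ratios: survey-aware logistic regression.
svy: logit highbp age i.female, or

* Predicted probabilities by group, on the probability scale.
margins female

* 2. Risk ratios: log link with binomial family.
svy: glm highbp age i.female, family(binomial) link(log) eform

* 3. Risk differences: linear regression on the 0-1 outcome.
svy: regress highbp age i.female
```

The log-link model in step 2 can fail to converge for some data, so it may not always be usable. Running -margins- after each fit gives predicted probabilities that can be compared across the three models, in the spirit of the comparison described in the reply.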