Notice: On April 23, 2014, Statalist moved from an email list to a forum.

st: RE: prob by using binreg or logit

From   Steve Samuels
Subject   st: RE: prob by using binreg or logit
Date   Fri, 14 Jun 2013 11:48:06 -0400

But you shouldn't use -binreg- with survey data. Instead, you should
-svyset- with the design information: weights, clusters, strata.
Use one of:

• svy: logit  or svy: logistic
• svy: glm with link(log) and family(binomial) options
• svy: regress, with 0-1 indicators, for risk differences

If you don't use these, then standard errors, p-values, and CIs will be incorrect.
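
A minimal sketch of the three approaches listed above. It assumes Stata's NHANES II example data (-webuse nhanes2d-) as a stand-in for your own survey; the outcome highbp and the covariates age and female are illustrative choices, not taken from the original question. The log-binomial model can fail to converge in some data sets.

```stata
* Example data shipped with Stata; substitute your own survey file
webuse nhanes2d, clear

* This data set comes with its design already declared; display it
svyset

* With your own data you would declare the design yourself, e.g.
* (psu_var, wt_var, and strata_var are hypothetical names):
*   svyset psu_var [pweight = wt_var], strata(strata_var)

* Odds ratios: design-based logistic regression
svy: logistic highbp age i.female

* Risk ratios: log-binomial GLM; eform reports exp(b)
svy: glm highbp age i.female, family(binomial) link(log) eform

* Risk differences: linear regression on the 0-1 outcome
svy: regress highbp age i.female
```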

I use -margins- to compare grouped and predicted results on the probability scale.
This is one way of deciding which method to use. Another is to look at ROCs (below):
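
One way such a -margins- comparison might look, again assuming the NHANES II example data and illustrative variables (highbp, age, female). The age bands in agecat are an arbitrary grouping created here for the comparison.

```stata
webuse nhanes2d, clear

* Arbitrary age bands for the grouped comparison
egen agecat = cut(age), at(20 40 60 80)

* Observed ("grouped") proportion with the outcome in each band
svy: mean highbp, over(agecat)

* Model-based predicted probabilities averaged within the same bands
svy: logit highbp age i.female
margins, over(agecat) vce(unconditional)
```

Large gaps between the observed and predicted proportions in some bands would suggest the model form does not fit well there.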

After -svy: logit- or -svy: logistic-, you can test goodness of fit with the contributed
program -svylogitgof- (-findit svylogitgof-).

If you intend to predict outcomes after -svy: logit-, you'll need receiver operating characteristics (ROCs).
To get these most easily, use plain, non-survey logit with frequency weights that agree with the
probability weights to the nearest integer:

. gen new_wt = round(old_weight,1)
. logit ... [fw = new_wt]
. lroc
. lsens
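
A filled-in version of the steps above, as a sketch only: it assumes the NHANES II example data, where finalwgt is the probability weight, and an illustrative model (highbp on age and female).

```stata
webuse nhanes2d, clear

* Round the probability weight so it can serve as a frequency weight
gen long new_wt = round(finalwgt, 1)

* Plain logit with frequency weights. Do not use its standard errors:
* each weight is treated as that many observed cases
logit highbp age i.female [fw = new_wt]

lroc     // ROC curve and area under the curve
lsens    // sensitivity and specificity against the probability cutoff
```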


On Jun 13, 2013, at 4:06 AM, tshmak wrote:

Dear Carsten, 

There certainly are differences. -rr- stands for risk ratio; -or- stands for odds ratio. I assume you know the difference between "odds" and "risk". -binreg- with the -rr- option assumes that the log of the risk (or probability) is a linear function of the covariates. -logit-, or -binreg- with the -or- option, assumes that the log of the odds is a linear function of the covariates. That should be enough to lead to differences.
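
Written out, with p the probability of participation and x'β the linear predictor (notation introduced here for illustration), the two models and the probabilities they imply are:

```latex
\begin{align*}
\text{binreg, rr (log link):} \quad
  & \log p = x'\beta
  & p &= \exp(x'\beta) \\
\text{logit (logit link):} \quad
  & \log\frac{p}{1-p} = x'\beta
  & p &= \frac{\exp(x'\beta)}{1+\exp(x'\beta)}
\end{align*}
```

So exp(β_j) is a risk ratio in the first model and an odds ratio in the second. The logit form keeps p between 0 and 1 for any value of x'β, while the log form does not. Odds ratios and risk ratios are close only when p is small, where odds and risk are nearly equal.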


-----Original Message-----
From: carsten hinrichsen
Sent: 12 June 2013 23:14
Subject: st: prob by using binreg or logit

Dear statalisters,

I am working with survey data and want to find the probability of participation by analyzing the binary variable of participation (yes/no).

As far as I know, I could use binreg with the rr option,
or I could use logit and use the odds to calculate the probability.

I've tried both and get slightly different results. I've been looking through the Stata help but can't figure out what the difference is between these two methods.
So I'm wondering: are there different assumptions behind these two methods that I should take into consideration?
And should I prefer one of the methods to the other?

Any help is appreciated.

Kind regards 
Carsten Hinrichsen 		 	   		  
*   For searches and help try:
