Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.

# st: RE: risk ratio

 From jhilbe@aol.com To statalist@hsphsun2.harvard.edu Subject st: RE: risk ratio Date Sun, 21 Mar 2010 10:51:47 -0400

In response to the StataLister asking about creating a synthetic binary response model that
```can be used to estimate a relative risk ratio:

```
I have an article coming out in the next Stata Journal that details how to create synthetic models for a wide variety of discrete response regression models. For your problem though, I think that the best approach is to create a synthetic binary logistic model with a single predictor - as you specified. Then model the otherwise logistic data as Poisson with a robust variance estimator. And the coefficient must be exponentiated. It can be interpreted as a relative
```risk ratio.

```
Below is code to create a simple binary logistic model. Then model as mentioned above. You asked for a continuous pseudo-random variate, so I generated it from a normal distribution. I normally like to use pseudo-random uniform variates rather normal variates when creating these types of models, but it usually makes little difference. Recall that without a seed the model results will differ each time run. If you want the same results, pick a seed. I used my birthday.
```
I hope that this is what you were looking for.

Joseph Hilbe

*  intercept = 2;  Beta for X1=0.75
clear
set obs 50000
set seed 1230
gen x1 = invnorm(runiform())
gen xb = 2 + 0.75*x1
gen exb = 1/(1+exp(-xb))
gen by = rbinomial(1, exb)
glm by x1, nolog fam(bin 1)
glm by x1, nolog fam(poi) eform robust

. glm by x1, nolog fam(bin 1)
```
Generalized linear models No. of obs = 50000 Optimization : ML Residual df = 49998
```                                                 Scale parameter = 1
```
Deviance = 37672.75548 (1/df) Deviance = .7534852 Pearson = 49970.46961 (1/df) Pearson = .9994494
```Variance function: V(u) = u*(1-u)                  [Bernoulli]
Link function    : g(u) = ln(u/(1-u))              [Logit]
```
AIC = .7535351 Log likelihood = -18836.37774 BIC = -503294.5
```-------------------------------------------------------------------------
-----
|                 OIM
```
by | Coef. Std. Err. z P>|z| [95% Conf. Interval]
```-------------+-----------------------------------------------------------
-----
```
x1 | .7534291 .0143134 52.64 0.000 .7253754 .7814828 _cons | 1.993125 .0149177 133.61 0.000 1.963887 2.022363
```-------------------------------------------------------------------------
-----

. glm by x1, nolog fam(poi) eform robust
```
Generalized linear models No. of obs =50000 Optimization : ML Residual df = 49998
```                                                 Scale parameter = 1
```
Deviance = 12673.60491 (1/df) Deviance = .2534822 Pearson = 7059.65518 (1/df) Pearson = .1411988
```Variance function: V(u) = u                        [Poisson]
Link function    : g(u) = ln(u)                    [Log]
```
AIC = 1.970592 Log pseudolikelihood = -49262.80246 BIC = -528293.7
```-------------------------------------------------------------------------
-----
|               Robust
```
by | IRR Std. Err. z P>|z| [95% Conf. Interval]
```-------------+-----------------------------------------------------------
-----
```
x1 | 1.104476 .0021613 50.78 0.000 1.100248 1.10872
```-------------------------------------------------------------------------
-----
.
Tomas Lind wrote:
```
Does anyone know how to generate fake data for a dichotomous outcome (0, 1) that is dependent on a continuous exposure variable in an epidemiological
```relative risk context. I know how to use the logit transformation but in
that case exposure is proportional to log(ods) and not to risk.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```