# st: simulating data from a linear probability model

 From Alexander Tsai <[email protected]> To [email protected] Subject st: simulating data from a linear probability model Date Tue, 28 Oct 2003 11:41:20 -0500

```
Dear Statalisters,

I would like to simulate data for a linear probability model with
response variable y, regressor(s) x, and known coefficients a and b.

If I wanted to simulate data from a logistic model, I could follow the
procedure suggested helpfully by Al Feiveson on this listserv (Nov
11'02):

generate z = a + b*x
generate p = exp(z)/(1+exp(z))
generate y = uniform()<=p

But I'm stumped as to how to go about simulating data from an LPM. I
can't simply draw a random error term to generate the response variable
y because of the heteroskedasticity problem. For efficient estimation of
the LPM, Goldberger suggests a weighted least squares procedure that
involves (1) estimating by OLS, (2) computing yhat(1-yhat), (3) using
weighted least squares with the weights w=sqrt(yhat(1-yhat)), and (4)
regressing y/w on x/w.

Have any other Statalisters encountered this problem before?

Many thanks,
Alex

```
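One possible way to approach the question, offered as a minimal sketch rather than a definitive answer: if a + b*x stays inside [0,1] for every observation, then y can be drawn as a Bernoulli variable with success probability p = a + b*x, in the same way as the logistic recipe quoted in the post. Such a draw has variance p(1-p), so the heteroskedasticity arises from the draw itself and no separate error term is needed. The values of a and b, the sample size, the seed, and the distribution of x below are all assumptions chosen for illustration; none of them come from the original post. The second half sketches the Goldberger steps listed in the post, including a transformed constant (1/w), which is an assumption about how the intercept would be handled.

```stata
* Illustrative values only: a, b, the sample size, and the range of x are assumptions
clear
set obs 1000
set seed 12345
scalar a = 0.2
scalar b = 0.5

* x is uniform on [0,1), so a + b*x stays between 0.2 and 0.7
generate x = uniform()

* Linear probability; the approach only works if p lies in [0,1]
generate p = scalar(a) + scalar(b)*x
assert p >= 0 & p <= 1

* Bernoulli draw: y = 1 with probability p, as in the logistic recipe
generate y = uniform() <= p

* Goldberger's procedure, step (1): OLS
regress y x

* Step (2): fitted values and yhat(1-yhat)
predict yhat

* Step (3): weights; observations with yhat outside (0,1) get a missing weight
generate w = sqrt(yhat*(1 - yhat)) if yhat > 0 & yhat < 1

* Step (4): regress y/w on x/w, with 1/w standing in for the constant
generate yw = y/w
generate xw = x/w
generate cw = 1/w
regress yw xw cw, noconstant
```

The final regression should give the same coefficients as `regress y x [aweight = 1/(yhat*(1-yhat))]` on the observations with a non-missing weight. In newer Stata releases `runiform()` is the documented name for the uniform generator; `uniform()` is kept here to match the code in the post.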