# st: RE: more questions about changing the distribution

From: "Lachenbruch, Peter"
Subject: st: RE: more questions about changing the distribution
Date: Tue, 18 Nov 2008 14:25:22 -0800

```I am confused about the intent of this message.  Forcing the distribution to be bimodal seems to be a consequence of the distribution.  Do you want a mixture of distributions?  I suspect I'm reacting to wording that is not exactly what I'm used to.  You seem to have a mixture of distributions.  Do you want to estimate the mixing parameter and the means and variances of the components?
Or is there something else here that I'm missing?

Tony

Peter A. Lachenbruch
Department of Public Health
Oregon State University
Corvallis, OR 97330
Phone: 541-737-3832
FAX: 541-737-4001

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Linn Renée Naper
Sent: Tuesday, November 18, 2008 7:37 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: more questions about changing the distribution

As some of you probably already noticed, I am working on a
distribution of prices trying to force the distribution
into being bimodal (two price peaks instead of one).

Well, below is the code I've been using so far.

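* summarize leaves the mean and sd of mip in r(mean) and r(sd)
sum mip
ret list

* standardize the price
gen u = (mip - `r(mean)')/`r(sd)'
* threshold on u that splits the sample into two groups
local p = -.3

* sd and mean for the lower (1) and upper (2) group
local sd1 `r(sd)'
local sd2 0.9*`r(sd)'

local mu1 `r(mean)'
local mu2 1.1*`r(mean)'

* rescale and shift u, with parameters chosen by which side of p it falls on
gen e = u * cond(u < `p', `sd1', `sd2') + cond(u < `p', `mu1', `mu2')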

mip is the original price, and I am using its distribution
to generate a standardized variable u, which I then transform
into a new variable e with a bimodal distribution.

My problem is that when I impose different means and sds for
the new distribution, I very quickly end up with a
"gap" in the distribution (intervals where no prices lie, obviously
related to the chosen p).
I want some distance between the two peaks (the two means defined).
In the example above I reduce sd2 by
10 percent and increase mu2 by only 10 percent. Increasing
the mean by more results in a larger gap.

Here p=-0.3, which is equal to the p25 in the generated u. (meaning I want
25 percent of the sample to vary around the lower peak, this can
of course be changed as well).

I think maybe what I need is to impose a third condition for the
observations, for example those between p25 and p50, to avoid having
the gap.
By looking at the code, can anyone see how this is possible?
Or maybe there is a better way to do all this?

thanks
Linn

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
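The gap described in the original post follows from the transform itself: the map from u to e jumps at u = p, from mu1 + sd1*p just below the threshold to mu2 + sd2*p at it, so the jump is (mu2 - mu1) + p*(sd2 - sd1). With the values in the post this is 0.1*mean + 0.03*sd, which is positive, and no value of e can fall inside that interval. One possible way to get two components without a gap, along the lines of the mixture raised in the reply, is to assign each observation to a component at random instead of by threshold. The sketch below is only an illustration under stated assumptions: `mip` is simulated because the original data are not available, and the names `low` and `e_mix` and the 25 percent mixing share are hypothetical choices. It uses `rnormal()` and `runiform()`, so it needs a Stata version that has them.

```stata
* Sketch only: mip is simulated as a stand-in for the real price variable
clear
set seed 12345
set obs 1000
gen mip = rnormal(100, 3)

* standardize as in the original post
quietly summarize mip
local m = r(mean)
local s = r(sd)
gen u = (mip - `m') / `s'

* same parameters as in the original post
local p   = -.3
local mu1 = `m'
local mu2 = 1.1 * `m'
local sd1 = `s'
local sd2 = 0.9 * `s'

* size of the jump in the threshold transform at u = p
local gap = (`mu2' - `mu1') + `p' * (`sd2' - `sd1')
display "jump at u = p: " `gap'

* mixture alternative: membership is random and independent of u,
* so the two components overlap and no interval is left empty
gen byte low = runiform() < 0.25
gen e_mix = cond(low, `mu1' + `sd1' * u, `mu2' + `sd2' * u)
```

Running `histogram e_mix` afterwards shows the resulting shape. Two distinct peaks appear only when the component means are well separated relative to the component standard deviations. With a 10 percent shift in the mean and a large sd, the mixture can still look unimodal.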