
st: RE: more questions about changing the distribution

From   "Lachenbruch, Peter" <>
To   <>
Subject   st: RE: more questions about changing the distribution
Date   Tue, 18 Nov 2008 14:25:22 -0800

I am confused about the intent of this message.  Forcing the distribution to be bimodal seems to be a consequence of the distribution.  Do you want a mixture of distributions?  I suspect I'm reacting to wording that is not exactly what I'm used to.  You seem to have a mixture of distributions.  Do you want to estimate the mixing parameter and the means and variances of the components?
Or is there something else here that I'm missing?


Peter A. Lachenbruch
Department of Public Health
Oregon State University
Corvallis, OR 97330
Phone: 541-737-3832
FAX: 541-737-4001

-----Original Message-----
From: [] On Behalf Of Linn Renée Naper
Sent: Tuesday, November 18, 2008 7:37 AM
Subject: st: more questions about changing the distribution

As some of you probably already noticed, I am working on a 
distribution of prices trying to force the distribution 
into being bimodal (two price peaks instead of one).

Well, below is the code I've been using so far.

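	* summarize the original price; r(mean) and r(sd) are used below
	summarize mip
	return list

	* standardized price and the cut point on the standardized scale
	gen u = (mip - r(mean))/r(sd)
	local p = -.3

	* standard deviations of the lower and upper parts
	local sd1 = r(sd)
	local sd2 = 0.9*r(sd)

	* means of the lower and upper parts
	local mu1 = r(mean)
	local mu2 = 1.1*r(mean)

	* rescale and shift each observation according to its side of the cut
	gen e = u * cond(u < `p', `sd1', `sd2') + cond(u < `p', `mu1', `mu2')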

The variable mip is the original price, and I use its distribution
to generate a standardized variable u, which I then transform
into a new variable e with a bimodal distribution.

My problem is that when I impose different means and sds on
the new distribution, I very quickly end up with a "gap" in
the distribution (intervals where no prices lie, obviously
related to the defined p).
I want some distance between the two peaks (the two means defined).
In the example above I reduce sd2 by 10 percent and increase
mu2 by only 10 percent. Increasing the mean by more results
in a larger gap.
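
The gap follows directly from the two branches of the transformation. At the cut point u = p the lower branch ends at mu1 + p*sd1, while the upper branch starts at mu2 + p*sd2, so no value of e can fall between the two whenever the second quantity exceeds the first. The sketch below checks this numerically. It is only an illustration: mip is not available here, so the price variable from the auto dataset shipped with Stata stands in for it (an assumption), and the locals lo and hi are names introduced for this example.

```stata
* Sketch only: price from the shipped auto dataset stands in for mip (assumption)
sysuse auto, clear
summarize price
local mu1 = r(mean)
local mu2 = 1.1*r(mean)
local sd1 = r(sd)
local sd2 = 0.9*r(sd)
local p   = -.3

* same transformation as in the posted code, stored in double precision
gen double u = (price - `mu1')/`sd1'
gen double e = u*cond(u < `p', `sd1', `sd2') + cond(u < `p', `mu1', `mu2')

* the two branches evaluated at the cut point u = p (lo, hi: illustrative names)
local lo = `mu1' + `p'*`sd1'
local hi = `mu2' + `p'*`sd2'
display "lower part ends at    " `lo'
display "upper part starts at  " `hi'
display "gap width             " `hi' - `lo'

* number of transformed prices strictly inside the gap
count if e > `lo' & e < `hi'
```

The gap width is (mu2 - mu1) + p*(sd2 - sd1), which for the settings above is 0.1*mean + 0.03*sd. That is why raising mu2 further widens the gap. Choosing mu2 = mu1 + p*(sd1 - sd2) would close it, at the cost of tying mu2 to the other three settings.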

Here p = -0.3, which equals the 25th percentile (p25) of the
generated u, meaning I want 25 percent of the sample to vary around
the lower peak (this can of course be changed as well).

I think maybe what I need is to impose a third condition for some
observations, for example those between p25 and p50, to avoid
having the gap.
By looking at the code, can anyone see how this is possible?
Or maybe there is a better way to do all this?
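
One possible alternative, in the spirit of a two-component mixture, is to assign each observation to the lower or upper component at random instead of by whether u lies below p. Both components then cover the whole range of u, so no interval is left empty by construction. The sketch below is only an illustration under stated assumptions: price from the shipped auto dataset stands in for mip, the seed is arbitrary, the variable name lower is made up for this example, and the 25 percent share and the 0.9 and 1.1 factors are taken from the posted code.

```stata
* Sketch only: price from the shipped auto dataset stands in for mip (assumption)
sysuse auto, clear
set seed 12345                      // arbitrary seed, for reproducibility only
summarize price
local m = r(mean)
local s = r(sd)

* standardized price, as in the posted code
gen double u = (price - `m')/`s'

* random component membership: about 25 percent go to the lower component
gen byte lower = runiform() < 0.25

* same location and scale choices as in the posted code
gen double e = u*cond(lower, `s', 0.9*`s') + cond(lower, `m', 1.1*`m')

* compare the two components
tabulate lower
summarize e if lower
summarize e if !lower
```

A histogram of e shows whether two peaks are actually visible, which depends on how far apart the two means are relative to the standard deviations. Unlike the deterministic split, this version does not preserve the ordering of the original prices.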

