Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.

# Re: st: how to find the integral for a portion of a normal distribution.

 From Steve Samuels To statalist@hsphsun2.harvard.edu Subject Re: st: how to find the integral for a portion of a normal distribution. Date Tue, 4 May 2010 16:03:33 -0400

```Buzz Burhans
My question here, technicalities aside, is why you think that the
normal distribution is a good fit to milk yields. A earlier post in
parameters has substantial probability below zero. This publication,
for what it's worth-I haven't read it, found a modified Weibull
distribution to be a good fit to milk
yields.((http://www.ncbi.nlm.nih.gov/pubmed/8046070) ) If you have
prior data, then -kdensity- will overlay the best fitting Normal on
top of a nonparametric estimated, and there are other diagnostic plots
in Stata (-help diagnostic plots-) You can fit parameters to the
Weibull and related distributions with -streg-. The Weibull is also
easy to simulate because there is a closed-form version of the
cumulative probability distribution.
Steve

On Tue, May 4, 2010 at 2:44 PM, Brian P. Poi <bpoi@stata.com> wrote:
> On Tue, 4 May 2010,
>
>>
>> Suppose I have an intervention applied to cows with a demonstrated mean
>> milk
>> yield response of +2.05 liters, sd 1.74.
>>
>> Suppose I am interested a 1 liter cut point.  I know how to find the
>> proportion of responses at or below the 1 liter response; and the
>> proportion
>> of responses at or above the one liter response.  This is what you are
>> doing
>> here, or it can be done with a z score.
>>
>> My question is, if the response is normally distributed, given n cows, how
>> many total liters were included or accumulated in the only responses below
>> 1
>> liter, and how many total liters were accumulated in only the responses at
>> or above 1 liter.  (Negative responses are possible). My interest is in
>> the
>> total liters, not the fraction of the total population either above or
>> below
>> the cutpoint.
>>
>
> Assuming I understand the question correctly, I think this does what you
> want.
>
> For a random variable distributed N(mu, sigma),
>
>   E[X | X < 1] = mu + sigma*lambda(alpha)
>
> where
>
>   alpha = (1 - mu) / sigma
>   lambda(alpha) = -phi(alpha) / Phi(alpha)
>
> Given a sample of size n, the number of observations less than 1 is
>
>   numltone = n*Phi(alpha)
>
> so the total amount of milk given that X < 1 is
>
>   milk = numltone*E[X | X < 1]
>
> In Stata,
>
> . clear all
> . set mem 200m
> (204800k)
> . set seed 1
> . drawnorm x, means(2.05) sds(1.74) n(10000000) clear
> (obs 10000000)
> . summ x if x < 1, mean
> . di r(sum)
> -188292.37
>
> . scalar alpha = (1 - 2.05) / 1.74
> . scalar lambda = -1*normalden(alpha) / normal(alpha)
> . scalar condmean = 2.05 + 1.74*lambda
> . scalar numltone = 10000000*normal(alpha)
> . scalar condtotal = condmean*numltone
> . di condtotal
> -187432.23
>
> Similarly,
>
>   E[X | X > 1] = mu + sigma*lambda'(alpha)
>
> where
>
>   lambda' = phi(alpha) / (1 - Phi(alpha))
>
> In Stata,
>
> . scalar lambda2 = normalden(alpha) / (1 - normal(alpha))
> . scalar condmean2 = 2.05 + 1.74*lambda2
> . scalar condtotal2 = condmean2*(10000000 - numltone)
> . di condtotal2
> 20687432
>
> . sum x if x > 1, mean
> . di r(sum)
> 20685854
>
>
>  -- Brian Poi
>  -- bpoi@stata.com
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

--
Steven Samuels
sjsamuels@gmail.com
18 Cantine's Island
Saugerties NY 12477
USA
Voice: 845-246-0774
Fax: 206-202-4783

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```