New statistical distribution functions were introduced in Stata 14.

New statistical distribution functions


Highlights

  • 42 new statistical functions for 5 distribution families
    • exponential
    • logistic
    • Weibull
    • Weibull as proportional hazards (PH)
    • Wishart
  • New functions let you compute
    • probabilities
    • cumulative distribution functions
    • densities
    • and more
  • 4 new noncentral and logarithmic statistical functions
    • inverse cumulative noncentral Student's t
    • inverse cumulative noncentral F
    • natural logarithm of the multivariate normal density
    • natural logarithm of the inverse gamma density
  • New random-number generators for 4 statistical distributions
    • exponential
    • logistic
    • Weibull
    • Weibull as proportional hazards (PH)
  • You no longer have to remember a formula to get
    random numbers in an interval!

Let's see it work

We will start with that last bullet point. The demonstration is simple, but if you frequently generate uniform random numbers over a range (or if you do it infrequently and have a poor memory), this will save you a lot of time.

Let's say we want to generate a random number that is uniformly distributed over an interval, say, (1,7). Back in the old days, we would have to do this with a formula.

Stata 14 introduces two new functions for uniform random numbers: runiform(a,b) and runiformint(a,b). runiform(a,b) returns uniform random numbers over the interval (a,b). Now, all we need to do is type

generate newvar = runiform(1,7)

runiformint(a,b) is used to obtain random integers over the interval [a,b]. It replaces the old method of typing a+int((b-a+1)*runiform()). Results differ slightly because runiformint(a,b) is more precise.
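As a minimal sketch of the difference, the block below generates the same kinds of variables with the old formulas and with the new functions. The variable names, the seed, the number of observations, and the integer range 1 to 6 are illustrative assumptions, not part of the original example.

```stata
* Illustrative sketch: old formulas versus the new two-argument functions.
* Variable names, seed, and sample size are arbitrary choices.
clear
set obs 1000
set seed 12345

* Uniform over (1,7): old formula a+(b-a)*runiform(), then the new function
generate u_old = 1 + (7-1)*runiform()
generate u_new = runiform(1,7)

* Random integers over [1,6]: old formula a+int((b-a+1)*runiform()),
* then the new function
generate k_old = 1 + int((6-1+1)*runiform())
generate k_new = runiformint(1,6)

* Both versions should stay within the requested range
summarize u_old u_new k_old k_new
tabulate k_new
```

The old and new variables will not be identical draw for draw, because each call consumes its own random numbers; only their ranges and distributions should agree.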

Now, let's take a look at just a couple of possible uses for the statistical distribution functions: simulation and visually comparing different survivor functions.

We want to simulate some survival data and compare our fitted results with the simulated data. This is possible with any of the new random-number generators for survival families, but we are going to demonstrate it for the Weibull(5,3) distribution.

First, we generate 100 observations.

. set obs 100

. set seed 31045

. generate y = rweibull(5,3)

Next, we will fit the model by using streg. We specify coeflegend because we need to know how to refer to the estimated parameters later.

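. stset y

     failure event:  (assumed to fail at time=y)
obs. time interval:  (0, y]
 exit on or before:  failure

------------------------------------------------------------------------------
        100  total observations
          0  exclusions
------------------------------------------------------------------------------
        100  observations remaining, representing
        100  failures in single-record/single-failure data
    279.762  total analysis time at risk and under observation
                                                at risk from t =         0
                                     earliest observed entry t =         0
                                          last observed exit t =  4.414602

. streg, distribution(weibull) time coeflegend

         failure _d:  1 (meaning all fail)
   analysis time _t:  y

Fitting constant-only model:
 (output omitted)
Fitting full model:

Iteration 0:   log likelihood =  10.578599

Weibull regression -- accelerated failure-time form

No. of subjects =          100                  Number of obs    =         100
No. of failures =          100
Time at risk    =   279.761945
                                                LR chi2(0)       =       -0.00
Log likelihood  =    10.578599                  Prob > chi2      =           .

------------------------------------------------------------------------------
          _t |      Coef.  Legend
-------------+----------------------------------------------------------------
       _cons |   1.110561  _b[_t:_cons]
-------------+----------------------------------------------------------------
       /ln_p |    1.65018  _b[ln_p:_cons]
-------------+----------------------------------------------------------------
           p |   5.207916
         1/p |   .1920154
------------------------------------------------------------------------------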

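The Weibull distribution has a shape parameter, a, and a scale parameter, b. Because we fit the model in the accelerated failure-time metric (the time option), we can obtain the estimated values of these parameters by exponentiating streg’s estimates: exp(ln_p) gives the shape, a, and exp(_cons) gives the scale, b. We use local macros to store these values and the mean of the distribution, b*Γ(1+1/a).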

. local a = exp(_b[ln_p:_cons])

. local b = exp(_b[_t:_cons])

. local mean : display %5.2f `b'*exp(lngamma(1+1/`a'))

. display "Fitted Weibull distribution:" ///
    _newline  "shape" _skip(10) "scale" _skip(10) "mean" ///
    _newline %5.2f `a' _skip(10) %5.2f `b' _skip(10) `mean'

Fitted Weibull distribution:
shape          scale          mean
 5.21           3.04          2.79
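The local macros above give point estimates only. If standard errors for the shape and scale are also wanted, one option is nlcom, which applies the delta method to transformations of the stored coefficients. The sketch below repeats the simulation so that it runs on its own; using nlcom here is our suggestion rather than part of the original walkthrough, and the labels shape and scale are arbitrary names.

```stata
* Illustrative sketch: delta-method standard errors for shape and scale.
* The simulation and model are repeated so the block is self-contained.
clear
set obs 100
set seed 31045
generate y = rweibull(5,3)

stset y
streg, distribution(weibull) time

* exp(ln_p) is the shape a; exp(_cons) is the scale b in the AFT metric.
* "shape" and "scale" are labels chosen for this example.
nlcom (shape: exp(_b[ln_p:_cons])) (scale: exp(_b[_t:_cons]))
```

The point estimates reported by nlcom should match the values stored in the local macros `a' and `b'.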

Using these estimated parameters, the true parameters we used to simulate the data, and the new weibullden() function, we can plot our fitted results and the true values with twoway. We also mark the true mean, 3*Γ(1+1/5), which is approximately 2.75.

We could type

. twoway function y = weibullden(5  ,3  ,x), range(1 5) || ///
         function y = weibullden(`a',`b',x), range(1 5)

to graph our true and estimated densities. Or we could add a few graph options and produce

[Graph: true and fitted Weibull densities, with the true mean marked]
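The exact options behind that graph are not shown. As a rough sketch of what "a few graph options" might look like, the block below adds a vertical line at the true mean, a dashed pattern for the fitted curve, and legend labels. The option choices and label text are assumptions, and the fitted values are hard-coded from the output above so the block runs on its own.

```stata
* Illustrative sketch: one possible set of options, not the ones actually
* used for the published graph.
* Fitted shape and scale copied (rounded) from the output above.
local a = 5.21
local b = 3.04

twoway function y = weibullden(5,3,x), range(1 5) ||                 ///
       function y = weibullden(`a',`b',x), range(1 5) lpattern(dash) ///
       xline(2.75)                                                   ///
       legend(label(1 "True Weibull(5,3)") label(2 "Fitted Weibull")) ///
       xtitle("y") ytitle("Density")
```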

The new distribution functions are also useful for understanding relationships between different statistical families.

We can see how survivor functions for various distributions relate to each other. Recall that the survivor function is 1 minus the cumulative distribution function, S(t) = 1 - F(t).

We plot the survivor function that corresponds to our Weibull(5,3) and add those for a Weibull(3,3) and a Weibull(1,3). To obtain the CDF of the Weibull distribution, we use weibull(a,b,x). We also plot an exponential(3) with a thin line. You will see that it lies exactly on top of the Weibull(1,3) curve because the Weibull(1,b) distribution is the same as the exponential(b). We use exponential(b,x) to get the CDF of the exponential distribution, again subtracting it from one to obtain the corresponding survivor function.

[Graph: Survivor Functions for Weibull(5,3), Weibull(3,3), Weibull(1,3), and exponential(3)]

Here is what we typed to obtain that graph.

twoway function y = 1-weibull(5,3,x),   range(0 5) lwidth(thick) ||  ///
       function y = 1-weibull(3,3,x),   range(0 5) lwidth(thick) ||  ///
       function y = 1-weibull(1,3,x),   range(0 5) lwidth(thick) ||  ///
       function y = 1-exponential(3,x), range(0 5) lwidth(thin) color(yellow) ///
       legend( label(1 "Weibull(5,3)")          ///
               label(2 "Weibull(3,3)")          ///
               label(3 "Weibull(1,3)")          ///
               label(4 "exponential(3)")        ///
               cols(1) ring(0) position(7))     ///
       xtitle("t")                              ///
       title(Survivor Functions)
The first four lines use the distribution functions; the rest is just about getting the graph to look the way we wanted.
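The overlap of the Weibull(1,3) and exponential(3) curves can also be checked numerically. The sketch below prints both CDFs at a few arbitrarily chosen time points, and then shows that the survivor function can be obtained either as 1 minus the CDF or directly from weibulltail(), the reverse cumulative Weibull function.

```stata
* Illustrative sketch: the evaluation points are arbitrary.
* Columns: t, Weibull(1,3) CDF, exponential(3) CDF
foreach t of numlist 0.5 1 2 4 {
    display %4.1f `t' _skip(4) %9.6f weibull(1,3,`t') ///
        _skip(4) %9.6f exponential(3,`t')
}

* Survivor function of the Weibull(5,3) at t = 2, computed two ways
display 1 - weibull(5,3,2)
display weibulltail(5,3,2)
```

The two columns of CDF values should agree at every point, as should the two survivor values.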

Tell me more

To find out more about all of Stata’s random-number and statistical distribution functions, see the new 157-page Stata Functions Reference Manual. You can find tips for working with the functions, means and variances of different distributions, and more.
