st: Fwd: stata question

From   Maarten Buis <>
Subject   st: Fwd: stata question
Date   Thu, 30 May 2013 09:58:45 +0200

---Bernard Alex wrote me privately:
> I found your contact on Statalist, read about you on your
> website and thought you can easily help me get through
> an exercise I need to perform.

That is not the way Statalist works. Questions should not be sent to
its individual members but to the list. There are very good reasons
for that, which are listed here:

> I have the size distribution of a group of firms. I know size
> is not normally distributed; the distribution is skewed and
> both left and right truncated (50-499 employees).
> Now, what I would like to get is the underlying data of size
> assuming it was normally distributed (the green line in the graph).
> Accordingly I tried to get the normal distribution given the true
> mean and SD: gen normala=invnorm(uniform())*97.27415+146.2396
> 97.27415 is the standard deviation and 146.2396 the mean.
> Question: is it possible to impose the support of the distribution?
> I would like to have the normal distribution between the two
> extremes 50 and 499.

That would mean you want to fit a truncated normal distribution to
your data and sample from that distribution. You can fit the
parameters of that distribution (for fixed truncation points, in your
case ll(50) and ul(499)) using -truncreg-. Then you can sample from
that distribution like in the example below:

*------------------ begin example ------------------
sysuse nlsw88, clear

// find the mean and standard deviation for the
// non-truncated normal
truncreg wage, ll(2) ul(40)

tempname mu sigma alpha beta diff
scalar `mu'    = _b[_cons]
scalar `sigma' = [sigma]_b[_cons]
scalar `alpha' = normal(( 2 - `mu') / `sigma')
scalar `beta'  = normal((40 - `mu') / `sigma')
scalar `diff'  = `beta' - `alpha'

// create 19 simulated variables from this distribution:
forvalues i = 1/19 {
    gen sim`i' = invnormal(                  ///
                 `alpha' + runiform()*`diff' ///
                 )*`sigma' + `mu'
}

// compare observed distribution with simulated distribution
local opts "sort lpattern(solid) lcolor(gs8)"
forvalues i = 1/19 {
    cumul sim`i', gen(c`i')
    local graph "`graph' line c`i' sim`i', `opts' ||"
}
cumul wage, gen(c)
twoway `graph' scatter c wage, msymbol(oh) ///
       legend(order(20 "observed" 1 "simulated"))
*------------------- end example -------------------
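
The simulation step in the example above is inverse-CDF sampling. As a compact restatement (the symbols a and b for the truncation points, 2 and 40 in the example, and U for the uniform draw are notation introduced here, not part of the original code), the scalars and the -gen- line correspond to:

```latex
\alpha = \Phi\!\left(\frac{a-\mu}{\sigma}\right), \qquad
\beta  = \Phi\!\left(\frac{b-\mu}{\sigma}\right), \qquad
X = \mu + \sigma\,\Phi^{-1}\!\bigl(\alpha + U(\beta-\alpha)\bigr),
\quad U \sim \mathrm{Uniform}(0,1)
```

Because \alpha + U(\beta-\alpha) is uniform between \alpha and \beta, X always falls between a and b and follows the fitted normal distribution restricted to that interval.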
(For more on examples I sent to the Statalist see: )
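
Applied to the firm-size question, the same steps might look like the sketch below. It is only an illustration under stated assumptions: the variable name size, the seed, and the made-up data that stand in for the real firm sizes are all hypothetical, and only the truncation points 50 and 499 come from the question.

```stata
// Hypothetical stand-in data: replace this part with the real data set
clear
set seed 12345
set obs 5000
gen size = rnormal(150, 100)        // made-up latent firm size
keep if size > 50 & size < 499      // only firms inside the bounds are observed

// fit the non-truncated normal, given the fixed truncation points
truncreg size, ll(50) ul(499)

tempname mu sigma alpha beta
scalar `mu'    = _b[_cons]
// as in the example above; depending on the Stata version,
// _b[/sigma] may be the form to use instead
scalar `sigma' = [sigma]_b[_cons]
scalar `alpha' = normal(( 50 - `mu') / `sigma')
scalar `beta'  = normal((499 - `mu') / `sigma')

// one draw per observation from the fitted normal restricted to 50-499
gen double simsize = invnormal(                        ///
        `alpha' + runiform()*(`beta' - `alpha')        ///
        )*`sigma' + `mu'

// the minimum and maximum of simsize should lie within 50 and 499
summarize size simsize
```

With real data it is worth checking the number of truncated observations that -truncreg- reports, since firms sitting exactly at a limit may be excluded from the estimation sample.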

> I hope you can help me follow through.
>
> Thank you in advance for your patience and any other suggestion.
>
> Best regards,
>
> Alex Bernard
> Junior Analyst - Ufficio Studi
> Mediobanca - Banca di Credito Finanziario S.p.A.
> Piazzetta E. Cuccia, 1 - 20121 Milano
> tel.   +39 02 8829.689
> fax    +39 02 8829.706


Maarten L. Buis
Reichpietschufer 50
10785 Berlin
