



Re: st: Questions for random data generation and value label


From   Joerg Luedicke <joerg.luedicke@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Questions for random data generation and value label
Date   Mon, 11 Mar 2013 16:45:10 -0400

You are still not saying which distribution you would like to sample
from! Any sample must be from _some_ distribution.
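
For illustration only, in case a truncated normal is what you have in
mind: the lines below are a minimal sketch for a do-file, not a
recommendation. The values of m, s, a, and b and the name SynVar1 are
placeholders that I made up. Note that m and s are the mean and SD of
the normal _before_ truncation; after truncation the mean and SD of
the result will generally differ from them, and the sample minimum and
maximum will only lie within [a,b], not equal a and b.

```stata
* Sketch only: m, s, a, b are placeholder values, not taken from any data
clear
set obs 1000
set seed 12345

local m = 50    // mean of the untruncated normal (assumed value)
local s = 10    // SD of the untruncated normal (assumed value)
local a = 30    // lower bound (assumed value)
local b = 80    // upper bound (assumed value)

* Inverse-CDF method: draw a uniform number between the normal CDF
* values at the two bounds, then map it back through the inverse CDF
local pa = normal((`a' - `m')/`s')
local pb = normal((`b' - `m')/`s')
generate double SynVar1 = `m' + `s'*invnormal(`pa' + (`pb' - `pa')*runiform())

* Compare the realized mean, SD, min, and max with the targets
summarize SynVar1
```

The -summarize- output shows how far the realized moments drift from
m and s once the bounds bite.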

Joerg

On Mon, Mar 11, 2013 at 4:28 PM, Yu Xue <snowrain@gmail.com> wrote:
> Thanks Maarten, David, Nick, Joerg !
>
> Let me use an example to describe my question more clearly.
>
> There is an actual dataset with three variables: Var1, Var2, Var3.
> Each of them has continuous numeric values. I get the max, min,
> SD, and mean of each, save them in several macros, and then
> clear the memory.
>
> Then I want to generate a synthetic dataset that also includes three
> variables: SynVar1, SynVar2, SynVar3. They should keep the same max, min,
> SD, and mean as Var1, Var2, Var3, respectively, in the actual data.
>
> I hope I have described it clearly.
> Thank you very much
>
>
> On Mon, Mar 11, 2013 at 12:48 PM, Joerg Luedicke
> <joerg.luedicke@gmail.com> wrote:
>> The normal distribution has support (-infinity, +infinity), so it is not
>> clear what you mean by 'range' here. Do you want to draw from a
>> truncated normal distribution?
>>
>> Joerg
>>
>> On Mon, Mar 11, 2013 at 12:49 PM, Yu Xue <snowrain@gmail.com> wrote:
>>> Thanks Maarten!
>>>
>>> What I want is a normal distribution. Is there a way to randomly
>>> generate a variable with a specific mean, SD, and range?
>>>
>>> Thanks!!
>>> Mark
>>>
>>> On Mon, Mar 11, 2013 at 10:35 AM, Maarten Buis <maartenlbuis@gmail.com> wrote:
>>>> On Mon, Mar 11, 2013 at 4:20 PM, Yu Xue wrote:
>>>>> I already checked "-help random_number_functions-", but I still
>>>>> cannot find the answer to my question.
>>>>>
>>>>> I know that I can use a formula like this:
>>>>> Var=a+int((b-a+1)*runiform()), to keep a specific range [a,b],
>>>>> and another formula: Var=invnorm(uniform())*SD+mean, to keep a
>>>>> specific standard deviation and mean.
>>>>> But I do not know how to generate a "Var" with a specific range, SD, and mean all at once.
>>>>> Please note that I am not drawing a sample from the actual data;
>>>>> what I want to generate is synthetic data (totally fake data).
>>>>
>>>> What distribution do you want to draw your new variable from? Do you
>>>> want it to be normally (Gaussian) distributed, gamma distributed, beta
>>>> distributed, Fisk distributed, Laplace distributed, ...? The number of
>>>> choices is huge, but without choosing your distribution you cannot
>>>> draw your random numbers.
>>>>
>>>> -- Maarten
>>>>
>>>>
>>>> ---------------------------------
>>>> Maarten L. Buis
>>>> WZB
>>>> Reichpietschufer 50
>>>> 10785 Berlin
>>>> Germany
>>>>
>>>> http://www.maartenbuis.nl
>>>> ---------------------------------
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/

