# Re: st: Drawing from a known, non-regular, discrete distribution

From: Lulu Zeng
To: "statalist@hsphsun2.harvard.edu"
Subject: Re: st: Drawing from a known, non-regular, discrete distribution
Date: Wed, 19 Feb 2014 19:51:49 +1100

```
Dear Nick,

Sorry, the (1..10)' in my example was a typo; I in fact used 1200
instead of 10 in my real run. It still didn't work. I also rescaled
"share" before calling Mata, and the same error occurs.

Also, using -rdiscrete()-, I can see that it draws random numbers
according to the distribution specified by "p" (in my case I write the
random draws into "odo2" using -st_store()-), but I don't understand
how -rdiscrete()- could draw from a given set of values (e.g., a
pre-specified "odo2"; this is really what I'm trying to do).

My apologies if the answer to my question is straightforward; I am
quite new to Mata.

Best Regards,
Lulu

On Wed, Feb 19, 2014 at 11:54 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> In my example, I have 10 probabilities in observations 1 to 10 of the
> data, so use
> (1..10)' as an argument. That will make sense for you if and only if
> Nick
> njcoxstata@gmail.com
>
>
> On 19 February 2014 00:09, Lulu Zeng <luluzengnz@gmail.com> wrote:
>> Dear Nick,
>>
>> Thank you for your suggestion. I must have done something wrong,
>> because Mata still gives me the error below even though I used
>> -p :/ sum(p)- for rescaling as you suggested (I also tried rescaling
>> the original probability variable, but neither worked):
>>
>> sum of the probabilities must be 1
>>              rdiscrete():  3300  argument out of range
>>                  <istmt>:     -  function returned error
>> r(3300);
>>
>>
>> My probability variable is "share", and "odo2" is my equivalent of
>> your "y". All I did was:
>>
>> mata
>>
>> p = st_data((1..10)', "share")
>>
>> p :/ sum(p)
>>
>> st_store(., "odo2", rdiscrete(st_nobs(), 1, p))
>> [the line above is where the error occurs]
>>
>>
>> My apologies for coming back with the same question again.
>>
>>
>> Best Regards,
>> Lulu
>>
>> On Tue, Feb 18, 2014 at 11:37 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>> Here is an example of using -rdiscrete()- in Mata. In your case, the
>>> probabilities are already in a variable. If -rdiscrete()- chokes on
>>> small differences in total from 1, then check the probabilities and if
>>> need be scale by -p :/ sum(p)-.
>>>
>>> . clear
>>>
>>> . set obs 1000
>>> obs was 0, now 1000
>>>
>>> . mat p = [0.2,0.2,0.1,0.1,0.1,0.1,0.05,0.05,0.05,0.05]
>>>
>>> . gen double p = p[1,_n]
>>> (990 missing values generated)
>>>
>>> . list in 1/10, sep(0)
>>>
>>>      +-----+
>>>      |   p |
>>>      |-----|
>>>   1. |  .2 |
>>>   2. |  .2 |
>>>   3. |  .1 |
>>>   4. |  .1 |
>>>   5. |  .1 |
>>>   6. |  .1 |
>>>   7. | .05 |
>>>   8. | .05 |
>>>   9. | .05 |
>>>  10. | .05 |
>>>      +-----+
>>>
>>> . gen y = .
>>> (1000 missing values generated)
>>>
>>> . mata
>>> ------------------------------------------------- mata (type end to exit) ------------------
>>> : p = st_data((1..10)', "p")
>>>
>>> : st_store(., "y", rdiscrete(st_nobs(), 1, p))
>>>
>>> : end
>>> --------------------------------------------------------------------------------------------
>>>
>>> . tab y
>>>
>>>           y |      Freq.     Percent        Cum.
>>> ------------+-----------------------------------
>>>           1 |        202       20.20       20.20
>>>           2 |        200       20.00       40.20
>>>           3 |         98        9.80       50.00
>>>           4 |        102       10.20       60.20
>>>           5 |         87        8.70       68.90
>>>           6 |         99        9.90       78.80
>>>           7 |         49        4.90       83.70
>>>           8 |         54        5.40       89.10
>>>           9 |         53        5.30       94.40
>>>          10 |         56        5.60      100.00
>>> ------------+-----------------------------------
>>>       Total |      1,000      100.00
>>> Nick
>>> njcoxstata@gmail.com
>>>
>>>
>>> On 18 February 2014 09:35, Nick Cox <njcoxstata@gmail.com> wrote:
>>>> The "mapping" (if I am guessing correctly) is in fact trivial as in
>>>> effect your sample would just be the observation numbers.
>>>> Nick
>>>> njcoxstata@gmail.com
>>>>
>>>>
>>>> On 18 February 2014 09:32, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>> Thanks for the details.
>>>>>
>>>>> The Mata function -rdiscrete()- should do most of what you want. You
>>>>> will need to map your values to integers 1 up and then read in the
>>>>> probabilities so that they are copied from a variable to a vector in
>>>>> Mata. Then select integers and reverse the mapping.
>>>>>
>>>>> Nick
>>>>> njcoxstata@gmail.com
>>>>>
>>>>>
>>>>> On 18 February 2014 09:17, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>>>> Dear Nick,
>>>>>>
>>>>>> My apologies for the unclear description.
>>>>>>
>>>>>> 1. I have 2 variables in Stata, one variable holds the 1200 known,
>>>>>> discrete values I want to draw; the other holds the corresponding
>>>>>> probabilities.
>>>>>>
>>>>>> 2. The 2 variables are associated with a parameter (attribute) of a
>>>>>> random utility model. I am trying to draw from the distribution of
>>>>>> this parameter of interest, and then divide it by the price parameter
>>>>>> (which similarly has 2 associated variables too) to obtain a
>>>>>> distribution of willingness to pay.
>>>>>>
>>>>>>
>>>>>> Best Regards,
>>>>>> Lulu
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Feb 18, 2014 at 7:47 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>>>> You have not, so far as I can see, specified
>>>>>>>
>>>>>>> 1. How you are holding information on your distribution. Is it 1200
>>>>>>> known values with associated probabilities (so as two variables in
>>>>>>> Stata), or is the information still outside Stata in some form?
>>>>>>>
>>>>>>> 2. What you expect to draw as a sample.
>>>>>>> Nick
>>>>>>> njcoxstata@gmail.com
>>>>>>>
>>>>>>>
>>>>>>> On 18 February 2014 03:58, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>>>>>> Dear Scott,
>>>>>>>>
>>>>>>>> Thank you for your response. My apologies, but I am still a little
>>>>>>>> confused about how to do this in my case, where I have 1,200
>>>>>>>> observations. Can I still use the cond() function without typing in
>>>>>>>> each point of the draw?
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Lulu
>>>>>>>>
>>>>>>>> On Tue, Feb 18, 2014 at 1:50 PM, Scott Merryman
>>>>>>>> <scott.merryman@gmail.com> wrote:
>>>>>>>>> http://www.stata.com/statalist/archive/2012-08/msg00256.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Scott
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Feb 16, 2014 at 9:15 PM, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>>>>>>>> Dear Statalist,
>>>>>>>>>>
>>>>>>>>>> I am seeking help with taking draws from a known, non-regular (not
>>>>>>>>>> normal or lognormal etc), discrete distribution.
>>>>>>>>>>
>>>>>>>>>> For example, taking draws from a distribution like the one below.
>>>>>>>>>> However, in my case I have 1,200 points instead of the 4 points given
>>>>>>>>>> in the example.
>>>>>>>>>>
>>>>>>>>>> Draw value     Probability
>>>>>>>>>>
>>>>>>>>>>     0.5                0.15
>>>>>>>>>>
>>>>>>>>>>     0.6                0.30
>>>>>>>>>>
>>>>>>>>>>     0.2                0.25
>>>>>>>>>>
>>>>>>>>>>     0.9                0.30
>>>>>>>>>>
>>>>>>>>>> The "draw value" is the value to be drawn, and "probability" is the
>>>>>>>>>> chance that each value is drawn, so the probabilities add up to 1.
```
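The thread as archived here stops before a worked answer to the two open questions: why the rescaling appeared to have no effect, and how to get draws of the actual values rather than of integers. A likely cause of the first, judging only from the code quoted above, is that `p :/ sum(p)` on a line by itself just displays the rescaled vector and leaves `p` unchanged; the result has to be assigned back with `p = p :/ sum(p)`. For the second, `rdiscrete()` returns indices 1, 2, ..., k, so the values can be recovered by using those indices to subscript a vector of values, which is the "reverse the mapping" step mentioned earlier. The following is a minimal sketch using the four-point example from the original post. The names `value`, `prob`, `draw`, `k`, `v`, and `idx` are illustrative assumptions, not names from the thread, and other causes (for example missing values among the probabilities) are not ruled out.

```stata
clear
set obs 1000
set seed 12345

* Four-point example from the original post, held in observations 1 to 4
matrix v = (0.5, 0.6, 0.2, 0.9)
matrix q = (0.15, 0.30, 0.25, 0.30)
gen double value = v[1,_n]
gen double prob  = q[1,_n]

* New variable to receive the draws (illustrative name)
gen double draw = .

mata:
k = 4                               // number of points in the distribution
v = st_data((1::k), "value")        // values to be drawn
p = st_data((1::k), "prob")         // corresponding probabilities
p = p :/ sum(p)                     // assign the rescaled vector back to p
idx = rdiscrete(st_nobs(), 1, p)    // indices 1..k drawn with probabilities p
st_store(., "draw", v[idx])         // reverse the mapping: index -> value
end

tabulate draw
```

The relative frequencies reported by `tabulate draw` should be close to the stated probabilities. For the data described in the thread, one would presumably set `k` to 1200 and read the values from `odo2` and the probabilities from `share`, storing the draws in a separate new variable so that the values in `odo2` are not overwritten.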