Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Drawing from a known, non-regular, discrete distribution

From: Nick Cox
To: "statalist@hsphsun2.harvard.edu"
Subject: Re: st: Drawing from a known, non-regular, discrete distribution
Date: Wed, 19 Feb 2014 00:54:27 +0000

In my example, I have 10 probabilities in observations 1 to 10 of the
data, so I use (1..10)' as an argument. That will make sense for you if
and only if your probabilities are also in observations 1 to 10.
Nick
njcoxstata@gmail.com
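
One possible reading of the error quoted below, offered as a sketch rather than a confirmed diagnosis: in Mata, a bare expression such as `p :/ sum(p)` only displays its result, so `p` stays unscaled unless the result is assigned back to it. Separately, `(1..10)'` reads only the first 10 observations. The sketch assumes that `share` is nonmissing in exactly the rows that hold probabilities and that `odo2` already exists as a numeric variable; both names are taken from the quoted message.

```stata
mata:
// Read every observation of -share-, then keep only the nonmissing rows
// (assumption: those rows are exactly the probabilities).
p = st_data(., "share")
p = select(p, p :< .)

// Assign the rescaled vector back to p; a bare -p :/ sum(p)- would only
// display the result and leave p unchanged.
p = p :/ sum(p)

// Draw one index per observation and store it in the existing variable.
st_store(., "odo2", rdiscrete(st_nobs(), 1, p))
end
```

The values stored in `odo2` are indices from 1 to `rows(p)`, not the draw values themselves.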

On 19 February 2014 00:09, Lulu Zeng <luluzengnz@gmail.com> wrote:
> Dear Nick,
>
> Thank you for your suggestion. I must have done something incorrectly,
> because Mata still gives me the error below even though I used
> -p :/ sum(p)- for rescaling as you suggested (I also tried rescaling the
> original probability variable, but neither worked):
>
> sum of the probabilities must be 1
>              rdiscrete():  3300  argument out of range
>                  <istmt>:     -  function returned error
> r(3300);
>
>
> My probability variable is "share", and "odo2" is my equivalent of
> your "y". All I did was:
>
> mata
>
> p = st_data((1..10)', "share")
>
> p :/ sum(p)
>
> st_store(., "odo2", rdiscrete(st_nobs(), 1, p))
>
> [this is where the error occurs]
>
>
> My apologies for coming back with the same question again.
>
>
> Best Regards,
> Lulu
>
> On Tue, Feb 18, 2014 at 11:37 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>> Here is an example of using -rdiscrete()- in Mata. In your case, the
>> probabilities are already in a variable. If -rdiscrete()- chokes on
>> small differences in total from 1, then check the probabilities and if
>> need be scale by -p :/ sum(p)-.
>>
>> . clear
>>
>> . set obs 1000
>> obs was 0, now 1000
>>
>> . mat p = [0.2,0.2,0.1,0.1,0.1,0.1,0.05,0.05,0.05,0.05]
>>
>> . gen double p = p[1,_n]
>> (990 missing values generated)
>>
>> . list in 1/10, sep(0)
>>
>>      +-----+
>>      |   p |
>>      |-----|
>>   1. |  .2 |
>>   2. |  .2 |
>>   3. |  .1 |
>>   4. |  .1 |
>>   5. |  .1 |
>>   6. |  .1 |
>>   7. | .05 |
>>   8. | .05 |
>>   9. | .05 |
>>  10. | .05 |
>>      +-----+
>>
>> . gen y = .
>> (1000 missing values generated)
>>
>> . mata
>> ------------------------------------------------- mata (type end to exit) ------------------
>> : p = st_data((1..10)', "p")
>>
>> : st_store(., "y", rdiscrete(st_nobs(), 1, p))
>>
>> : end
>> --------------------------------------------------------------------------------------------
>>
>> . tab y
>>
>>           y |      Freq.     Percent        Cum.
>> ------------+-----------------------------------
>>           1 |        202       20.20       20.20
>>           2 |        200       20.00       40.20
>>           3 |         98        9.80       50.00
>>           4 |        102       10.20       60.20
>>           5 |         87        8.70       68.90
>>           6 |         99        9.90       78.80
>>           7 |         49        4.90       83.70
>>           8 |         54        5.40       89.10
>>           9 |         53        5.30       94.40
>>          10 |         56        5.60      100.00
>> ------------+-----------------------------------
>>       Total |      1,000      100.00
>> Nick
>> njcoxstata@gmail.com
>>
>>
>> On 18 February 2014 09:35, Nick Cox <njcoxstata@gmail.com> wrote:
>>> The "mapping" (if I am guessing correctly) is in fact trivial as in
>>> effect your sample would just be the observation numbers.
>>> Nick
>>> njcoxstata@gmail.com
>>>
>>>
>>> On 18 February 2014 09:32, Nick Cox <njcoxstata@gmail.com> wrote:
>>>> Thanks for the details.
>>>>
>>>> The Mata function -rdiscrete()- should do most of what you want. You
>>>> will need to map your values to integers 1 up and then read in the
>>>> probabilities so that they are copied from a variable to a vector in
>>>> Mata. Then select integers and reverse the mapping.
>>>>
>>>> Nick
>>>> njcoxstata@gmail.com
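
The mapping described above can be sketched with the four-point distribution from the original post. This is a minimal illustration, not code from the thread: the variable name `draw` and the Mata names `v`, `p`, and `idx` are hypothetical. `rdiscrete()` returns indices, and subscripting the vector of values by those indices reverses the mapping.

```stata
clear
set obs 1000
gen double draw = .

mata:
// Draw values and probabilities from the original post (4 points).
v = (0.5 \ 0.6 \ 0.2 \ 0.9)
p = (0.15 \ 0.30 \ 0.25 \ 0.30)

// Draw one index in 1..4 per observation, with probabilities p.
idx = rdiscrete(st_nobs(), 1, p)

// Reverse the mapping: look up the value that belongs to each drawn index.
st_store(., "draw", v[idx, 1])
end

tabulate draw
```

With 1,200 points held in two Stata variables, `v` and `p` would instead be read with `st_data()`, as in the example quoted earlier in this message.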
>>>>
>>>>
>>>> On 18 February 2014 09:17, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>>> Dear Nick,
>>>>>
>>>>> My apologies for the unclear description.
>>>>>
>>>>> 1. I have 2 variables in Stata, one variable holds the 1200 known,
>>>>> discrete values I want to draw; the other holds the corresponding
>>>>> probabilities.
>>>>>
>>>>> 2. The 2 variables are associated with a parameter (attribute) of a
>>>>> random utility model. I am trying to draw from the distribution of
>>>>> this parameter of interest, and then divide it by the price parameter
>>>>> (which similarly has 2 associated variables too) to obtain a
>>>>> distribution of willingness to pay.
>>>>>
>>>>>
>>>>> Best Regards,
>>>>> Lulu
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Feb 18, 2014 at 7:47 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>>> You have not, so far as I can see, specified
>>>>>>
>>>>>> 1. How you are holding information on your distribution. Is it 1200
>>>>>> known values with associated probabilities (so as two variables in
>>>>>> Stata), or is the information still outside Stata in some form?
>>>>>>
>>>>>> 2. What you expect to draw as a sample.
>>>>>> Nick
>>>>>> njcoxstata@gmail.com
>>>>>>
>>>>>>
>>>>>> On 18 February 2014 03:58, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>>>>> Dear Scott,
>>>>>>>
>>>>>>> Thank you for your response. My apologies, but I am still a little
>>>>>>> confused about how to do this in my case, where I have 1,200
>>>>>>> observations. Can I still use cond() without typing in each
>>>>>>> point of the draw?
>>>>>>>
>>>>>>> Best Regards,
>>>>>>> Lulu
>>>>>>>
>>>>>>> On Tue, Feb 18, 2014 at 1:50 PM, Scott Merryman
>>>>>>> <scott.merryman@gmail.com> wrote:
>>>>>>>> http://www.stata.com/statalist/archive/2012-08/msg00256.html
>>>>>>>>
>>>>>>>>
>>>>>>>> Scott
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Feb 16, 2014 at 9:15 PM, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>>>>>>> Dear Statalist,
>>>>>>>>>
>>>>>>>>> I am seeking help with taking draws from a known, non-regular (not
>>>>>>>>> normal or lognormal etc), discrete distribution.
>>>>>>>>>
>>>>>>>>> For example, taking draws from a distribution like the one below.
>>>>>>>>> However, in my case I have 1,200 points instead of the 4 points given
>>>>>>>>> in the example.
>>>>>>>>>
>>>>>>>>> Draw value     Probability
>>>>>>>>>
>>>>>>>>>     0.5                0.15
>>>>>>>>>
>>>>>>>>>     0.6                0.30
>>>>>>>>>
>>>>>>>>>     0.2                0.25
>>>>>>>>>
>>>>>>>>>     0.9                0.30
>>>>>>>>>
>>>>>>>>> The "draw value" is the value to be drawn, and "probability" is the
>>>>>>>>> chance that each value is drawn, so the probabilities add up to 1.
>>>>>>>> *
>>>>>>>> *   For searches and help try:
>>>>>>>> *   http://www.stata.com/help.cgi?search
>>>>>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>>>>>> *   http://www.ats.ucla.edu/stat/stata/