
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Drawing from a known, non-regular, discrete distribution

From	Lulu Zeng <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: Drawing from a known, non-regular, discrete distribution
Date	Thu, 20 Feb 2014 21:05:06 +1100

Dear Nick,

Thank you so much for your reply.

The code works and, judging by the range of the results, seems to give
me the draws I am looking for.
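
A check that goes beyond the range is to compare the frequencies of the
drawn values with the probabilities. A minimal sketch, assuming the
four-value example table quoted further down this thread in place of the
real 1200 observations (the seed and the toy data are illustrative
assumptions, not part of the original code):

```stata
clear
set seed 20140220                  // arbitrary seed, only for reproducibility

* Toy distribution: values in odo, probabilities in share (4 rows, not 1200)
input double odo double share
0.5 0.15
0.6 0.30
0.2 0.25
0.9 0.30
end
set obs 1000                       // make room for 1000 draws

gen indices = .
mata
share = st_data((1..4)', "share")  // read only the rows that hold probabilities
share = share :/ sum(share)        // rescale so the probabilities sum to 1
y = rdiscrete(1000, 1, share)      // 1000 draws of row numbers 1..4
st_store((1..1000)', "indices", y)
end

gen double odo2 = odo[indices]     // map each drawn row number back to its value

* Compare the Percent column with share * 100
tab odo2
```

With 1000 draws the percentages should be close to, but not exactly
equal to, the probabilities multiplied by 100.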

But I have trouble understanding the last line of the code,
specifically what the square brackets do: gen odo2 = odo[indices]

I understand that it generates a new variable from the original values
and the draws, but I am not sure exactly what it does. I tried to look
up what the square brackets mean but didn't find anything on the
internet.

Could you please explain what the square brackets do?
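
For reference, in Stata varname[exp] is explicit subscripting: it
returns the value of varname in observation number exp. A minimal
sketch of how that applies to the line in question, assuming a four-row
toy dataset (the odo values come from the example table in this thread;
the contents of indices are made up for illustration):

```stata
clear
* Toy data: odo holds the values; indices holds row numbers (made up here)
input double odo indices
0.5 3
0.6 1
0.2 4
0.9 2
end

* odo[indices] is the value of odo in observation number indices
gen double odo2 = odo[indices]
list, sep(0)

* Observation 1 has indices == 3, so it receives odo from observation 3
assert odo2[1] == odo[3]
```

If indices were missing or outside the range 1 to _N in some
observation, odo2 would be missing there.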

Thank you for your consideration.

Best Regards,
Lulu





On Wed, Feb 19, 2014 at 11:48 PM, Nick Cox <[email protected]> wrote:
> Something like this?
>
> gen indices = .
> mata
> share = st_data(., "share")
> share = share :/ sum(share)
> y = rdiscrete(1000, 1, share)
> st_store((1..1000)', "indices", y)
> end
> gen odo2 = odo[indices]
> Nick
> [email protected]
>
>
> On 19 February 2014 09:20, Lulu Zeng <[email protected]> wrote:
>> Dear Nick and others,
>>
>> I have 1200 observations in my dataset.
>>
>> All 1200 observations of the variable "share" define the probabilities
>> (which add up to 1), and the 1200 corresponding pre-defined values to
>> be drawn are saved in the variable "odo".
>>
>> I am thinking of having 1000 draws in my sample.
>>
>> My data look like the table below (but with more points). Each draw
>> value is pre-defined and has a probability attached.
>>
>>  Draw value     Probability
>>      0.5            0.15
>>      0.6            0.30
>>      0.2            0.25
>>      0.9            0.30
>>
>> Thank you for your consideration :)
>>
>>
>> Best Regards,
>> Lulu
>>
>> On Wed, Feb 19, 2014 at 7:59 PM, Nick Cox <[email protected]> wrote:
>>> My own thoughts on "Thanks in advance" are codified in the FAQ.
>>> Seemingly no-one agrees with me.
>>>
>>> I will pose some questions here, but given other commitments I won't
>>> be able to respond to any answers until _much_ later today, local
>>> time. If someone else picks this up before then, fine by me,
>>> naturally!
>>>
>>> How many observations are in your dataset?
>>> How many observations define the probabilities?
>>> How many values do you want in your sample?
>>>
>>> Nick
>>> [email protected]
>>>
>>>
>>>
>>> On 19 February 2014 08:51, Lulu Zeng <[email protected]> wrote:
>>>> Dear Nick,
>>>>
>>>> Sorry, the (1..10)' in my example was a typo; I in fact used 1200
>>>> instead of 10 in my real experiment. Even so, it didn't work. I also
>>>> scaled "share" before calling Mata, and the same error occurs.
>>>>
>>>> Also, I can see that -rdiscrete()- draws random numbers according to
>>>> a distribution specified by "p" (and, in my case, -st_store()- writes
>>>> the draws into "odo2"), but I don't understand how -rdiscrete()- could
>>>> draw from a given set of values (e.g., a pre-specified "odo2" -- this
>>>> is really what I'm trying to do) instead of random values.
>>>>
>>>> My apologies if the answer to my question is straightforward; I am
>>>> quite new to Mata.
>>>>
>>>> Thank you very much for your help in advance Nick.
>>>>
>>>> Best Regards,
>>>> Lulu
>>>>
>>>>
>>>>
>>>> On Wed, Feb 19, 2014 at 11:54 AM, Nick Cox <[email protected]> wrote:
>>>>> In my example, I have 10 probabilities in observations 1 to 10 of the
>>>>> data, so use
>>>>> (1..10)' as an argument. That will make sense for you if and only if
>>>>> your probabilities are the same. See also help for -st_data()-.
>>>>> Nick
>>>>> [email protected]
>>>>>
>>>>>
>>>>> On 19 February 2014 00:09, Lulu Zeng <[email protected]> wrote:
>>>>>> Dear Nick,
>>>>>>
>>>>>> Thank you for your suggestion. I must have done something incorrectly,
>>>>>> because Mata still gives me the error below even though I used
>>>>>> -p :/ sum(p)- for rescaling as you suggested (I also tried rescaling
>>>>>> the original probability variable, but neither worked):
>>>>>>
>>>>>> sum of the probabilities must be 1
>>>>>>              rdiscrete():  3300  argument out of range
>>>>>>                  <istmt>:     -  function returned error
>>>>>> r(3300);
>>>>>>
>>>>>>
>>>>>> My probability variable is "share", and "odo2" is my equivalent of
>>>>>> your "y". All I did was:
>>>>>>
>>>>>> mata
>>>>>>
>>>>>> p = st_data((1..10)', "share")
>>>>>>
>>>>>> p :/ sum(p)
>>>>>>
>>>>>> st_store(., "odo2", rdiscrete(st_nobs(), 1, p))       [this is where
>>>>>> the error occurs]
>>>>>>
>>>>>>
>>>>>> My apologies for coming back with the same question again.
>>>>>>
>>>>>>
>>>>>> Best Regards,
>>>>>> Lulu
>>>>>>
>>>>>> On Tue, Feb 18, 2014 at 11:37 PM, Nick Cox <[email protected]> wrote:
>>>>>>> Here is an example of using -rdiscrete()- in Mata. In your case, the
>>>>>>> probabilities are already in a variable. If -rdiscrete()- chokes on
>>>>>>> small differences in total from 1, then check the probabilities and if
>>>>>>> need be scale by -p :/ sum(p)-.
>>>>>>>
>>>>>>> . clear
>>>>>>>
>>>>>>> . set obs 1000
>>>>>>> obs was 0, now 1000
>>>>>>>
>>>>>>> . mat p = [0.2,0.2,0.1,0.1,0.1,0.1,0.05,0.05,0.05,0.05]
>>>>>>>
>>>>>>> . gen double p = p[1,_n]
>>>>>>> (990 missing values generated)
>>>>>>>
>>>>>>> . list in 1/10, sep(0)
>>>>>>>
>>>>>>>      +-----+
>>>>>>>      |   p |
>>>>>>>      |-----|
>>>>>>>   1. |  .2 |
>>>>>>>   2. |  .2 |
>>>>>>>   3. |  .1 |
>>>>>>>   4. |  .1 |
>>>>>>>   5. |  .1 |
>>>>>>>   6. |  .1 |
>>>>>>>   7. | .05 |
>>>>>>>   8. | .05 |
>>>>>>>   9. | .05 |
>>>>>>>  10. | .05 |
>>>>>>>      +-----+
>>>>>>>
>>>>>>> . gen y = .
>>>>>>> (1000 missing values generated)
>>>>>>>
>>>>>>> . mata
>>>>>>> ------------------------------------------------- mata (type end to exit) ------------------
>>>>>>> : p = st_data((1..10)', "p")
>>>>>>>
>>>>>>> : st_store(., "y", rdiscrete(st_nobs(), 1, p))
>>>>>>>
>>>>>>> : end
>>>>>>> --------------------------------------------------------------------------------------------
>>>>>>>
>>>>>>> . tab y
>>>>>>>
>>>>>>>           y |      Freq.     Percent        Cum.
>>>>>>> ------------+-----------------------------------
>>>>>>>           1 |        202       20.20       20.20
>>>>>>>           2 |        200       20.00       40.20
>>>>>>>           3 |         98        9.80       50.00
>>>>>>>           4 |        102       10.20       60.20
>>>>>>>           5 |         87        8.70       68.90
>>>>>>>           6 |         99        9.90       78.80
>>>>>>>           7 |         49        4.90       83.70
>>>>>>>           8 |         54        5.40       89.10
>>>>>>>           9 |         53        5.30       94.40
>>>>>>>          10 |         56        5.60      100.00
>>>>>>> ------------+-----------------------------------
>>>>>>>       Total |      1,000      100.00
>>>>>>> Nick
>>>>>>> [email protected]
>>>>>>>
>>>>>>>
>>>>>>> On 18 February 2014 09:35, Nick Cox <[email protected]> wrote:
>>>>>>>> The "mapping" (if I am guessing correctly) is in fact trivial as in
>>>>>>>> effect your sample would just be the observation numbers.
>>>>>>>> Nick
>>>>>>>> [email protected]
>>>>>>>>
>>>>>>>>
>>>>>>>> On 18 February 2014 09:32, Nick Cox <[email protected]> wrote:
>>>>>>>>> Thanks for the details.
>>>>>>>>>
>>>>>>>>> The Mata function -rdiscrete()- should do most of what you want. You
>>>>>>>>> will need to map your values to integers 1 up and then read in the
>>>>>>>>> probabilities so that they are copied from a variable to a vector in
>>>>>>>>> Mata. Then select integers and reverse the mapping.
>>>>>>>>>
>>>>>>>>> Nick
>>>>>>>>> [email protected]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 18 February 2014 09:17, Lulu Zeng <[email protected]> wrote:
>>>>>>>>>> Dear Nick,
>>>>>>>>>>
>>>>>>>>>> My apologies for the unclear description.
>>>>>>>>>>
>>>>>>>>>> 1. I have 2 variables in Stata, one variable holds the 1200 known,
>>>>>>>>>> discrete values I want to draw; the other holds the corresponding
>>>>>>>>>> probabilities.
>>>>>>>>>>
>>>>>>>>>> 2. The 2 variables are associated with a parameter (attribute) of a
>>>>>>>>>> random utility model. I am trying to draw from the distribution of
>>>>>>>>>> this parameter of interest, and then divide it by the price parameter
>>>>>>>>>> (which similarly has 2 associated variables too) to obtain a
>>>>>>>>>> distribution of willingness to pay.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Best Regards,
>>>>>>>>>> Lulu
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Feb 18, 2014 at 7:47 PM, Nick Cox <[email protected]> wrote:
>>>>>>>>>>> You have not, so far as I can see, specified
>>>>>>>>>>>
>>>>>>>>>>> 1. How you are holding information on your distribution. Is it 1200
>>>>>>>>>>> known values with associated probabilities (so as two variables in
>>>>>>>>>>> Stata), or is the information still outside Stata in some form?
>>>>>>>>>>>
>>>>>>>>>>> 2. What you expect to draw as a sample.
>>>>>>>>>>> Nick
>>>>>>>>>>> [email protected]
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 18 February 2014 03:58, Lulu Zeng <[email protected]> wrote:
>>>>>>>>>>>> Dear Scott,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for your response. My apologies, but I am still a little
>>>>>>>>>>>> confused about how to do this in my case, where I have 1,200
>>>>>>>>>>>> observations. Can I still use the cond() function without typing in
>>>>>>>>>>>> each point of the draw?
>>>>>>>>>>>>
>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>> Lulu
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Feb 18, 2014 at 1:50 PM, Scott Merryman
>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>> http://www.stata.com/statalist/archive/2012-08/msg00256.html
>>>>>>>>>>>>>
>>>>>>>>>>>>> and the links within.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Scott
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Feb 16, 2014 at 9:15 PM, Lulu Zeng <[email protected]> wrote:
>>>>>>>>>>>>>> Dear Statalist,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am seeking help with taking draws from a known, non-regular (not
>>>>>>>>>>>>>> normal or lognormal etc), discrete distribution.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For example, taking draws from a distribution like the one below.
>>>>>>>>>>>>>> However, in my case I have 1,200 points instead of the 4 points given
>>>>>>>>>>>>>> in the example.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Draw value     Probability
>>>>>>>>>>>>>>     0.5            0.15
>>>>>>>>>>>>>>     0.6            0.30
>>>>>>>>>>>>>>     0.2            0.25
>>>>>>>>>>>>>>     0.9            0.30
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The "draw value" is the value to be drawn; "probability" is the chance
>>>>>>>>>>>>>> that each value is drawn, so the probabilities add up to 1.
>>>>>>>>>>>>> *
>>>>>>>>>>>>> *   For searches and help try:
>>>>>>>>>>>>> *   http://www.stata.com/help.cgi?search
>>>>>>>>>>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>>>>>>>>>>> *   http://www.ats.ucla.edu/stat/stata/

