Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Drawing from a known, non-regular, discrete distribution


From   Nick Cox <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: Drawing from a known, non-regular, discrete distribution
Date   Thu, 20 Feb 2014 10:24:38 +0000

It's just subscripting.

sysuse auto
di mpg[1]
list in 1

Subscripts are observation numbers.

You should be familiar with the idea that subscripts can be
expressions. A common example is

gen previous = value[_n-1]

With an expression such as _n - 1, Stata works that out observation by
observation. If _n is 1, _n - 1 is 0, and value[0] is always treated as
missing. More straightforwardly, if _n is 2, _n - 1 is 1, and so
forth.
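
A minimal sketch of that lag, assuming a made-up three-observation
variable called value (the data are illustrative only, not from the
thread):

```stata
clear
input value
10
20
30
end

* value[_n-1] is evaluated separately in each observation
gen previous = value[_n-1]

* observation 1 asks for value[0], which is returned as missing
list
```

Here previous should be missing in observation 1 and hold the previous
observation's value everywhere else.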

An expression can (easily) be a single variable.

gen foo = varname[indices]

just means

foo[1] is varname[indices[1]]
foo[2] is varname[indices[2]]

etc.

Suppose

indices   varname
      3        10
      1        20
      2        30

then if foo is varname[indices], foo[1] is varname[indices[1]], namely
varname[3], namely 30.
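
The three-observation example above can be typed in and checked
directly. This is only a sketch using the same made-up names (indices,
varname, foo):

```stata
clear
input indices varname
3 10
1 20
2 30
end

* each observation looks up varname in the observation named by indices
gen foo = varname[indices]
list
```

Read row by row: foo[1] is varname[3], foo[2] is varname[1], and
foo[3] is varname[2].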

One variable serves as a look-up table. That's another terminology.
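
Putting the pieces together, here is a hedged sketch of the whole
draw-then-look-up sequence. It assumes the four-point distribution
quoted at the start of the thread, held in observations 1 to 4, rather
than the real 1200-point data. The variable names follow the code
quoted below. The seed is an arbitrary choice added only to make the
sketch reproducible.

```stata
clear
input double odo double share
0.5 0.15
0.6 0.30
0.2 0.25
0.9 0.30
end

* room for 1000 draws; odo and share stay in observations 1 to 4
set obs 1000
set seed 2014          // arbitrary seed, an assumption for reproducibility
gen indices = .

mata
share = st_data((1..4)', "share")      // read only the rows holding probabilities
share = share :/ sum(share)            // rescale in case the total is not exactly 1
st_store(., "indices", rdiscrete(st_nobs(), 1, share))
end

* indices holds integers 1 to 4; use them as subscripts into odo
gen double odo2 = odo[indices]
tab odo2
```

The frequencies in the table should be roughly proportional to share,
since each draw picks an observation number and odo[indices] then
fetches the value stored there.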

Nick
[email protected]


On 20 February 2014 10:05, Lulu Zeng <[email protected]> wrote:
> Dear Nick,
>
> Thank you so much for your reply.
>
> The code works and seems to give me the draws I am looking for by
> looking at the range.
>
> But I have trouble understanding the last line of the code (around
> what the square brackets do): gen odo2 = odo[indices]
>
> I understand it generates a new variable using the original value and
> the draws, but not quite sure what it exactly does. I tried to look up
> the function of the square brackets but didn't find anything on the
> internet.
>
> Could you please explain the function of the square brackets please?
>
> Thank you for your consideration.
>
> Best Regards,
> Lulu
>
>
>
>
>
> On Wed, Feb 19, 2014 at 11:48 PM, Nick Cox <[email protected]> wrote:
>> Something like this?
>>
>> gen indices = .
>> mata
>> share = st_data(., "share")
>> share = share :/ sum(share)
>> y = rdiscrete(1000, 1, share)
>> st_store((1..1000)', "indices", y)
>> end
>> gen odo2 = odo[indices]
>> Nick
>> [email protected]
>>
>>
>> On 19 February 2014 09:20, Lulu Zeng <[email protected]> wrote:
>>> Dear Nick and others,
>>>
>>> I have 1200 observations in my dataset.
>>>
>>> 1200 observations (of variable "share") define the probabilities (add
>>> up to 1) & 1200 pre-defined corresponding values to be drawn from
>>> (saved in variable "odo").
>>>
>>> I am thinking of having 1000 draws in my sample.
>>>
>>> My data looks like below (but with more points). Draw value is
>>> pre-defined, each of them has a probability attached.
>>>
>>> Draw value     Probability
>>>     0.5            0.15
>>>     0.6            0.30
>>>     0.2            0.25
>>>     0.9            0.30
>>>
>>> Thank you for your consideration :)
>>>
>>>
>>> Best Regards,
>>> Lulu
>>>
>>> On Wed, Feb 19, 2014 at 7:59 PM, Nick Cox <[email protected]> wrote:
>>>> My own thoughts on "Thanks in advance" are codified in the FAQ.
>>>> Seemingly no-one agrees with me.
>>>>
>>>> I will pose some questions here, but given other commitments I won't
>>>> be able to respond to any answers until _much_ later today, local
>>>> time. If someone else picks this up before then, fine by me,
>>>> naturally!
>>>>
>>>> How many observations are in your dataset?
>>>> How many observations define the probabilities?
>>>> How many values do you want in your sample?
>>>>
>>>> Nick
>>>> [email protected]
>>>>
>>>>
>>>>
>>>> On 19 February 2014 08:51, Lulu Zeng <[email protected]> wrote:
>>>>> Dear Nick,
>>>>>
>>>>> Sorry that the (1..10)' in my example was a typo, I in fact used 1200
>>>>> instead of 10 in my real experiment. It didn't work despite so. I also
>>>>> scaled "share" before calling Mata, same error occurs.
>>>>>
>>>>> Also, by using -rdiscrete()-, I can see it draws a random number
>>>>> according to a distribution specified by "p" (and write the random
>>>>> draws into "odo2" using -st_store()- in my case), but I don't
>>>>> understand how -rdiscrete()- could draw from a given set of values
>>>>> (e.g., a pre-specified "odo2" -- this is really what I'm trying to do)
>>>>> instead of random values.
>>>>>
>>>>> My apologies if the answer to my question is straight forward, I am
>>>>> quite new to Mata.
>>>>>
>>>>> Thank you very much for your help in advance Nick.
>>>>>
>>>>> Best Regards,
>>>>> Lulu
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Feb 19, 2014 at 11:54 AM, Nick Cox <[email protected]> wrote:
>>>>>> In my example, I have 10 probabilities in observations 1 to 10 of the
>>>>>> data, so use
>>>>>> (1..10)' as an argument. That will make sense for you if and only if
>>>>>> your probabilities are  the same. See also help for -st_data()-.
>>>>>> Nick
>>>>>> [email protected]
>>>>>>
>>>>>>
>>>>>> On 19 February 2014 00:09, Lulu Zeng <[email protected]> wrote:
>>>>>>> Dear Nick,
>>>>>>>
>>>>>>> Thank you for your suggestion. I must have done something incorrectly
>>>>>>> so mata still gives me the below error despite I did use -p :/ sum(p)-
>>>>>>> for rescaling as you suggested (I also tried to rescale the original
>>>>>>> probability variable but neither worked):
>>>>>>>
>>>>>>> sum of the probabilities must be 1
>>>>>>>              rdiscrete():  3300  argument out of range
>>>>>>>                  <istmt>:     -  function returned error
>>>>>>> r(3300);
>>>>>>>
>>>>>>>
>>>>>>> My probability variable is "share", and "odo2" is my equivalent of
>>>>>>> your "y". All I did was:
>>>>>>>
>>>>>>> mata
>>>>>>>
>>>>>>> p = st_data((1..10)', "share")
>>>>>>>
>>>>>>> p :/ sum(p)
>>>>>>>
>>>>>>> st_store(., "odo2", rdiscrete(st_nobs(), 1, p))       [this is where
>>>>>>> the error occurs]
>>>>>>>
>>>>>>>
>>>>>>> My apologies for coming back with the same question again.
>>>>>>>
>>>>>>>
>>>>>>> Best Regards,
>>>>>>> Lulu
>>>>>>>
>>>>>>> On Tue, Feb 18, 2014 at 11:37 PM, Nick Cox <[email protected]> wrote:
>>>>>>>> Here is an example of using -rdiscrete()- in Mata. In your case, the
>>>>>>>> probabilities are already in a variable. If -rdiscrete()- chokes on
>>>>>>>> small differences in total from 1, then check the probabilities and if
>>>>>>>> need be scale by -p :/ sum(p)-.
>>>>>>>>
>>>>>>>> . clear
>>>>>>>>
>>>>>>>> . set obs 1000
>>>>>>>> obs was 0, now 1000
>>>>>>>>
>>>>>>>> . mat p = [0.2,0.2,0.1,0.1,0.1,0.1,0.05,0.05,0.05,0.05]
>>>>>>>>
>>>>>>>> . gen double p = p[1,_n]
>>>>>>>> (990 missing values generated)
>>>>>>>>
>>>>>>>> . list in 1/10, sep(0)
>>>>>>>>
>>>>>>>>      +-----+
>>>>>>>>      |   p |
>>>>>>>>      |-----|
>>>>>>>>   1. |  .2 |
>>>>>>>>   2. |  .2 |
>>>>>>>>   3. |  .1 |
>>>>>>>>   4. |  .1 |
>>>>>>>>   5. |  .1 |
>>>>>>>>   6. |  .1 |
>>>>>>>>   7. | .05 |
>>>>>>>>   8. | .05 |
>>>>>>>>   9. | .05 |
>>>>>>>>  10. | .05 |
>>>>>>>>      +-----+
>>>>>>>>
>>>>>>>> . gen y = .
>>>>>>>> (1000 missing values generated)
>>>>>>>>
>>>>>>>> . mata
>>>>>>>> ------------------------------------------------- mata (type end to exit) ------------------
>>>>>>>> : p = st_data((1..10)', "p")
>>>>>>>>
>>>>>>>> : st_store(., "y", rdiscrete(st_nobs(), 1, p))
>>>>>>>>
>>>>>>>> : end
>>>>>>>> --------------------------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> . tab y
>>>>>>>>
>>>>>>>>           y |      Freq.     Percent        Cum.
>>>>>>>> ------------+-----------------------------------
>>>>>>>>           1 |        202       20.20       20.20
>>>>>>>>           2 |        200       20.00       40.20
>>>>>>>>           3 |         98        9.80       50.00
>>>>>>>>           4 |        102       10.20       60.20
>>>>>>>>           5 |         87        8.70       68.90
>>>>>>>>           6 |         99        9.90       78.80
>>>>>>>>           7 |         49        4.90       83.70
>>>>>>>>           8 |         54        5.40       89.10
>>>>>>>>           9 |         53        5.30       94.40
>>>>>>>>          10 |         56        5.60      100.00
>>>>>>>> ------------+-----------------------------------
>>>>>>>>       Total |      1,000      100.00
>>>>>>>> Nick
>>>>>>>> [email protected]
>>>>>>>>
>>>>>>>>
>>>>>>>> On 18 February 2014 09:35, Nick Cox <[email protected]> wrote:
>>>>>>>>> The "mapping" (if I am guessing correctly) is in fact trivial as in
>>>>>>>>> effect your sample would just be the observation numbers.
>>>>>>>>> Nick
>>>>>>>>> [email protected]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 18 February 2014 09:32, Nick Cox <[email protected]> wrote:
>>>>>>>>>> Thanks for the details.
>>>>>>>>>>
>>>>>>>>>> The Mata function -rdiscrete()- should do most of what you want. You
>>>>>>>>>> will need to map your values to integers 1 up and then read in the
>>>>>>>>>> probabilities so that they are copied from a variable to a vector in
>>>>>>>>>> Mata. Then select integers and reverse the mapping.
>>>>>>>>>>
>>>>>>>>>> Nick
>>>>>>>>>> [email protected]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 18 February 2014 09:17, Lulu Zeng <[email protected]> wrote:
>>>>>>>>>>> Dear Nick,
>>>>>>>>>>>
>>>>>>>>>>> My apologies for the unclear description.
>>>>>>>>>>>
>>>>>>>>>>> 1. I have 2 variables in Stata, one variable holds the 1200 known,
>>>>>>>>>>> discrete values I want to draw; the other holds the corresponding
>>>>>>>>>>> probabilities.
>>>>>>>>>>>
>>>>>>>>>>> 2. The 2 variables are associated with a parameter (attribute) of a
>>>>>>>>>>> random utility model. I am trying to draw from the distribution of
>>>>>>>>>>> this parameter of interest, and then divide it by the price parameter
>>>>>>>>>>> (which similarly has 2 associated variables too) to obtain a
>>>>>>>>>>> distribution of willingness to pay.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Best Regards,
>>>>>>>>>>> Lulu
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Feb 18, 2014 at 7:47 PM, Nick Cox <[email protected]> wrote:
>>>>>>>>>>>> You have not, so far as I can see, specified
>>>>>>>>>>>>
>>>>>>>>>>>> 1. How you are holding information on your distribution. Is it 1200
>>>>>>>>>>>> known values with associated probabilities (so as two variables in
>>>>>>>>>>>> Stata), or is the information still outside Stata in some form?
>>>>>>>>>>>>
>>>>>>>>>>>> 2. What you expect to draw as a sample.
>>>>>>>>>>>> Nick
>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 18 February 2014 03:58, Lulu Zeng <[email protected]> wrote:
>>>>>>>>>>>>> Dear Scott,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you for your response. My apologies that I am still a little
>>>>>>>>>>>>> confused about how to do this in my case where I have 1,200
>>>>>>>>>>>>> observation. Can I still use the cond() command without typing in each
>>>>>>>>>>>>> point of the draw?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>> Lulu
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Feb 18, 2014 at 1:50 PM, Scott Merryman
>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>> http://www.stata.com/statalist/archive/2012-08/msg00256.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> and the links within.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Scott
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Feb 16, 2014 at 9:15 PM, Lulu Zeng <[email protected]> wrote:
>>>>>>>>>>>>>>> Dear Statalist,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am seeking help with taking draws from a known, non-regular (not
>>>>>>>>>>>>>>> normal or lognormal etc), discrete distribution.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For example, taking draws from a distribution like the one below.
>>>>>>>>>>>>>>> However, in my case I have 1,200 points instead of the 4 points given
>>>>>>>>>>>>>>> in the example.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Draw value     Probability
>>>>>>>>>>>>>>>     0.5            0.15
>>>>>>>>>>>>>>>     0.6            0.30
>>>>>>>>>>>>>>>     0.2            0.25
>>>>>>>>>>>>>>>     0.9            0.30
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The "draw value" is the value to be drawn, "probability" is the chance
>>>>>>>>>>>>>>> each value be drawn, so it adds up to 1.
>>>>>>>>>>>>>> *
>>>>>>>>>>>>>> *   For searches and help try:
>>>>>>>>>>>>>> *   http://www.stata.com/help.cgi?search
>>>>>>>>>>>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>>>>>>>>>>>> *   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC