Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: st: RE: using loop to generate distributions with different means and standard deviations


From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   RE: st: RE: using loop to generate distributions with different means and standard deviations
Date   Sun, 22 May 2011 12:41:42 +0100

Your earlier emails left it fairly unclear to me what you were trying to do, as Sarah also pointed out. Now that you have a solution, I can see how to improve it or do it differently.

1. You are using -drawnorm- to produce a single variable which is normally distributed. That clearly works fine, but you don't need the generality of -drawnorm-. You could do it directly using -rnormal()-. 
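A minimal sketch of the direct route, assuming an empty dataset and a Stata that provides -rnormal()-. The number of observations, the seed, and the mean and sd are made up for illustration:

```stata
clear
set obs 1000
set seed 12345
* one normally distributed variable with mean 90 and sd 5; no matrices involved
generate double x = rnormal(90, 5)
summarize x
```

The two arguments of -rnormal()- are the mean and the standard deviation, in that order.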

2. -mkmat- to get a single-element matrix out of a single observation in a single variable similarly is more general than you need. You could just go (e.g.) 

matrix m = mean_item_1[1] 

3. Even with -drawnorm- you don't need to put single numbers into a matrix; you can call -drawnorm- with constants. 

drawnorm foo, mean(0) sd(1)

4. ... if `i' == _n 

is better as 

... in `i' 

The first obliges Stata to look at each observation. Suppose `i' evaluates to 1. Then Stata goes 

Obs 1: _n is 1: is `i' equal to 1: yes: apply command to this observation 
Obs 2: _n is 2: is `i' equal to 2: no: do not apply command to this observation

And so on through the data. When you know that only a single observation will be used, -in- is the way to do that directly. 
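A small sketch of the difference, using a made-up five-observation dataset and a hypothetical variable y. Both -replace- commands change only observation 3, but the first tests the condition in every observation and the second goes straight to the one it needs:

```stata
clear
set obs 5
generate y = .
local i 3
* -if-: the condition is evaluated for each of the 5 observations
replace y = 1 if _n == `i'
* -in-: only observation 3 is visited
replace y = 2 in `i'
list
```

With a handful of observations the difference is invisible; it starts to matter when the dataset is large or the command sits inside a long loop.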

5. Your code could just be 

forvalues i=1/3 {
	local mean = mean_item_`i'[`i'] 
	local sd = sd_item_`i'[`i'] 
	drawnorm product`i'_dist, m(`mean') sd(`sd')
}

Or indeed 

forvalues i=1/3 {
	drawnorm product`i'_dist, m(`= mean_item_`i'[`i']') sd(`= sd_item_`i'[`i']')
}

-- although I am happy to acknowledge that the syntax of the last looks unfriendly. 

6. For just three result variables, another way to do it without the intermediary of mean and sd variables is 

local means 3.14159 2.71828 42
local sds   1       2       3 

forval i = 1/3 { 
	local mean : word `i' of `means' 
	local sd : word `i' of `sds'
	gen product`i'_dist = rnormal(`mean', `sd') 
}
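
If the means and sds do live in variables, points 1 and 5 can be combined. The following is only a sketch: it builds toy stand-ins for mean_item_* and sd_item_* with made-up values, and it assumes, as your -mkmat- code implies, that the value for item i sits in observation i.

```stata
clear
set obs 100
set seed 2011
* toy stand-ins for the mean and sd variables: item i's value is in observation i
forvalues i = 1/3 {
	generate mean_item_`i' = 10 * `i' in `i'
	generate sd_item_`i' = `i' in `i'
}
forvalues i = 1/3 {
	* explicit subscripts pick the mean and sd out of observation `i'
	generate product`i'_dist = rnormal(mean_item_`i'[`i'], sd_item_`i'[`i'])
}
summarize product*_dist
```

Each new variable is drawn afresh in every observation; only the mean and sd are held fixed by the subscripts.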

Nick 
[email protected] 

Lance Wiggains

I was able to piece the code together. It is true that when generating
multiple drawnorms in a loop the means and standard deviations have to
be specified in matrix form. This is the code that I found to work.

forvalues i=1/3 {
	mkmat mean_item_`i' if `i'==_n, matrix(m)
	mkmat sd_item_`i' if `i'==_n, matrix(sd)
	drawnorm product`i'_dist, m(m) sd(sd)
}

On Fri, May 20, 2011 at 11:47 PM, Lance Wiggains

> I tried the command you suggested
>
> ds mean* sd*
> forvalues i=1/3 {
>       drawnorm product`i'_dist, m(mean_item_`i') sd(sd_item_`i')
> }
>
> but it failed, I got this error
>
> means(mean_item_1) invalid
> r(198);
>
> end of do-file
>
> r(198);
>
> I think the variable might need to be in matrix form. Does that sound
> right? If so, do you know how I would create the vector with the
> relevant information?
>
>
> On Fri, May 20, 2011 at 2:44 PM, Sarah Edgington <[email protected]> wrote:

>> Is your actual data really different from the simulated data you've shown?
>> If not I don't understand why the solution I suggested before doesn't solve
>> your problem. If your variables are actually numbered like you've shown,
>> it's still just a -forvalues- loop to get the drawnorm part working.
>>
>> forv i=1/3 {
>>        drawnorm product`i', m(max_product`i') sd(sd_product`i')
>> }
>>
>> Unless you give more specific information about the actual problem you're
>> trying to solve and why the suggested solution doesn't work, I don't think
>> you're going to get much help.

Lance Wiggains

>> Sorry for the vagueness. Right now I'm just using simulated data for 3
>> different products. Here is my code:
>> My data looks like this
>> Week   Product 1   Product 2   Product 3
>> 1             50          45          50
>> 2             60          50          40
>> 3             70          55          30
>> 4             80          50          20
>> 5             90          45          10
>> 6            100          50           0
>>
>> tsset week
>> gen n=_n
>> egen max_n=max(n)
>>
>> ds week n max_n, not
>> foreach var in `r(varlist)'{
>>       tssmooth ma ms_`var'= `var', weights(1 1<2>1)
>>       }
>>
>> ds ms*
>> foreach var in `r(varlist)' {
>>       gen week3_`var'=`var' if n==max_n
>>       egen max_week3_`var'=max(week3_`var')
>>       drop week3*
>>
>> }
>> drop ms*
>>
>> ds week n max_*, not
>> foreach var in `r(varlist)' {
>>       gen max_`var'=max_week3_ms_`var'
>> }
>> drop max_week*
>>
>> keep if n+3>=max_n
>> ds week n max*, not
>> foreach var in `r(varlist)'{
>>       egen sd_`var'=sd(`var')
>> }
>>
>> rename max_n maximum_n
>>
>> ds max_* sd* week, not
>> foreach var in `r(varlist)'{
>>       drop `var'
>> }
>>
>> drawnorm product1, m(max_product1) sd(sd_product1)


On Wed, May 18, 2011 at 1:51 PM, Sarah Edgington <[email protected]> wrote:

>>> That's a different problem.  From your original post I assumed you had
>>> all the variables already created.
>>> One strategy for writing loops is to write out the code for the first
>>> two examples of something repetitive you want to do.  Then identify
>>> the parts of the example that remain the same across the examples.
>>> If you post the code you're trying to repeat we may be able to help you
>>> but your current description is too vague for me to do much more than
>>> offer vague suggestions of how to think about loops.

Lance Wiggains

>>> I've tried that but the problem is that I'm pre-calculating the means
>>> and sd's for the variable because I'm only using the last 3-4
>>> observations for each variable to calculate those values. I'm doing
>>> this because I want it to reflect the changes that happen recently. My
>>> mean function uses tssmooth, with weights (1 1<2>), to average the
>>> last 3 weeks of sales. So if sales were 70,80,90, and 100 I get a
>>> value of 92.5 for my mean. It also calculates a SD for the last 3-4
>>> observations. Then I want to plug those numbers into the drawnorm function
>>> using a loop. Any idea about how that would work?


On Wed, May 18, 2011 at 1:16 PM, Sarah Edgington <[email protected]> wrote:

>>>> Try something like this:
>>>>
>>>>        forv i=1/3 {
>>>>                 drawnorm name`i', m(mean_var`i') sd(sd_var`i')
>>>>        }
>>>>
>>>> You'll run into problems, though, if your data actually includes the
>>>> variable names you list since there isn't a sd_var1.

Lance Wiggains

>>>> I'm trying to get Stata to generate a distribution of data from
>>>> variables in my data set.
>>>>
>>>> My appended data looks like this
>>>> mean_var1=90
>>>> standard_deviation_var1=5
>>>> mean_var2=100
>>>> sd_var2=10
>>>> mean_var3=110
>>>> sd_var3=15
>>>> and so on
>>>>
>>>> I need a loop that will take my variables and create the
>>>> distributions for me.
>>>> I've been using the drawnorm command
>>>>     drawnorm name1, m(mean_var1) sd(sd_var1) but I can't get it to
>>>> recognize more than 1 variable at a time
>>>>
>>>> I want it to perform the distribution command for each pair of my
>>>> variables.
>>>> I.e. (m_var1, sd_var1), (m_var2, sd_var2) , (m_var3, sd_var3)


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

