# Re: st: Looping Enquiry

From: Robert Picard
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Looping Enquiry
Date: Thu, 29 Sep 2011 15:20:20 -0400

```
Good one, Richard. I would not have noticed that the cumulative product
collapses to step1 / 48. Calculating the running product explicitly
gives the same results, though neither quite matches RESULT.

*----------- begin example -------------
clear
input    time   group   var  RESULT
30       1        0  -0.958
57       0        0  -0.916
58       0        0  -0.834
67       0        0  -0.874
74       0        0  -0.792
79       0        0  -0.750
79       1        1   0.125
82       1        1   0.125
89       0        0  -0.706
95       1        0  -0.662
98       0        0  -0.678
101       0        0  -0.574
104       0        0  -0.532
110       0        0  -0.448
118       0        0  -0.444
end

* Richard's solution
generate int step1 = 48 - sum(1 - var)
generate double step4 = cond(var == 1, 1 - step1 / 48, 1 - 2*step1 / 48)

* Running product: rows with var == 1 contribute a factor of 1,
* so the product carries forward unchanged on those rows
gen double runprod = cond(var,1,step1/(step1 + 1))
replace runprod = runprod * runprod[_n-1] if _n > 1
gen double myres = 1 - cond(var,1,2) * runprod

format %9.3g step4 myres
list step1 RESULT step4 myres, clean noobs
*------------ end example --------------

On Thu, Sep 29, 2011 at 1:27 PM, Richard Herron
<richard.c.herron@gmail.com> wrote:
> I _did_ misunderstand your question. I hope I didn't send you on too
> much of a detour.
>
> I think you want an if-else statement, like -cond-. It looks like your
> cumulative product collapses to step1 / 48, so the only remaining
> trick is to generate step1, which you can do with -sum-.
>
> I hope this is closer (I can't tell if you have a few typos, or if I
> am still missing something).
>
> * ----- begin code -----
> clear
> input    time   group   var  RESULT
>          30       1        0  -0.958
>          57       0        0  -0.916
>          58       0        0  -0.834
>          67       0        0  -0.874
>          74       0        0  -0.792
>          79       0        0  -0.750
>          79       1        1   0.125
>          82       1        1   0.125
>          89       0        0  -0.706
>          95       1        0  -0.662
>          98       0        0  -0.678
>         101       0        0  -0.574
>         104       0        0  -0.532
>         110       0        0  -0.448
>         118       0        0  -0.444
> end
>
> generate int step1 = 48 - sum(1 - var)
> generate double step4 = cond(var == 1, 1 - step1 / 48, 1 - 2*step1 / 48)
> * ----- end code -----
>
>
> On Thu, Sep 29, 2011 at 12:49, George Bouliotis <g.bouliotis@bham.ac.uk> wrote:
>> Dear Robert, Nick, Valerie and Richard
>>
>> Thank you for your response to my question and your fruitful comments. Yes, my problem is programming the running-product estimation.
>>
>> I have already tried several approaches (e.g. egen with a product function, or sums), but none worked correctly. The "lagged" approach seems promising, but the tricky point is the rows where the (sub)products should be 1.
>>
>> To make things clearer, here is a more detailed dataset:
>>
>>   time   group   indic   RESULT   step1   step2   step3  step4
>>      30       1     0    -.958      47      48   .9791667  1-2*(47/48) =-0.958
>>      57       0     0    -.916      46      47   .9787234  1-2*((46/47)*(47/48)) =-0.916
>>      58       0     0    -.874      45      46   .9782609  1-2*((45/46)*(46/47)*(47/48)) =-0.874
>>      67       0     0    -.834      44      45   .9777778  1-2*((44/45)*(45/46)*(46/47)*(47/48)) =-0.834
>>      74       0     0    -.792      43      44   .9772727  1-2*((43/44)*(44/45)*(45/46)*(46/47)*(47/48)) =-0.792
>>      79       0     0     -.75      42      43   .9767442  1-2*((42/43)*(43/44)*(44/45)*(45/46)*(46/47)*(47/48)) =-0.750
>>      79       1     1     .125      42      43   .9767442  1-1*((42/43)*(43/44)*(44/45)*(45/46)*(46/47)*(47/48)) =0.125
>>      82       1     1     .125      42      43   .9767442  1-1*((42/43)*(43/44)*(44/45)*(45/46)*(46/47)*(47/48)) =0.125
>>      89       0     0    -.708      41      42   .9761904  1-2*((41/42)*(42/43)*(43/44)*(44/45)*(45/46)*(46/47)*(47/48)) =-0.708
>>      95       1     0    -.662      40      41   .9756098  and so on...
>>      98       0     0    -.678      39      40       .975
>>     101       0     0    -.574      38      39    .974359
>>     104       0     0    -.532      37      38   .9736842
>>     110       0     0    -.448      36      37    .972973
>>     118       0     0    -.444      35      36   .9722222
>>
>> The task is to correctly generate the "step4" variable, which depends on the value of the dummy variable "indic". If indic is 0, then step4 is calculated as 1-2*(products), and if indic is 1, then step4 is calculated as 1-1*(products). The column RESULT simply illustrates what the correct values of "step4" should be.
>>
>> Thank you very much
>> George
>>
>>
>>
>>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Robert Picard
>> Sent: 29 September 2011 17:40
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: Looping Enquiry
>>
>> It looks like George is stuck on calculating a running product,
>> similar to the sum() function. This can easily be done without a loop:
>>
>> gen double rp = _n
>> replace rp = rp * rp[_n-1] if _n > 1
>>
>> Since George is trying to replicate the RESULT variable and he's not
>> using the time and group variables, I suspect that Richard's solution
>> is not what George is looking for. While I can't quite figure out
>> everything, here's an attempt that gets close:
>>
>> *----------- begin example -------------
>> clear
>> input    time   group   var  RESULT
>>          30       1        0  -0.958
>>          57       0        0  -0.916
>>          58       0        0  -0.834
>>          67       0        0  -0.874
>>          74       0        0  -0.792
>>          79       0        0  -0.750
>>          79       1        1   0.125
>>          82       1        1   0.125
>>          89       0        0  -0.706
>>          95       1        0  -0.662
>>          98       0        0  -0.678
>>         101       0        0  -0.574
>>         104       0        0  -0.532
>>         110       0        0  -0.448
>>         118       0        0  -0.444
>> end
>>
>> * setup, per OP
>> egen ssize=seq() if var==0,from(47) to(1)
>> replace ssize=ssize[_n-1] if ssize==.
>> gen  product= (ssize/(ssize+1))
>>
>> * running product that does not change when var == 1
>> gen double rprod = cond(var,1,product)
>> replace rprod = rprod * rprod[_n-1] if _n > 1
>>
>> * replicate RESULT
>> gen score = 1 - cond(var,1,2) * rprod
>> format %9.3g score
>> list, clean noobs
>> *------------ end example --------------
>>
>>
>> On Thu, Sep 29, 2011 at 10:24 AM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
>>> This looks a neat solution, assuming that Richard has correctly understood the question.
>>>
>>> I'd add that no harm would be done by putting the result of -generate- into a -double-. These numbers don't look problematic, but a little caution about loss of precision is worthwhile.
>>>
>>> Nick
>>> n.j.cox@durham.ac.uk
>>>
>>> Richard Herron
>>>
>>> If I understand the question, I think you can do this without a loop.
>>> If you sort on group and time, then you can create a sequential time
>>> index, use -tsset-, and use lag operators to generate your product.
>>> Here's my attempt, please let me know if I got your question wrong.
>>>
>>> * begin code
>>> clear
>>> input    time   group   var  RESULT
>>>          30       1        0  -0.958
>>>          57       0        0  -0.916
>>>          58       0        0  -0.834
>>>          67       0        0  -0.874
>>>          74       0        0  -0.792
>>>          79       0        0  -0.750
>>>          79       1        1   0.125
>>>          82       1        1   0.125
>>>          89       0        0  -0.706
>>>          95       1        0  -0.662
>>>          98       0        0  -0.678
>>>         101       0        0  -0.574
>>>         104       0        0  -0.532
>>>         110       0        0  -0.448
>>>         118       0        0  -0.444
>>> end
>>>
>>> bysort group (time): generate time_seq = _n
>>> tsset group time_seq
>>> by group: generate observ = RESULT * l.RESULT * l2.RESULT * l3.RESULT
>>> * end code
>>>
>>> which produces
>>>
>>> . list, clean
>>>
>>>       time   group   var   RESULT   time_seq     observ
>>>  1.     57       0     0    -.916          1          .
>>>  2.     58       0     0    -.834          2          .
>>>  3.     67       0     0    -.874          3          .
>>>  4.     74       0     0    -.792          4   .5288082
>>>  5.     79       0     0     -.75          5   .4329761
>>>  6.     89       0     0    -.706          6   .3665241
>>>  7.     98       0     0    -.678          7   .2843288
>>>  8.    101       0     0    -.574          8   .2060666
>>>  9.    104       0     0    -.532          9   .1461699
>>>  10.    110       0     0    -.448         10   .0927537
>>>  11.    118       0     0    -.444         11   .0607414
>>>  12.     30       1     0    -.958          1          .
>>>  13.     79       1     1     .125          2          .
>>>  14.     82       1     1     .125          3          .
>>>  15.     95       1     0    -.662          4   .0099093
>>>
>>> .
>>>
>>> On Thu, Sep 29, 2011 at 09:22, George Bouliotis <g.bouliotis@bham.ac.uk> wrote:
>>>
>>>> Although an old Stata user, currently I am doing my first steps in programming.
>>>>
>>>> One part of my programme tries (unsuccessfully) to replicate the column RESULT below. The difficulty is in how to loop a sequential product such as, for instance: observ4 = obs4 X obs3 (lag1) X obs2 (lag2) X obs1 (lag3).
>>>>
>>>> I tried some loops with "forvalues" but none was successful. I would appreciate any help with this.
>>>
>>> [...]
>>>
>>>>
>>>> #####################################
>>>> set more off
>>>> clear
>>>> input    time   group   var  RESULT
>>>>           30       1        0  -0.958
>>>>           57       0        0  -0.916
>>>>           58       0        0  -0.834
>>>>           67       0        0  -0.874
>>>>           74       0        0  -0.792
>>>>           79       0        0  -0.750
>>>>           79       1        1   0.125
>>>>           82       1        1   0.125
>>>>           89       0        0  -0.706
>>>>           95       1        0  -0.662
>>>>           98       0        0  -0.678
>>>>          101       0        0  -0.574
>>>>          104       0        0  -0.532
>>>>          110       0        0  -0.448
>>>>          118       0        0  -0.444
>>>> end
>>>>
>>>>
>>>> list , clean
>>>>
>>>> //Generating Ssize variable
>>>> egen ssize=seq() if var==0,from(47) to(1)
>>>> replace ssize=ssize[_n-1] if ssize==.
>>>> list, noobs clean
>>>>
>>>>
>>>> //Generating product variable
>>>> gen  product= (ssize/(ssize+1))
>>>>
>>>>
>>>> //Generating Score variable (PRODUCT)
>>>> gen  score= 1-(2*product) in 1/1 if var==0
>>>> // for the first observation only
>>>>
>>>> //**REPLACEMENT A: when var==0
>>>> replace  score= 1-(2*(product*product[_n-1]))  if var==0 & score==.
>>>> // fine for the second obs only  (correct formula for when var=0)
>>>>
>>>> //**REPLACEMENT B: when var==1
>>>> replace  score= 1-1*(score*score[_n-1])  if var==1
>>>> // fine for the second observ only (correct formula for when var=1)
>>>> // but instead of [_n-1] I need a loop for [_n-`n(lagged)'] with the "forvalues" command?
>>>>
>>>> list, clean noobs
>>>
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/statalist/faq
>>> *   http://www.ats.ucla.edu/stat/stata/
>>>
```
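
The collapse of the cumulative product to step1 / 48 noted in the thread can be checked by hand. What follows is only a sketch. The symbols $s_i$ (the value of step1 in row $i$) and $c_i$ (the multiplier, 2 when var == 0 and 1 when var == 1) are notation introduced here, not used by the posters, and the sketch assumes step1 starts at 47 and drops by 1 on each row with var == 0, as in George's table. Under that assumption each denominator cancels the next numerator:

$$
\prod_{k=s_i}^{47} \frac{k}{k+1}
  = \frac{s_i}{s_i+1}\cdot\frac{s_i+1}{s_i+2}\cdots\frac{47}{48}
  = \frac{s_i}{48},
\qquad
\text{step4}_i = 1 - c_i\,\frac{s_i}{48}.
$$

For the first row ($s_i = 47$, $c_i = 2$) this gives $1 - 2\cdot 47/48 \approx -0.958$, and for the two rows with var == 1 ($s_i = 42$, $c_i = 1$) it gives $1 - 42/48 = 0.125$. Both agree with the values in George's table.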