Re: st: Looping across observations (forwards and backwards)


From   Pedro Nakashima <nakashimapedro@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Looping across observations (forwards and backwards)
Date   Wed, 9 Nov 2011 09:53:06 -0200

Actually, I did try.

In this case, bysort operations within groups seem too inflexible.

I created a counter (in percentage) and it's showing me that the code
is actually doing its job (so it is not stuck in an infinite loop),
although not very fast.

Later I will post comments about the results.

Best regards,
Pedro Nakashima.

2011/11/8 Nick Cox <njcoxstata@gmail.com>:
> Sorry, but I am going to back off from this. I've tried and failed to
> understand this twice, and I don't have the inclination to try again.
> Also, it does not seem that you have tried all my suggestions,
> although they were only guesses, so I don't feel obliged to try again.
>
> The real question for you is whether there is a completely different
> way for you to explain all this. These rules come from somewhere and
> it's possibly a context someone will recognise if you explain it
> afresh. But pushing harder at the same shut door is unlikely to get
> results. It would be nice if I were wrong about that.
>
> Nick
>
> On Tue, Nov 8, 2011 at 9:47 PM, Pedro Nakashima
> <nakashimapedro@gmail.com> wrote:
>> Thanks Nick, but it didn't work.
>>
>> Below I put a larger sample, code that worked (for this small
>> sample) and, at the end, a description of what I want to do.
>>
>> clear
>> input v269 v270 v271 ordem novaordem sinal
>>         1        1986          10          96         -96           .
>>         1        1988          50         148        -148           .
>>         1        1986         100         187        -187           .
>>         1        1986         100         513        -513           .
>>         1        1985          20         743        -743           .
>>         1        1985          40         944        -944           .
>>         1        1985          40         945        -945           .
>>         1        1988         100         954        -954           .
>>         2        1985          40         966        -966           1
>>         1        1986          40         971        -971           .
>>         1        1986          40         992        -992           .
>>         2        1985          20        1001       -1001           1
>>         0        1985          20        1019       -1019           .
>>         2        1985          20        1026       -1026          -1
>>         0        1985          40        1032       -1032           .
>>         1        1986         100        1034       -1034           .
>>         0        1985          40        1035       -1035           .
>>         0        1985          40        1045       -1045           .
>>         2        1986          10        1053       -1053           1
>>         0        1986          40        1054       -1054           .
>>         2        1986         100        1056       -1056           1
>>         2        1986          40        1062       -1062          -1
>>         2        1985          20        1064       -1064          -1
>>         2        1985          40        1065       -1065          -1
>>         1        1986          45        1068       -1068           .
>>         2        1986          45        1070       -1070           1
>>         2        1986         100        1074       -1074           1
>>         2        1988          10        1079       -1079           0
>>         2        1988         100        1081       -1081           1
>>         2        1988          50        1088       -1088           1
>>         0        1988          50        1091       -1091           .
>>         0        1988          50        1093       -1093           .
>>         2        1988          70        1094       -1094           0
>>         0        1988          50        1098       -1098           .
>>         2        1988          50        1099       -1099          -1
>>         0        1988          10        1102       -1102           .
>>         2        1988          10        1103       -1103          -1
>>         0        1988          50        1104       -1104           .
>>         2        1988          10        1105       -1105          -1
>>         2        1988          10        1107       -1107          -1
>>         2        1988          10        1110       -1110          -1
>>         0        1988          50        1113       -1113           .
>>         2        1988          50        1115       -1115          -1
>>         2        1988          10        1116       -1116          -1
>>         2        1988          10        1118       -1118          -1
>>         0        1988          10        1119       -1119           .
>>         2        1988          10        1120       -1120          -1
>>         0        1986          40        1124       -1124           .
>>         2        1986          10        1127       -1127           1
>>         2        1986          10        1131       -1131           1
>>         2        1986          10        1135       -1135           1
>> end
>> sort ordem   // the sample has no time variable; ordem holds the original order
>> capture drop orde* sina*
>> gen ordem = _n
>> gen ordemnova = -_n
>> sort ordemnova
>> gen sinal2=.
>>
>> forvalues i=1/`=_N' {
>>        if v269[`i']==2 {
>>                local pr = v270[`i']
>>                local qt = v271[`i']
>>                local j=`i'+1
>>                while ((v269[`j']==2) | (v270[`j']!=`pr' | v271[`j']!=`qt')) & (`j'<=`=_N') {
>>                        local ++j
>>                }
>>                if v269[`j']==0 {
>>                        local ordem = -1
>>                }
>>                else if v269[`j']==1 {
>>                        local ordem = 1
>>                }
>>                else {
>>                         local ordem = 0
>>                }
>>                quietly replace sinal2 = `ordem' in `i'
>>        }
>> }
>> sort ordem
>>
>> Description:
>> 1) The variable "sinal2" replicates the desired "sinal".
>> 2) The first observation in which v269==2 has the pair v270==1985 and v271==40.
>> I want to put one of the 3 numbers (-1, 0 or 1) in the variable "sinal".
>> What decides which one is the value of v269 in another observation: the
>> one that has the same values (v270==1985 and v271==40).
>> 3) To do that, I search backwards (in observations) for the pair
>> v270==1985 and v271==40, skipping observations that, even though they
>> have the same pair v270, v271, also have v269==2. In short, I want
>> the first observation that I find when looking backwards,
>> starting from an observation in which v269==2, that has either v269==0
>> or v269==1.
>> 4) For the first case in which v269==2 occurs, the loop goes
>> backwards 2 observations (2 observations before, we have v269==1,
>> v270==1985 and v271==40). Seeing this v269==1, I store the value +1 in
>> the local macro "ordem" and then put it in the variable sinal.
>>     For the second case in which v269==2 occurs, the loop goes
>> backwards 7 observations.
>>     For the third case, the loop goes backwards 1 observation.
>> And so on.
>>
>> The problem is that when I run this code on a dta-file that has
>> 920,000 lines, time goes by and it seems the task will never end.
>> I don't think that is normal.
>>
>> I wonder whether code without loops, like your first attempt, would be able
>> to do what I described, given that it's perfectly possible 1) that we
>> have consecutive observations with v269==2 and 2) that the ranges over which
>> the macro j is incremented overlap among v269==2 observations.
>>
>> I would be grateful if someone could think through this problem with me. It
>> might also be useful for other people.
>>
>> Best,
>> Pedro.
>>
>> 2011/10/4 Nick Cox <njcoxstata@gmail.com>:
>>> I have looked at this again. I am still not sure what you are trying
>>> to do here, but this reproduces your first example:
>>>
>>> clear all
>>> input v_269 v_270 v_271 desired_sinalt
>>> 0 1.4 100 .
>>> 1 1.5 100 .
>>> 0 1.5 95 .
>>> 0 1.4 100 .
>>> 2 1.5 100 1
>>> 1 1.7 98 .
>>> 0 1.2 99 .
>>> 2 1.5 95 -1
>>> 0 1.8 101 .
>>> end
>>> gen long order = _n
>>> gen start = v_269 == 2
>>> gen block = sum(start)
>>> bysort block (order) : ///
>>>        gen match = sum(v_270 == v_270[1] | v_271 == v_271[1])
>>> by block : ///
>>>        replace match = sum(cond(inlist(v_269, 1, 0), v_269  * (match == 1),.))
>>> by block : replace match = match[_N]
>>> by block : gen sinalt = cond(match == 1, 1, cond(match == 0, -1, .)) if block
>>>
>>>
>>>
>>>
>>> On Tue, Oct 4, 2011 at 3:32 PM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
>>>> I don't fully understand what you are trying to do here, but
>>>>
>>>> local ++j
>>>>
>>>> need not stop before
>>>>
>>>> v_270[`j']==v_270[`i'] | v_271[`j']==v_271[`i']
>>>>
>>>> and perhaps that is not guaranteed for all values of 2.
>>>>
>>>> so perhaps you need another condition to stop it, say that the next value of v_269 is 2.
>>>>
>>>> I think you need another approach. Evidently blocks start with some key values and then you count something within blocks. A few fragmentary suggestions
>>>>
>>>> gen start = v269 == 2
>>>> gen block = sum(start)
>>>> egen start_v269 = total(start * v269), by(block)
>>>> egen start_v270 = total(start * v270), by(block)
>>>> egen start_v271 = total(start * v271), by(block)
>>>>
>>>>
>>>>
>>>> Nick
>>>> n.j.cox@durham.ac.uk
>>>>
>>>> -----Original Message-----
>>>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Pedro Nakashima
>>>> Sent: 03 October 2011 20:39
>>>> To: statalist@hsphsun2.harvard.edu
>>>> Subject: Re: st: Looping across observations (forwards and backwards)
>>>>
>>>> Thanks, Nick
>>>>
>>>> When I applied your tip to the code:
>>>>
>>>> clear all
>>>> input v_269 v_270 v_271 desired_sinalt
>>>> 0 1.4 100 .
>>>> 1 1.5 100 .
>>>> 0 1.5 95 .
>>>> 0 1.4 100 .
>>>> 2 1.5 100 1
>>>> 1 1.7 98 .
>>>> 0 1.2 99 .
>>>> 2 1.5 95 -1
>>>> 0 1.8 101 .
>>>> end
>>>> gen order = _n
>>>> gen neworder=-_n
>>>> sort neworder
>>>> gen sinalt=.
>>>> set trace on
>>>> forvalues i=1/`=_N' {
>>>>        if v_269[`i']==2{
>>>>                local j=`i'+1
>>>>                while (v_270[`j']!=v_270[`i'] | v_271[`j']!=v_271[`i']) {
>>>>                        local ++j
>>>>                        }
>>>>                if v_270[`j']==v_270[`i'] | v_271[`j']==v_271[`i'] {
>>>>                        if v_269[`j']==1{
>>>>                                local sinal=1
>>>>                                }
>>>>                        else if  v_269[`j']==0 {
>>>>                                local sinal=-1
>>>>                                }
>>>>                        else {
>>>>                                local sinal=.
>>>>                                }
>>>>                }
>>>>                replace sinalt=`sinal' in `i'
>>>>        }
>>>> }
>>>> set trace off
>>>> sort order
>>>>
>>>> it worked.
>>>>
>>>> But if I replace the third observation as follows:
>>>> replace v_269 = 2 in 3
>>>> replace v_271 = 100 in 3
>>>>
>>>> the loop never ends.
>>>>
>>>> Also, It's important to say that if the criterion matches v_269 and
>>>> v_271 in observation number 3 (where v_269==2), as in the above
>>>> example, I want to ignore it.
>>>>
>>>> Thanks in advance for the help.
>>>>
>>>> Best regards
>>>> Pedro Nakashima.
>>>>
>>>> 2011/9/24 Nick Cox <njcoxstata@gmail.com>:
>>>>> A different comment is that it is much easier to go forwards in Stata
>>>>> than backwards. So, reversing the whole dataset, and defining spells
>>>>> "started" in a certain way might be easier. When all is done you
>>>>> reverse it again.
>>>>>
>>>>> Reversing is easy
>>>>>
>>>>> gen neworder = -_n
>>>>> sort neworder
>>>>>
>>>>> On Sat, Sep 24, 2011 at 4:07 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>>> When your program gets to
>>>>>>
>>>>>>      replace sinalt=`sinal' in `i'
>>>>>>
>>>>>> evidently `sinal' is undefined so Stata sees
>>>>>>
>>>>>>      replace sinalt= in `i'
>>>>>>
>>>>>> It tries first to interpret -in- as the name of a variable or scalar,
>>>>>> fails, and aborts with error.
>>>>>>
>>>>>> Perhaps when you coded
>>>>>>
>>>>>>  if cod[j]==1 {
>>>>>>
>>>>>> you meant
>>>>>>
>>>>>>  if cod[`j']==1 {
>>>>>>
>>>>>> On Sat, Sep 24, 2011 at 3:28 PM, pedromfn <nakashimapedro@gmail.com> wrote:
>>>>>>
>>>>>>> My database looks like:
>>>>>>>
>>>>>>> obs cod pr qt sinalt
>>>>>>> 1 1 1.4 100 .
>>>>>>> 2 2 1.5 100 .
>>>>>>> 3 1 1.5 95 .
>>>>>>> 4 1 1.4 100 .
>>>>>>> 5 3 1.5 100 .
>>>>>>>
>>>>>>> and I want to replace observations of sinalt in which cod==3, according to
>>>>>>> the following rule:
>>>>>>> 1) Go across observations looking for observations in which cod==3.
>>>>>>> 2) In the above example, the first such observation is observation 5, in which
>>>>>>> pr[5]==1.5 and qt[5]==100. Once that observation is found, go backwards
>>>>>>> through observations looking for the first observation j in which
>>>>>>> pr[j]==pr[5] & qt[j]==qt[5]. In the example, j=2.
>>>>>>> 3) Replace sinalt[5]=`sinal', where the macro sinal is defined as:
>>>>>>>     if cod[j]==1, store in the local sinal the value 1
>>>>>>>     if cod[j]==2, store in the local sinal the value -1
>>>>>>> 4) Once the last replace is done, look for the next observation in which cod==3
>>>>>>> and do the same thing.
>>>>>>>
>>>>>>> I wrote the following do-file, but it didn't work:
>>>>>>>
>>>>>>> forvalues i=1/`=_N' {
>>>>>>>        if cod[`i']==3{
>>>>>>>                local j=`i'-1
>>>>>>>                if pr[`j']==pr[`i'] & qt[`j']==qt[`i'] {
>>>>>>>                        if cod[j]==1 {
>>>>>>>                                local sinal 1
>>>>>>>                        }
>>>>>>>                        else if cod[`j']==2 {
>>>>>>>                                local sinal -1
>>>>>>>                        }
>>>>>>>                        else {
>>>>>>>                                local sinal
>>>>>>>                        }
>>>>>>>                }
>>>>>>>                else {
>>>>>>>                        while pr[`j']!=pr[`i'] | qt[`j']!=qt[`i'] {
>>>>>>>                                local --j
>>>>>>>                        }
>>>>>>>                }
>>>>>>>        replace sinalt=`sinal' in `i'
>>>>>>>        }
>>>>>>> }
>>>>>>>
>>>>>>> ERROR:
>>>>>>> in not found
>>>>>>> r(111);
>>>>>>
>>>>>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
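
The backward search described in the quoted message of 8 Nov 2011 might also be written without an explicit loop over observations. The following is a minimal sketch, not something posted in the thread. It assumes that ordem records the original observation order, it uses a few rows taken from the sample data above, and the names prev269 and sinal3 are introduced here for illustration only.

```stata
clear
input v269 v270 v271 ordem sinal
1 1985  20  743  .
1 1985  40  944  .
1 1985  40  945  .
1 1988 100  954  .
2 1985  40  966  1
1 1986  40  971  .
1 1986  40  992  .
2 1985  20 1001  1
0 1985  20 1019  .
2 1985  20 1026 -1
2 1988  10 1079  0
end

* Within each (v270, v271) pair, in original order, record v269 only
* where it is 0 or 1 ...
bysort v270 v271 (ordem): gen byte prev269 = v269 if inlist(v269, 0, 1)

* ... and let each v269==2 row inherit the value from the row before it.
* -replace- works through observations in order, so the value cascades
* across consecutive v269==2 rows of the same pair.
by v270 v271: replace prev269 = prev269[_n-1] if v269 == 2

* 1 -> +1, 0 -> -1, no earlier match -> 0 (the fallback used in the loop)
gen byte sinal3 = cond(prev269 == 1, 1, cond(prev269 == 0, -1, 0)) if v269 == 2

sort ordem
assert sinal3 == sinal
```

Because the work is done by one sort and two passes over the data, this would be expected to scale better than a loop nested inside forvalues, though it has not been timed on the 920,000-line file mentioned in the thread.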
