Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: RE: st: Looping within a subset under a certain condition


From   "smztsmzt" <[email protected]>
To   "statalist"<[email protected]>
Subject   Re: RE: st: Looping within a subset under a certain condition
Date   Sun, 30 Sep 2012 21:28:47 +0800

Hi Gerard,

I suggest setting a flag variable (Boolean, 0/1) that flips from 0 to 1 (or from 1 to 0) as soon as the loop finds a window containing your -trandate-, and using it to stop the loop.

I hope this is useful.

Best wishes,

Pengpeng
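
The flag idea above could look roughly like the following. This is a minimal, untested sketch, assuming the variables firm_id, trandate, rep, wind_start and wind_end are as described elsewhere in the thread. The local macro `found' is a hypothetical name for the flag, and -continue, break- is what leaves the inner loop once a matching window turns up. It is meant to be run from a do-file, since it uses /// line continuation.

```stata
* Sketch only, not tested on the poster's data.
* Assumes firm_id, trandate, rep, wind_start and wind_end as described in the
* thread; the local macro found is a hypothetical name for the flag.
capture drop rep_ins
gen byte rep_ins = 0 if rep == 0

forvalues i = 1/`=_N' {
    if rep[`i'] != 0 continue               // flag only the rep == 0 cases
    local found = 0                          // flag: no matching window yet
    forvalues j = 1/`=_N' {
        if rep[`j'] == 1 & firm_id[`j'] == firm_id[`i'] ///
            & inrange(trandate[`i'], wind_start[`j'], wind_end[`j']) {
            local found = 1
            continue, break                  // leave the inner loop at the first hit
        }
    }
    if `found' quietly replace rep_ins = 1 in `i'
}
```

Looping over every pair of observations is slow on large datasets, so this is only practical when the data are small.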



From: Gerard Solbrig
Sent: 2012-09-30 21:07
Subject: RE: st: Looping within a subset under a certain condition
To: "statalist"<[email protected]>
Cc:

(in reference to my earlier mails concerning your code and mine) 

I have given some thought to why -rep_ins- is set to 0 for all 
observations when I use your code. 

The loop runs over all rep = 1 cases and checks whether the -trandate- 
lies within the range of each rep = 1 case. 
With multiple rep = 1 cases that have very different dates, it might find 
one rep = 1 case in whose range the current rep = 0 observation's 
-trandate- lies. But the loop does not stop there when it finds one. 
It keeps going and, because of the sorting of dates, it inevitably reaches a 
later rep = 1 case whose range does not contain the -trandate-, and 
changes -rep_ins- to 0. 

Is there a way to tell the loop: stop as soon as you find that your 
-trandate- lies in the range of a (or any) rep = 1 case and jump on to the 
next rep = 0 case? If not, a loop might not even be the approach to this 
problem... 

Gerard 
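
One way to sidestep the early-exit question is to loop over the rep = 1 windows instead of the rep = 0 cases, and only ever switch -rep_ins- on, never back off. The following is a minimal, untested sketch under the assumption that firm_id, trandate, rep, wind_start and wind_end are as described in the thread. It is not the code any poster actually ran.

```stata
* Sketch only, not tested on the poster's data.
* Assumes firm_id, trandate, rep, wind_start and wind_end as described in the thread.
capture drop rep_ins
gen byte rep_ins = 0 if rep == 0

* visit each rep == 1 observation once and mark every rep == 0 case
* of the same firm whose trandate falls inside that window
forvalues j = 1/`=_N' {
    if rep[`j'] == 1 {
        quietly replace rep_ins = 1 ///
            if rep == 0 & firm_id == firm_id[`j'] ///
            & inrange(trandate, wind_start[`j'], wind_end[`j'])
    }
}
```

Because the -replace- only ever sets 1, a later window that does not contain the date cannot undo an earlier match, and gaps between windows of the same firm are respected.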


-----Original Message----- 
From: [email protected] 
[mailto:[email protected]] On Behalf Of Nick Cox 
Sent: Sunday, 30 September 2012 12:48 
To: [email protected] 
Subject: Re: st: Looping within a subset under a certain condition 

Should be 

sort firm rep trandate 

Sorry! 

On Sun, Sep 30, 2012 at 11:27 AM, Nick Cox <[email protected]> wrote: 
> You are not showing me the complete line you typed, so I can't tell  
> you what was wrong exactly. 
> 
> More positively, here is a stab at your problem, but I haven't tested the
> code.
>
> sort firm trandate rep
>
> gen long obsno = _n
>
> * assume all are in some window; will change our mind if we find an
> * exception
> gen all_in_a_window = 1
>
> * numeric ids 1 2 3 ... are just a convenience for looping
> egen firm_numid = group(firm_id)
> su firm_numid, meanonly
>
> * loop over firms
> forval f = 1/`r(max)' {
>
>         * within each firm, which cases have rep == 0
>         su obsno if firm_numid == `f' & rep == 0, meanonly
>         local z1 = r(min)
>         local z2 = r(max)
>
>         * ditto, rep == 1
>         su obsno if firm_numid == `f' & rep == 1, meanonly
>         local o1 = r(min)
>         local o2 = r(max)
>
>         * look at each case of rep == 0
>         forval i = `z1'/`z2' {
>                 local allin = 1
>
>                 * we use -trandate[`i']- and compare it with the
>                 * windows for each case of rep == 1
>                 * note the crucial !    [!!!]
>                 forval o = `o1'/`o2' {
>                         if !inrange(trandate[`i'], win_start[`o'], win_end[`o']) {
>                                 local allin = 0
>                         }
>                 }
>
>                 if `allin' == 0 replace all_in_a_window = 0 in `i'
>         }
>
> }
> 
> Nick 
> 
> On Sun, Sep 30, 2012 at 11:17 AM, Gerard Solbrig  
> <[email protected]> wrote: 
>> I understand. That's what I did in an earlier version of the loop,
>> where I subscripted both -rep- and -trandate- in my loop, but then Stata
>> returned:
>> 
>> '[' invalid obs no 
>> r(198); 
>> 
>> Why is that? That's why I got rid of it in the first place. But  
>> without the subscript, the loop does not seem to finish running. 
>> 
>> 
>> -----Original Message----- 
>> From: [email protected] 
>> [mailto:[email protected]] On Behalf Of Nick Cox 
>> Sent: Sunday, 30 September 2012 11:59 
>> To: [email protected] 
>> Subject: Re: st: Looping within a subset under a certain condition 
>> 
>> This can't be right, if only because you are misunderstanding what
>> the -if- command does. Stata treats
>> 
>> if rep == 1 
>> 
>> as if it were 
>> 
>> if rep[1] == 1 
>> 
>> See 
>> 
>> FAQ     . . . . . . . . . . . . . . . . . . . . .  if command vs. if qualifier
>>         . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  J. Wernow
>>         6/00    I have an if command in my program that only seems
>>                 to evaluate the first observation, what's going on?
>>                 http://www.stata.com/support/faqs/lang/ifqualifier.html
>> 
>> The context of looping over observations makes no difference here.  
>> You probably intend 
>> 
>> if rep[`i'] == 1 
>> 
>> Similar comment w.r.t. 
>> 
>> if trandate ... 
>> 
>> where -trandate- _must_ be subscripted. 
>> 
>> 
>> On Sun, Sep 30, 2012 at 10:18 AM, Gerard Solbrig  
>> <[email protected]> wrote: 
>>> That sure is correct. Please see my reply to Pengpeng on that matter. 
>>> So far, I've only focused on getting the rep_ins indicator to work  
>>> at all, but multiple windows for one firm is an additional concern. 
>>> Ideally, a code would indicate for each rep = 0 case within which of  
>>> these windows the observation's 'trandate' lies... 
>>> 
>>> Here's the last version of my code (without inclusion of your  
>>> earlier suggestion and the multiple window problem): 
>>> 
>>> forvalues x = 1/`max' { 
>>>         summarize obs, meanonly 
>>>         local N = r(N) 
>>>         forvalues i = 1/`N' { 
>>>                 if rep == 1 { 
>>>                 local r = `i' 
>>>                 local s = `i'+1 
>>>                 forvalues z = `s'/`N' { 
>>>                         if trandate >= wind_start[`r'] & trandate <= wind_end[`r'] {
>>>                         replace rep_ins = 1 in [`z'] 
>>>                         } 
>>>                         else { 
>>>                         replace rep_ins = 0 in [`z'] 
>>>                         } 
>>>                 } 
>>>         } 
>>> } 
>>> } 
>>> replace rep_ins = . if rep == 1 
>>> 
>>> 
>>> 
>>> -----Original Message----- 
>>> From: [email protected] 
>>> [mailto:[email protected]] On Behalf Of Nick Cox 
>>> Sent: Sunday, 30 September 2012 11:10 
>>> To: [email protected] 
>>> Subject: Re: st: Looping within a subset under a certain condition 
>>> 
>>> The other thing I wasn't clear on is your rules for combining two or
>>> more windows for the same firm. The code example I gave just uses
>>> the overall range of the windows, but that would include any gaps
>>> between windows. Thus if a < b < c < d and there are windows [a,b]
>>> and [c,d] then the combined window [a, d] includes a gap (b, c).
>>> 
>>> On Sun, Sep 30, 2012 at 9:56 AM, Gerard Solbrig  
>>> <[email protected]> wrote: 
>>>> My bad, sorry! Of course, the observation 5apr2004 should not be
>>>> considered in the window, as it lies outside of the range between
>>>> 'wind_start' and 'wind_end'. Despite that, it seems you've understood
>>>> my problem correctly.
>>>>
>>>> I'll try to incorporate your suggestion and see whether it helps in
>>>> finding a solution. I will post an update on the matter later.
>>>> 
>>>> Thanks so far! 
>>>> 
>>>> 
>>>> -----Original Message----- 
>>>> From: [email protected] 
>>>> [mailto:[email protected]] On Behalf Of Nick Cox 
>>>> Sent: Sunday, 30 September 2012 01:13 
>>>> To: [email protected] 
>>>> Subject: Re: st: Looping within a subset under a certain condition 
>>>> 
>>>> I had another look at this. I still don't understand your problem  
>>>> exactly (e.g. why is the second obs at 5apr2004 considered in  
>>>> window), but the technique here may help. 
>>>> 
>>>> egen first_start = min(wind_start), by(firm_id)
>>>> egen last_end = max(wind_end), by(firm_id)
>>>> 
>>>> gen in_window = inrange(date, first_start, last_end) 
>>>> 
>>>> egen all_0_in_window = min(in_window) if rep == 0, by(firm_id) 
>>>> 
>>>> On the last line: on all <=> min, any <=> max, see 
>>>> 
>>>> FAQ     . . Creating variables recording whether any or all possess some char.
>>>>         . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>>>>         2/03    How do I create a variable recording whether any
>>>>                 members of a group (or all members of a group)
>>>>                 possess some characteristic?
>>>>                 http://www.stata.com/support/faqs/data/anyall.html
>>>> 
>>>> Nick 
>>>> 
>>>> On Fri, Sep 28, 2012 at 9:45 PM, Gerard Solbrig  
>>>> <[email protected]> wrote: 
>>>>> 
>>>>> I'm encountering a problem for which I seek your help. 
>>>>> 
>>>>> Let me start off with an example from my data (what I want it to  
>>>>> look like in the end), before I explain my particular problem. 
>>>>> 
>>>>> firm_id  date       rep  wind_start  wind_end   rep_ins
>>>>>
>>>>> firm1    01jan2000  0    .           .          0
>>>>> firm1    05apr2004  0    .           .          1
>>>>> firm1    01nov2004  1    05may2004   30may2005  .
>>>>> firm1    10dec2004  0    .           .          1
>>>>> firm1    01jan2006  0    .           .          0
>>>>> firm2    30dec1999  1    03jul1999   27jul2000  .
>>>>> firm2    05jan2000  1    09jul1999   02aug2000  .
>>>>> firm2    06jun2000  0    .           .          1
>>>>> 
>>>>> Each firm in my data has a 'firm_id'. Variable 'date' refers to an  
>>>>> event date. The 'rep' dummy indicates the type of event. 
>>>>> I set 'wind_start' and 'wind_end' as period around the event  
>>>>> (-180days,+210days), in case it's a rep = 1 type event. 
>>>>> 
>>>>> Now, I would like the 'rep_ins' dummy to indicate (i.e., rep_ins = 1),
>>>>> whether the date of all other observations of this firm (where
>>>>> rep = 0) lies within the range determined by 'wind_start' and
>>>>> 'wind_end' (which is conditional upon the 'rep' dummy).
>>>>> 
>>>>> I've come across looping over observations and tried to design a  
>>>>> solution for this problem based on that, but failed to do so. I  
>>>>> assume the solution also depends on sorting the data in a special way. 
>>>>> 
>>>>> Here's the first part of my .do-file: 
>>>>> 
>>>>> gen wind_start = date-180 if rep == 1
>>>>> gen wind_end = date+210 if rep == 1
>>>>> format wind_start %d
>>>>> format wind_end %d
>>>>> gsort +cusip6 +date +trandate
>>>>> gen rep_ins = 0 if rep != 1
>>>>> 
>>>>> I tried to come up with a solution by adding variables 'per_start' 
>>>>> and 'per_end' for all rep = 0: 
>>>>> 
>>>>> gen per_start = date-180 if rep == 0
>>>>> gen per_end = date+180 if rep == 0
>>>>> format per_start %d
>>>>> format per_end %d
>>>>> 
>>>>> To mark the period within which the rep = 1 event can lie. Maybe  
>>>>> this could contribute to finding a solution as well. 
>>>> * 
* 
*   For searches and help try: 
*   http://www.stata.com/help.cgi?search 
*   http://www.stata.com/support/faqs/resources/statalist-faq/ 
*   http://www.ats.ucla.edu/stat/stata/ 



