
Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.



RE: st: Looping within a subset under a certain condition


From   "Gerard Solbrig" <gsolbrig@mail.uni-mannheim.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Looping within a subset under a certain condition
Date   Sun, 30 Sep 2012 20:28:40 +0200

I'm sorry, but I've been trying for hours now: Stata returns "invalid
syntax r(198);" every time I try to run this code:

sort cusip6 rep date
gen obs = _n
gen rep_ins = 0
egen firm_numid = group(cusip6)
summarize firm_numid, meanonly
forvalues x = 1/`r(max)' {
	su obs if firm_numid == `x' & rep == 0, meanonly
	local z1 = r(min)
	local z2 = r(max)
	su obs if firm_numid == `x' & rep == 1, meanonly
	local o1 = r(min)
	local o2 = r(max)
	forvalues i = `z1'/`z2' {
		local isin = 1
		forvalues o = `o1'/`o2' {
			if inrange(trandate[`i'], wind_start[`o'], wind_end[`o']) {
			local isin = 0
			}
		if `isin' == 1 replace rep_ins = 1 in `i'
		}
	}
}

Despite countless tries and modifications, I cannot find the mistake in the
syntax. I simply don't know what is supposed to be wrong here.
I know this code should be working the way I need it...

Many thanks in advance.
Gerard
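
One possible cause, offered as a guess rather than a tested diagnosis: if some firm has no rep == 0 or no rep == 1 observations, -summarize- leaves r(min) and r(max) missing, so the inner -forvalues- is handed the range ./. , which Stata may reject as invalid syntax. Note also that in the code above -isin- starts at 1 and is set to 0 when the date falls inside a window, and that the -replace- sits inside the innermost loop. The sketch below runs on made-up data (four observations, dates as plain integers) and uses hypothetical locals nz and no to skip such firms; it flags a rep == 0 case when its date falls in at least one window.

```stata
* made-up data: firm1 has both event types, firm2 has only rep == 1
clear
input str5 firm_id rep trandate wind_start wind_end
"firm1" 0 100   .   .
"firm1" 0 250   .   .
"firm1" 1 300 120 510
"firm2" 1  50   0 260
end

sort firm_id rep trandate
gen long obs = _n
gen byte rep_ins = 0 if rep == 0
egen firm_numid = group(firm_id)

summarize firm_numid, meanonly
forvalues x = 1/`r(max)' {
    * observation range of the rep == 0 cases for this firm
    summarize obs if firm_numid == `x' & rep == 0, meanonly
    local nz = r(N)
    local z1 = r(min)
    local z2 = r(max)

    * observation range of the rep == 1 cases for this firm
    summarize obs if firm_numid == `x' & rep == 1, meanonly
    local no = r(N)
    local o1 = r(min)
    local o2 = r(max)

    * skip firms lacking either event type: their ranges would be ./.
    if `nz' == 0 | `no' == 0 continue

    forvalues i = `z1'/`z2' {
        local isin = 0
        forvalues o = `o1'/`o2' {
            if inrange(trandate[`i'], wind_start[`o'], wind_end[`o']) local isin = 1
        }
        * decide only after all windows of the firm have been checked
        if `isin' replace rep_ins = 1 in `i'
    }
}
list
```

Whether missing ranges are what trips up the real data can be checked by tabulating rep within each firm before running the loop.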
 

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: Sonntag, 30. September 2012 15:36
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Looping within a subset under a certain condition

Also, you can jump out of the loops if you like with -continue- statements.

Nick
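
A minimal sketch of what that could look like, on made-up data (one date in observation 1, three windows in observations 2 to 4; the local checked is only there to show how many windows were examined). Note that -continue, break- leaves only the innermost enclosing loop.

```stata
* made-up data: obs 1 holds the date to test, obs 2-4 hold the windows
clear
input trandate wind_start wind_end
250   .   .
  .   0  50
  . 200 300
  . 400 500
end

local isin = 0
local checked = 0
forvalues o = 2/4 {
    local ++checked
    if inrange(trandate[1], wind_start[`o'], wind_end[`o']) {
        local isin = 1
        * a containing window was found: stop scanning the remaining windows
        continue, break
    }
}
display "isin = `isin' after checking `checked' window(s)"
```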

On Sun, Sep 30, 2012 at 2:31 PM, Nick Cox <njcoxstata@gmail.com> wrote:
> The code is testing whether every case of 0 is within all the windows 
> defined by cases of 1, which I thought was what you wanted.
>
> That is not what you want, it seems.
>
> If you are happy that a case of 0 is within at least one of the 
> windows defined by cases of 1, then the code is different.
>
> sort firm rep trandate
>
> gen long obsno = _n
>
> * assume not in a window; will change our mind if we find an exception
> gen in_a_window = 0
>
> * numeric ids 1 2 3 ... are just a convenience for looping
> egen firm_numid = group(firm_id)
> su firm_numid, meanonly
>
> * loop over firms
> forval f = 1/`r(max)' {
>
>         * within each firm, which cases have rep == 0
>         su obsno if firm_numid == `f' & rep == 0, meanonly
>         local z1 = r(min)
>         local z2 = r(max)
>
>         * ditto, rep == 1
>         su obsno if firm_numid == `f' & rep == 1, meanonly
>         local o1 = r(min)
>         local o2 = r(max)
>
>         * look at each case of rep == 0
>         forval i = `z1'/`z2' {
>                 local isin = 0
>
>                 * we use -trandate[`i']- and compare it with the
>                 * windows for each case of rep == 1
>                 forval o = `o1'/`o2' {
>                         if inrange(trandate[`i'], wind_start[`o'], wind_end[`o']) {
>                                 local isin = 1
>                         }
>                 }
>
>                 if `isin' replace in_a_window = 1 in `i'
>         }
> }
>
> If you then want to check that _all_ cases of rep==0 for each firm_id 
> are within a window
>
> egen all_in_window = min(in_a_window / (rep == 0)) , by(firm_id)
>
> Nick
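
A small illustration of the division in the -egen- line above, on made-up values: for rep == 1 rows the divisor (rep == 0) is 0, so the expression is missing and min() ignores those rows; for rep == 0 rows the divisor is 1 and in_a_window passes through unchanged.

```stata
* made-up data: firm1 has one rep == 0 case outside every window, firm2 has none
clear
input str5 firm_id rep in_a_window
"firm1" 0 1
"firm1" 0 0
"firm1" 1 0
"firm2" 0 1
"firm2" 1 0
end

* rows with rep == 1 divide by 0, become missing, and drop out of the minimum
egen all_in_window = min(in_a_window / (rep == 0)), by(firm_id)
list, sepby(firm_id)
```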
>
> On Sun, Sep 30, 2012 at 2:05 PM, Gerard Solbrig 
> <gsolbrig@mail.uni-mannheim.de> wrote:
>> (in reference to my mails before, concerning your and my code)
>>
>> I have given this some thought, why -rep_ins- is set to 0 for all 
>> observations, using your code.
>>
>> The loop runs over all rep = 1 cases and checks whether the 
>> -trandate- lies within the range of each rep = 1 case.
>> With multiple rep = 1 cases at very different dates, it might 
>> find one rep = 1 case in whose range the current rep = 0 
>> observation's -trandate- lies. But the loop does not stop there
>> if it finds one. It keeps going and, because the dates are sorted,
>> it inevitably finds a later rep = 1 case for which the -trandate-
>> lies outside the range, and changes -rep_ins- to 0.
>>
>> Is there a way to tell the loop: stop as soon as you find that your
>> -trandate- lies in the range of a (or any) rep = 1 case and jump on 
>> to the next rep = 0 case? If not, a loop might not even be the 
>> approach to this problem...
>>
>> Gerard
>>
>>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu
>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
>> Sent: Sonntag, 30. September 2012 12:48
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: Looping within a subset under a certain condition
>>
>> Should be
>>
>> sort firm rep trandate
>>
>> Sorry!
>>
>> On Sun, Sep 30, 2012 at 11:27 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>>> You are not showing me the complete line you typed, so I can't tell 
>>> you what was wrong exactly.
>>>
>>> More positively, here is a stab at your problem, but I haven't 
>>> tested the code.
>>>
>>> sort firm trandate rep
>>>
>>> gen long obsno = _n
>>>
>>> * assume all are in some window; will change our mind if we find an
>>> * exception
>>> gen all_in_a_window = 1
>>>
>>> * numeric ids 1 2 3 ... are just a convenience for looping
>>> egen firm_numid = group(firm_id)
>>> su firm_numid, meanonly
>>>
>>> * loop over firms
>>> forval f = 1/`r(max)' {
>>>
>>>         * within each firm, which cases have rep == 0
>>>         su obsno if firm_numid == `f' & rep == 0, meanonly
>>>         local z1 = r(min)
>>>         local z2 = r(max)
>>>
>>>         * ditto, rep == 1
>>>         su obsno if firm_numid == `f' & rep == 1, meanonly
>>>         local o1 = r(min)
>>>         local o2 = r(max)
>>>
>>>         * look at each case of rep == 0
>>>         forval i = `z1'/`z2' {
>>>                 local allin = 1
>>>
>>>                 * we use -trandate[`i']- and compare it with the
>>>                 * windows for each case of rep == 1
>>>                 * note the crucial !    [!!!]
>>>                 forval o = `o1'/`o2' {
>>>                         if !inrange(trandate[`i'], wind_start[`o'], wind_end[`o']) {
>>>                                 local allin = 0
>>>                         }
>>>                 }
>>>
>>>                 if `allin' == 0 replace all_in_a_window = 0 in `i'
>>>         }
>>>
>>> }
>>>
>>> Nick
>>>
>>> On Sun, Sep 30, 2012 at 11:17 AM, Gerard Solbrig 
>>> <gsolbrig@mail.uni-mannheim.de> wrote:
>>>> I understand. That's what I did in an earlier version of the loop, 
>>>> where I subscripted both -rep- and -trandate- in my loop, but then 
>>>> Stata returned:
>>>>
>>>> '[' invalid obs no
>>>> r(198);
>>>>
>>>> Why is that? That's why I got rid of it in the first place. But 
>>>> without the subscript, the loop does not seem to finish running.
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: owner-statalist@hsphsun2.harvard.edu
>>>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
>>>> Sent: Sonntag, 30. September 2012 11:59
>>>> To: statalist@hsphsun2.harvard.edu
>>>> Subject: Re: st: Looping within a subset under a certain condition
>>>>
>>>> This can't be right, if only because you are misunderstanding what 
>>>> the
>>>> -if- command does. Stata treats
>>>>
>>>> if rep == 1
>>>>
>>>> as if it were
>>>>
>>>> if rep[1] == 1
>>>>
>>>> See
>>>>
>>>> FAQ     . . . . . . . . . . . . . . .  if command vs. if qualifier
>>>>         . . . . . . . . . . . . . . . . . . . . . . . . .  J. Wernow
>>>>         6/00    I have an if command in my program that only seems
>>>>                 to evaluate the first observation, what's going on?
>>>>
>>>> http://www.stata.com/support/faqs/lang/ifqualifier.html
>>>>
>>>> The context of looping over observations makes no difference here.
>>>> You probably intend
>>>>
>>>> if rep[`i'] == 1
>>>>
>>>> Similar comment w.r.t.
>>>>
>>>> if trandate ...
>>>>
>>>> where -trandate- _must_ be subscripted.
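
A minimal sketch of the difference, on three made-up observations (the flag_* variable names are invented for the illustration): the if command is evaluated once using the first observation, the if qualifier is evaluated for every observation, and an explicit subscript inside a loop reproduces the qualifier's result.

```stata
* made-up data: the first observation has rep == 0
clear
input rep
0
1
1
end

gen byte flag_cmd  = 0
gen byte flag_qual = 0
gen byte flag_loop = 0

* if command: Stata reads this as rep[1] == 1, which is false here,
* so the block is skipped for the whole dataset
if rep == 1 {
    replace flag_cmd = 1
}

* if qualifier: the condition is checked observation by observation
replace flag_qual = 1 if rep == 1

* if command with an explicit subscript inside a loop over observations
forvalues i = 1/`=_N' {
    if rep[`i'] == 1 replace flag_loop = 1 in `i'
}

list
```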
>>>>
>>>>
>>>> On Sun, Sep 30, 2012 at 10:18 AM, Gerard Solbrig 
>>>> <gsolbrig@mail.uni-mannheim.de> wrote:
>>>>> That sure is correct. Please see my reply to Pengpeng on that matter.
>>>>> So far, I've only focused on getting the rep_ins indicator to work 
>>>>> at all, but multiple windows for one firm is an additional concern.
>>>>> Ideally, a code would indicate for each rep = 0 case within which 
>>>>> of these windows the observation's 'trandate' lies...
>>>>>
>>>>> Here's the last version of my code (without inclusion of your 
>>>>> earlier suggestion and the multiple window problem):
>>>>>
>>>>> forvalues x = 1/`max' {
>>>>>         summarize obs, meanonly
>>>>>         local N = r(N)
>>>>>         forvalues i = 1/`N' {
>>>>>                 if rep == 1 {
>>>>>                 local r = `i'
>>>>>                 local s = `i'+1
>>>>>                 forvalues z = `s'/`N' {
>>>>>                         if trandate >= wind_start[`r'] & trandate <= wind_end[`r'] {
>>>>>                         replace rep_ins = 1 in [`z']
>>>>>                         }
>>>>>                         else {
>>>>>                         replace rep_ins = 0 in [`z']
>>>>>                         }
>>>>>                 }
>>>>>         }
>>>>> }
>>>>> }
>>>>> replace rep_ins = . if rep == 1
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: owner-statalist@hsphsun2.harvard.edu
>>>>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick 
>>>>> Cox
>>>>> Sent: Sonntag, 30. September 2012 11:10
>>>>> To: statalist@hsphsun2.harvard.edu
>>>>> Subject: Re: st: Looping within a subset under a certain condition
>>>>>
>>>>> The other thing I wasn't clear on was your rules for combining two
>>>>> or more windows for the same firm. The code example I gave just uses 
>>>>> the overall range of the windows, but that would include any gaps 
>>>>> between windows. Thus if a < b < c < d and there are windows [a,b] 
>>>>> and [c,d] then the combined window [a, d] includes a gap [b, c].
>>>>>
>>>>> On Sun, Sep 30, 2012 at 9:56 AM, Gerard Solbrig 
>>>>> <gsolbrig@mail.uni-mannheim.de> wrote:
>>>>>> My bad, sorry! Of course, the observation 5apr2004 should not be 
>>>>>> considered in the window, as it lies outside of the range between 
>>>>>> 'wind_start' and 'wind_end'. Despite that, it seems you've
>>>>>> understood my problem correctly.
>>>>>>
>>>>>> I'll try to incorporate your suggestion and see whether it helps
>>>>>> in finding a solution. I will post an update on the matter later.
>>>>>>
>>>>>> Thanks so far!
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: owner-statalist@hsphsun2.harvard.edu
>>>>>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick 
>>>>>> Cox
>>>>>> Sent: Sonntag, 30. September 2012 01:13
>>>>>> To: statalist@hsphsun2.harvard.edu
>>>>>> Subject: Re: st: Looping within a subset under a certain 
>>>>>> condition
>>>>>>
>>>>>> I had another look at this. I still don't understand your problem 
>>>>>> exactly (e.g. why is the second obs at 5apr2004 considered in 
>>>>>> window), but the technique here may help.
>>>>>>
>>>>>> egen first_start = min(wind_start), by(firm_id)
>>>>>> egen last_end = max(wind_end), by(firm_id)
>>>>>>
>>>>>> gen in_window = inrange(date, first_start, last_end)
>>>>>>
>>>>>> egen all_0_in_window = min(in_window) if rep == 0, by(firm_id)
>>>>>>
>>>>>> On the last line: on all <=> min, any <=> max, see
>>>>>>
>>>>>> FAQ     . . Creating variables recording whether any or all possess some char.
>>>>>>         . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>>>>>>         2/03    How do I create a variable recording whether any
>>>>>>                 members of a group (or all members of a group)
>>>>>>                 possess some characteristic?
>>>>>>
>>>>>> http://www.stata.com/support/faqs/data/anyall.html
>>>>>>
>>>>>> Nick
>>>>>>
>>>>>> On Fri, Sep 28, 2012 at 9:45 PM, Gerard Solbrig 
>>>>>> <gsolbrig@mail.uni-mannheim.de> wrote:
>>>>>>>
>>>>>>> I'm encountering a problem for which I seek your help.
>>>>>>>
>>>>>>> Let me start off with an example from my data (what I want it to 
>>>>>>> look like in the end), before I explain my particular problem.
>>>>>>>
>>>>>>> firm_id   date        rep   wind_start   wind_end    rep_ins
>>>>>>>
>>>>>>> firm1     01jan2000   0     .            .           0
>>>>>>> firm1     05apr2004   0     .            .           1
>>>>>>> firm1     01nov2004   1     05may2004    30may2005   .
>>>>>>> firm1     10dec2004   0     .            .           1
>>>>>>> firm1     01jan2006   0     .            .           0
>>>>>>> firm2     30dec1999   1     03jul1999    27jul2000   .
>>>>>>> firm2     05jan2000   1     09jul1999    02aug2000   .
>>>>>>> firm2     06jun2000   0     .            .           1
>>>>>>>
>>>>>>> Each firm in my data has a 'firm_id'. Variable 'date' refers to 
>>>>>>> an event date. The 'rep' dummy indicates the type of event.
>>>>>>> I set 'wind_start' and 'wind_end' as period around the event 
>>>>>>> (-180days,+210days), in case it's a rep = 1 type event.
>>>>>>>
>>>>>>> Now, I would like the 'rep_ins' dummy to indicate (i.e., rep_ins 
>>>>>>> = 1) whether the date of all other observations of this firm 
>>>>>>> (where rep = 0) lies within the range determined by 'wind_start'
>>>>>>> and 'wind_end' (which is conditional upon the 'rep' dummy).
>>>>>>>
>>>>>>> I've come across looping over observations and tried to design a 
>>>>>>> solution for this problem based on that, but failed to do so. I 
>>>>>>> assume the solution also depends on sorting the data in a special
>>>>>>> way.
>>>>>>>
>>>>>>> Here's the first part of my .do-file:
>>>>>>>
>>>>>>> gen wind_start = date-180 if rep == 1
>>>>>>> gen wind_end = date+210 if rep == 1
>>>>>>> format wind_start %d
>>>>>>> format wind_end %d
>>>>>>> gsort +cusip6 +date +trandate
>>>>>>> gen rep_ins = 0 if rep != 1
>>>>>>>
>>>>>>> I tried to come up with a solution by adding variables 'per_start'
>>>>>>> and 'per_end' for all rep = 0:
>>>>>>>
>>>>>>> gen per_start = date-180 if rep == 0
>>>>>>> gen per_end = date+180 if rep == 0
>>>>>>> format per_start %d
>>>>>>> format per_end %d
>>>>>>>
>>>>>>> To mark the period within which the rep = 1 event can lie. Maybe 
>>>>>>> this could contribute to finding a solution as well.
>>>>>> *
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/



