
# Re: st: Looping within a subset under a certain condition

From: Nick Cox
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Looping within a subset under a certain condition
Date: Sun, 30 Sep 2012 11:27:59 +0100

```
You are not showing me the complete line you typed, so I can't tell
you what was wrong exactly.

More positively, here is a stab at your problem, but I haven't tested the code.

sort firm_id trandate rep

gen long obsno = _n

* assume all are in some window; will change our mind if we find an exception
gen all_in_a_window = 1

* numeric ids 1 2 3 ... are just a convenience for looping
egen firm_numid = group(firm_id)
su firm_numid, meanonly

* loop over firms
forval f =  1/`r(max)' {

* within each firm, which cases have rep == 0
su obsno if firm_numid == `f' & rep == 0, meanonly
local z1 = r(min)
local z2 = r(max)

* ditto, rep == 1
su obsno if firm_numid == `f' & rep == 1, meanonly
local o1 = r(min)
local o2 = r(max)

* look at each case of rep == 0
forval i = `z1'/`z2' {
local allin = 1

* we take trandate[`i'] and compare it with the
* windows for each case of rep == 1
* note the crucial !    [!!!]
forval o = `o1'/`o2' {
if !inrange(trandate[`i'], wind_start[`o'], wind_end[`o']) {
local allin = 0
}
}

if `allin' == 0 replace all_in_a_window = 0 in `i'
}

}

Nick

On Sun, Sep 30, 2012 at 11:17 AM, Gerard Solbrig
<gsolbrig@mail.uni-mannheim.de> wrote:
> I understand. That's what I did in an earlier version of the loop, where I
> subscripted both, -rep- and -trandate- in my loop, but then Stata returned:
>
> '[' invalid obs no
> r(198);
>
> Why is that? That's why I got rid of it in the first place. But without the
> subscript, the loop does not seem to finish running.
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
> Sent: Sonntag, 30. September 2012 11:59
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Looping within a subset under a certain condition
>
> This can't be right, if only because you are misunderstanding what the
> -if- command does. Stata treats
>
> if rep == 1
>
> as if it were
>
> if rep[1] == 1
>
> See
>
> FAQ     . . . . . . . . . . . . . . . . . .  if command vs. if qualifier
>         . . . . . . . . . . . . . . . . . . . . . . . . . . .  J. Wernow
>         6/00    I have an if command in my program that only seems
>                 to evaluate the first observation, what's going on?
>                 http://www.stata.com/support/faqs/lang/ifqualifier.html
>
> The context of looping over observations makes no difference here. You
> probably intend
>
> if rep[`i'] == 1
>
> Similar comment w.r.t.
>
> if trandate ...
>
> where -trandate- _must_ be subscripted.
>
>
> On Sun, Sep 30, 2012 at 10:18 AM, Gerard Solbrig
> <gsolbrig@mail.uni-mannheim.de> wrote:
>> That sure is correct. Please see my reply to Pengpeng on that matter.
>> So far, I've only focused on getting the rep_ins indicator to work at
>> all, but multiple windows for one firm is an additional concern.
>> Ideally, a code would indicate for each rep = 0 case within which of
>> these windows the observation's 'trandate' lies...
>>
>> Here's the last version of my code (without inclusion of your earlier
>> suggestion and the multiple window problem):
>>
>> forvalues x = 1/`max' {
>>         summarize obs, meanonly
>>         local N = r(N)
>>         forvalues i = 1/`N' {
>>                 if rep == 1 {
>>                 local r = `i'
>>                 local s = `i'+1
>>                 forvalues z = `s'/`N' {
>>                         if trandate >= wind_start[`r'] & trandate <= wind_end[`r'] {
>>                         replace rep_ins = 1 in [`z']
>>                         }
>>                         else {
>>                         replace rep_ins = 0 in [`z']
>>                         }
>>                 }
>>         }
>> }
>> }
>> replace rep_ins = . if rep == 1
>>
>>
>>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu
>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
>> Sent: Sonntag, 30. September 2012 11:10
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: Looping within a subset under a certain condition
>>
>> The other thing I wasn't clear on is your rules for combining two or more
>> windows for the same firm. The code example I gave just uses the
>> overall range of the windows, but that would include any gaps between
>> windows. Thus if a < b < c < d and there are windows [a,b] and [c,d]
>> then the combined window [a, d] includes a gap [b, c].
>>
>> On Sun, Sep 30, 2012 at 9:56 AM, Gerard Solbrig
>> <gsolbrig@mail.uni-mannheim.de> wrote:
>>> My bad, sorry! Of course, the observation 5apr2004 should not be
>>> considered in the window, as it lies outside of the range between
>>> 'wind_start' and 'wind_end'. Apart from that, it seems you've
>>> understood my problem correctly.
>>>
>>> I'll try to incorporate your suggestion and see whether it helps
>>> in finding a solution. I will post an update on the matter later.
>>>
>>> Thanks so far!
>>>
>>>
>>> -----Original Message-----
>>> From: owner-statalist@hsphsun2.harvard.edu
>>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
>>> Sent: Sonntag, 30. September 2012 01:13
>>> To: statalist@hsphsun2.harvard.edu
>>> Subject: Re: st: Looping within a subset under a certain condition
>>>
>>> I had another look at this. I still don't understand your problem
>>> exactly (e.g. why is the second obs at 5apr2004 considered in
>>> window), but the technique here may help.
>>>
>>> egen first_start = min(wind_start), by(firm_id)
>>> egen last_end = max(wind_end), by(firm_id)
>>>
>>> gen in_window = inrange(date, first_start, last_end)
>>>
>>> egen all_0_in_window = min(in_window) if rep == 0, by(firm_id)
>>>
>>> On the last line: on all <=> min, any <=> max, see
>>>
>>> FAQ     . . Creating variables recording whether any or all possess some char.
>>>         . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>>>         2/03    How do I create a variable recording whether any
>>>                 members of a group (or all members of a group)
>>>                 possess some characteristic?
>>>                 http://www.stata.com/support/faqs/data/anyall.html
>>>
>>> Nick
>>>
>>> On Fri, Sep 28, 2012 at 9:45 PM, Gerard Solbrig
>>> <gsolbrig@mail.uni-mannheim.de> wrote:
>>>>
>>>> I'm encountering a problem for which I seek your help.
>>>>
>>>> Let me start off with an example from my data (what I want it to
>>>> look like in the end), before I explain my particular problem.
>>>>
>>>> firm_id   date        rep   wind_start   wind_end    rep_ins
>>>>
>>>> firm1     01jan2000   0     .            .           0
>>>> firm1     05apr2004   0     .            .           1
>>>> firm1     01nov2004   1     05may2004    30may2005   .
>>>> firm1     10dec2004   0     .            .           1
>>>> firm1     01jan2006   0     .            .           0
>>>> firm2     30dec1999   1     03jul1999    27jul2000   .
>>>> firm2     05jan2000   1     09jul1999    02aug2000   .
>>>> firm2     06jun2000   0     .            .           1
>>>>
>>>> Each firm in my data has a 'firm_id'. Variable 'date' refers to an
>>>> event date. The 'rep' dummy indicates the type of event.
>>>> I set 'wind_start' and 'wind_end' as period around the event
>>>> (-180days,+210days), in case it's a rep = 1 type event.
>>>>
>>>> Now, I would like the 'rep_ins' dummy to indicate (i.e., rep_ins = 1)
>>>> whether the date of all other observations of this firm (where rep = 0)
>>>> lies within the range determined by 'wind_start' and 'wind_end'
>>>> (which is conditional upon the 'rep' dummy).
>>>>
>>>> I've come across looping over observations and tried to design a
>>>> solution for this problem based on that, but failed to do so. I
>>>> assume the solution also depends on sorting the data in a special way.
>>>>
>>>> Here's the first part of my .do-file:
>>>>
>>>> gen wind_start = date-180 if rep == 1
>>>> gen wind_end = date+210 if rep == 1
>>>> format wind_start %d
>>>> format wind_end %d
>>>> gsort +cusip6 +date +trandate
>>>> gen rep_ins = 0 if rep != 1
>>>>
>>>> I tried to come up with a solution by adding variables 'per_start'
>>>> and 'per_end' for all rep = 0:
>>>>
>>>> gen per_start = date-180 if rep == 0
>>>> gen per_end = date+180 if rep == 0
>>>> format per_start %d
>>>> format per_end %d
>>>>
>>>> To mark the period within which the rep = 1 event can lie. Maybe
>>>> this could contribute to finding a solution as well.
```
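
The two points running through this thread, the -if- command versus the -if- qualifier and the gap that arises when several windows are merged into one overall range, can be illustrated with a short do-file. This is only a sketch and was not posted by anyone in the thread. It rebuilds the example data from the original question (the string variable `sdate` is a hypothetical helper used just for input), assumes the event date is held in `date`, and tests each rep == 0 observation against every individual rep == 1 window of the same firm, so gaps between windows are not counted.

```stata
clear
input str5 firm_id str9 sdate byte rep
"firm1" "01jan2000" 0
"firm1" "05apr2004" 0
"firm1" "01nov2004" 1
"firm1" "10dec2004" 0
"firm1" "01jan2006" 0
"firm2" "30dec1999" 1
"firm2" "05jan2000" 1
"firm2" "06jun2000" 0
end

* convert the string dates to Stata daily dates
gen date = daily(sdate, "DMY")
format date %td

* windows exist only for rep == 1 events, as in the original question
gen wind_start = date - 180 if rep == 1
gen wind_end   = date + 210 if rep == 1
format wind_start wind_end %td

* rep_ins is defined only for rep == 0; it stays missing for rep == 1
gen byte rep_ins = 0 if rep == 0

* loop over observations; each rep == 1 case contributes one window
forval o = 1/`=_N' {
    * -if- command: needs an explicit subscript, else it looks at obs 1 only
    if rep[`o'] == 1 {
        * -if- qualifier: evaluated separately for every observation
        replace rep_ins = 1 if rep == 0 & firm_id == firm_id[`o'] ///
            & inrange(date, wind_start[`o'], wind_end[`o'])
    }
}

list firm_id date rep wind_start wind_end rep_ins, sepby(firm_id)
```

Because the inner -replace- relies on the -if- qualifier, no second loop over the rep == 0 observations is needed. The `///` continuation means this should be run from a do-file rather than typed interactively. With these data the 05apr2004 observation should stay at 0, since it falls before the window start of 05may2004, in line with the correction made later in the thread.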