# Re: st: Observations that keep a feature in the whole sample period

| From | To | Subject | Date |
|---|---|---|---|
| Nick Cox | "statalist@hsphsun2.harvard.edu" | Re: st: Observations that keep a feature in the whole sample period | Wed, 22 May 2013 15:20:40 +0100 |

```
Assuming variable names

agent  period  score

it seems that you want something like

bysort agent (period) : gen first3 = _n < 4

egen max_first3 = max(score / first3), by(agent)

egen min_rest = min(score / !first3), by(agent)

keep if max_first3 > 0.9 & min_rest > 0.9

For the division trick in the -egen- call see e.g.

http://www.stata.com/statalist/archive/2013-03/msg00917.html

(reference included therein).
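
* Editorial sketch, not part of the original reply: the code above run on
* the four example agents from the quoted message below, assuming five
* periods per agent and the variable names agent, period and score.
* Division by zero yields missing in Stata, and -egen-'s max() and min()
* ignore missings, so score / first3 restricts the maximum to the first
* three periods and score / !first3 restricts the minimum to the rest.

clear
input agent period score
1 1 1
1 2 1
1 3 1
1 4 1
1 5 1
2 1 0.8
2 2 1
2 3 1
2 4 1
2 5 1
3 1 0.8
3 2 0.8
3 3 0.8
3 4 1
3 5 1
4 1 0.8
4 2 0.8
4 3 0.8
4 4 0.8
4 5 1
end

bysort agent (period) : gen first3 = _n < 4
egen max_first3 = max(score / first3), by(agent)
egen min_rest = min(score / !first3), by(agent)
list agent period score max_first3 min_rest, sepby(agent)

* With the cutoff written as _n < 4, agent 3 has max_first3 = .8 and would
* be dropped along with agent 4. If agents first above the threshold in
* period 4 should also be kept, as the example suggests, one possible
* adjustment (an assumption about the intent) is _n <= 4 in the first line.
keep if max_first3 > 0.9 & min_rest > 0.9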

Nick
njcoxstata@gmail.com

On 22 May 2013 15:03, Miguel Angel Duran Munoz <maduran@uma.es> wrote:
> Nick, thanks for your help. I hope you can help me with another question.
> For a similar analysis to that of my first message, assume I want to keep
> those agents that have passed the threshold before a certain period and
> then have stayed over it for the rest of the sample period.
>
> To illustrate the idea, consider the following (data refer to consecutive
> periods and the threshold is, eg, 0.9):
>
> Agent 1: 1    1    1    1    1...
> Agent 2: 0.8  1    1    1    1...
> Agent 3: 0.8  0.8  0.8  1    1...
> Agent 4: 0.8  0.8  0.8  0.8  1...
>
> I want to keep the first three agents because they passed the
> threshold before period 4 and then stayed over it for the rest of the
> sample period, but I do not want to keep agent 4.
>
>
> Miguel.
>
>
>
>> Correct on -keep-. Sorry about that.
>>
>> The -sort- order
>>
>> bysort entity (const_a) :
>>
>> ensures that -const_a[1]- is the lowest value for each agent, not the
>> first in time. If the lowest value for each agent is above the
>> threshold, then all the observations for that agent are above it.
>> Nick
>> njcoxstata@gmail.com
>>
>>
>> On 21 May 2013 23:16, Miguel Angel Duran Munoz <maduran@uma.es> wrote:
>>> Thanks, Nick. I guess you mean -keep- instead of -drop-. Nevertheless,
>>> the command that you suggest would not guarantee that I keep the agents
>>> that have been above the threshold for the whole sample period (i.e., I
>>> would be including agents that were above the threshold in the first
>>> period and then might have been above or below it).
>>>
>>>> Sounds like
>>>>
>>>> bysort entity (const_a) : drop if const_a[1] > 0.097116
>>>>
>>>> Nick
>>>> njcoxstata@gmail.com
>>>>
>>>> On 21 May 2013 23:01, Miguel Angel Duran Munoz <maduran@uma.es> wrote:
>>>>> Hi, Statalisters. I want to focus on agents in my dataset that have a
>>>>> particular feature: for those agents, in each and every period (out
>>>>> of 64), the value of a variable (const_a) is larger than a particular
>>>>> threshold (0.097116). I have done what I show below. Nevertheless, I
>>>>> have realized that some of my agents are not in the sample from the
>>>>> first period, so what I am doing would mistakenly eliminate them. Can
>>>>> anyone help me solve this problem? Thanks in advance.
>>>>>
>>>>> bysort entity (date2): gen obs=_n
>>>>> drop if const_a<0.097116
>>>>> by entity: drop if obs[_N]<64
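
* Editorial sketch, not part of the thread: one way to handle the original
* question above when agents enter the sample late. It assumes the variable
* names entity and const_a from the quoted code and the threshold 0.097116.
* Sorting on const_a within entity puts each agent's lowest value first
* (missing values sort last), so no count of 64 periods is needed.
* An agent whose const_a is missing throughout would pass this test,
* since Stata treats missing as larger than any number.

bysort entity (const_a) : keep if const_a[1] > 0.097116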
