
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Observations that keep a feature... an additional problem


From   "Miguel Angel Duran Munoz" <[email protected]>
To   [email protected]
Subject   Re: st: Observations that keep a feature... an additional problem
Date   Wed, 22 May 2013 20:00:29 +0200 (CEST)

I use the same example as in a previous message, but I add a fifth agent
that joins in period six:


Agent 1: 1    1    1    1    1    1...
Agent 2: 0.8  1    1    1    1    1...
Agent 3: 0.8  0.8  0.8  1    1    1...
Agent 4: 0.8  0.8  0.8  0.8  1    1...
Agent 5: .    .    .    .    .    1...

I want to keep just the first three agents.
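
One possible reading of this rule, offered as a sketch under stated assumptions rather than as a definitive answer: an agent is kept only if it is observed from the first period of the sample and its last score at or below 0.9 falls before period 4. The variable names agent, period and score are the ones assumed earlier in the thread; first, lastbelow and the cutoff of 4 are illustrative choices, and the toy data simply retype the table above.

```stata
* Toy data retyped from the example above (wide layout, then reshaped to long)
clear
input agent score1 score2 score3 score4 score5 score6
1  1  1  1  1 1 1
2 .8  1  1  1 1 1
3 .8 .8 .8  1 1 1
4 .8 .8 .8 .8 1 1
5  .  .  .  . . 1
end
reshape long score, i(agent) j(period)
drop if missing(score)          // agent 5 is simply absent before period 6

* First period in which each agent is observed
egen first = min(period), by(agent)

* Last period in which each agent is at or below the threshold
* (missing if the agent is never at or below it)
egen lastbelow = max(cond(score <= 0.9, period, .)), by(agent)

* First period of the whole sample
summarize period, meanonly
local start = r(min)

* Keep agents present from the start whose last low score is before period 4
* (assumed cutoff; missing(lastbelow) covers agents that are never low)
keep if first == `start' & (missing(lastbelow) | lastbelow < 4)
```

With the toy data this leaves agents 1, 2 and 3; agent 4 is dropped because it is still at 0.8 in period 4, and agent 5 because it joins late.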


If you don't mind, Nick, I would also like to ask you the following. I
take the same example, but I focus on the last periods.

Agent 1: ...1    1    1    1    1    1
Agent 2: ...0.8  1    1    1    1    1
Agent 3: ...0.8  0.8  0.8  1    1    1
Agent 4: ...0.8  0.8  0.8  0.8  1    1
Agent 5: ... .    .    .    .    .    1
Agent 6: ...0.8  0.8  0.8  0.8  1    0.8

I would like to select those agents that cross the threshold of 0.9 in
either of the last two periods and stay above it until the end of the
sample period (i.e., agents 4 and 5).
I have tried to modify the commands that you suggested earlier, but
I have not been able to get the right selection. Would you mind helping me
with this? Thank you very much.
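
A minimal sketch of one way to express this second rule, under assumptions of my own: an agent qualifies if it is observed in the final period and its final unbroken run of scores above 0.9 begins in one of the last two periods. The names first, last, lastbelow and runstart are illustrative, the six columns of the table are treated as periods 1 to 6, and periods are assumed to be consecutive with no gaps within an agent.

```stata
* Toy data retyped from the second example (last six periods numbered 1-6)
clear
input agent score1 score2 score3 score4 score5 score6
1  1  1  1  1 1  1
2 .8  1  1  1 1  1
3 .8 .8 .8  1 1  1
4 .8 .8 .8 .8 1  1
5  .  .  .  . .  1
6 .8 .8 .8 .8 1 .8
end
reshape long score, i(agent) j(period)
drop if missing(score)          // agent 5 is absent before the final period

egen first = min(period), by(agent)
egen last  = max(period), by(agent)

* Last period at or below the threshold (missing if never at or below it)
egen lastbelow = max(cond(score <= 0.9, period, .)), by(agent)

* Start of the final run above the threshold: the period after the last low
* score, or the agent's first period if it was never low
gen runstart = cond(missing(lastbelow), first, lastbelow + 1)

* Final period of the whole sample
summarize period, meanonly
local T = r(max)

* Keep agents observed at the end whose final run starts in the last two
* periods; runstart > T means the agent ends below the threshold
keep if last == `T' & inrange(runstart, `T' - 1, `T')
```

On the toy data this keeps agents 4 and 5. Agent 6 is excluded because its last low score is in the final period, so its run would start after the sample ends.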

> I can't follow this.  I see only "the rules select too many agents".
>
> You tell me your precise rules and I will try to think of code to
> implement them.
>
> Nick
> [email protected]
>
>
> On 22 May 2013 18:16, Miguel Angel Duran Munoz <[email protected]> wrote:
>> Nick, after reducing the sample using your suggestion, I checked the
>> number of agents per period, and that number increases over time. I
>> guess this is because agents that join the sample later and satisfy
>> the requirement of being above the threshold are not excluded. Is
>> there any trick to avoid including them? Thanks again.
>>
>>> Assuming variable names
>>>
>>> agent  period  score
>>>
>>> it seems that you want something like
>>>
>>> bysort agent (period) : gen first3 = _n < 4
>>>
>>> egen max_first3 = max(score / first3), by(agent)
>>>
>>> egen min_rest = min(score / !first3), by(agent)
>>>
>>> keep if max_first3 > 0.9 & min_rest > 0.9
>>>
>>> For the division trick in the -egen- call see e.g.
>>>
>>> http://www.stata.com/statalist/archive/2013-03/msg00917.html
>>>
>>> (reference included therein).
>>>
>>> Nick
>>> [email protected]
>>>
>>>
>>> On 22 May 2013 15:03, Miguel Angel Duran Munoz <[email protected]> wrote:
>>>> Nick, thanks for your help. I hope you can help me with another
>>>> question. For an analysis similar to that of my first message, assume
>>>> I want to keep those agents that have crossed the threshold before a
>>>> certain period and have then stayed above it for the rest of the
>>>> sample period.
>>>>
>>>> To illustrate the idea, consider the following (data refer to
>>>> consecutive periods and the threshold is, e.g., 0.9):
>>>>
>>>> Agent 1: 1    1    1    1    1...
>>>> Agent 2: 0.8  1    1    1    1...
>>>> Agent 3: 0.8  0.8  0.8  1    1...
>>>> Agent 4: 0.8  0.8  0.8  0.8  1...
>>>>
>>>> I want to keep the first three agents because they crossed the
>>>> threshold before period 4 and then stayed above it for the rest of
>>>> the sample period, but I do not want to keep agent 4.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Miguel.
>>>>
>>>>
>>>>
>>>>> Correct on -keep-. Sorry about that.
>>>>>
>>>>> The -sort- order
>>>>>
>>>>> bysort entity (const_a) :
>>>>>
>>>>> ensures that -const_a[1]- is the lowest for each agent, not the first.
>>>>> If the lowest value for each agent is above the threshold, then all
>>>>> the observations for that agent are above.
>>>>> Nick
>>>>> [email protected]
>>>>>
>>>>>
>>>>> On 21 May 2013 23:16, Miguel Angel Duran Munoz <[email protected]> wrote:
>>>>>> Thanks, Nick. I guess you mean -keep- instead of -drop-.
>>>>>> Nevertheless, the command that you suggest would not guarantee that
>>>>>> I keep the agents that have been above the threshold for the whole
>>>>>> sample period (i.e., I would be including agents that were above the
>>>>>> threshold in the first period and then might have been above or
>>>>>> below it).
>>>>>>
>>>>>>> Sounds like
>>>>>>>
>>>>>>> bysort entity (const_a) : drop if const_a[1] > 0.097116
>>>>>>>
>>>>>>> Nick
>>>>>>> [email protected]
>>>>>>>
>>>>>>> On 21 May 2013 23:01, Miguel Angel Duran Munoz <[email protected]> wrote:
>>>>>>>> Hi, Statalisters. I want to focus on agents in my dataset that have
>>>>>>>> a particular feature; specifically, for those agents, and for each
>>>>>>>> and every period (out of 64), the value of a variable (const_a) is
>>>>>>>> larger than a particular threshold (0.097116). I have done what I
>>>>>>>> show below. Nevertheless, I have realized that some of my agents
>>>>>>>> are not in the sample from the first period, so what I am doing
>>>>>>>> would mistakenly eliminate them. Will anyone help me to solve this
>>>>>>>> problem? Thanks in advance.
>>>>>>>>
>>>>>>>> bysort entity (date2): gen obs=_n
>>>>>>>> drop if const_a<0.097116
>>>>>>>> by entity: drop if obs[_N]<64
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
>
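
For reference, the approach discussed at the bottom of the quoted thread (sorting on const_a within entity and testing the lowest value, with -keep- rather than -drop-) can be sketched on toy data as follows. The variable names entity, date2 and const_a and the threshold 0.097116 come from the original question; the data values are made up for illustration only.

```stata
* Made-up data: entity 2 dips below the threshold once,
* entity 3 joins the sample late but is always above it
clear
input entity date2 const_a
1 1 .10
1 2 .11
1 3 .12
2 1 .10
2 2 .05
2 3 .12
3 2 .10
3 3 .11
end

* Within each entity, sort by const_a so that const_a[1] is the lowest value.
* Missing values sort last, so they only reach const_a[1] if every value
* for that entity is missing.
bysort entity (const_a) : keep if const_a[1] > 0.097116

* Restore time order within entity
sort entity date2
```

Entities 1 and 3 remain. Because the test uses each entity's own minimum, an entity that enters the sample late is not penalised for having fewer than 64 observations.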



