Re: st: Observations that keep a feature... an additional problem


From   Nick Cox <njcoxstata@gmail.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: Observations that keep a feature... an additional problem
Date   Wed, 22 May 2013 18:21:28 +0100

I can't follow this.  I see only "the rules select too many agents".

You tell me your precise rules and I will try to think of code to
implement them.

Nick
njcoxstata@gmail.com


On 22 May 2013 18:16, Miguel Angel Duran Munoz <maduran@uma.es> wrote:
> Nick, after reducing the sample using your suggestion, I have checked the
> number of agents per period, and that number increases over time. I guess
> this is because agents that join the sample later and are above the
> threshold are not excluded. Is there any trick to avoid including them?
> Thanks again.
>
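One possible reading of the problem described above, offered only as a sketch: `_n < 4` marks each agent's own first three observations, so an agent that enters the sample late is still judged on its first three records. If the intended window is calendar periods 1 to 3 instead, the flag could be built from `period` directly. The toy data, the variable names `early` and `max_early`, and the decision to drop agents with no observations on either side of the cutoff are all assumptions, not rules stated in the thread.

```stata
* Toy data (hypothetical): agent 2 joins the sample only in period 4
clear
input agent period score
1 1 1
1 2 1
1 3 1
1 4 1
1 5 1
2 4 1
2 5 1
3 1 0.8
3 2 0.8
3 3 0.8
3 4 0.8
3 5 1
end

* Flag calendar periods 1 to 3, not each agent's first three observations
gen byte early = period < 4

* Division by zero yields missing, which -egen- ignores
egen max_early = max(score / early), by(agent)
egen min_rest = min(score / !early), by(agent)

* Missing counts as larger than any number in Stata, so test for it explicitly
* (assumption: agents with no data on one side of the cutoff are dropped)
keep if max_early > 0.9 & max_early < . & min_rest > 0.9 & min_rest < .

* Only agent 1 remains
levelsof agent
```

With the original `_n < 4` flag and no test for missing values, agent 2 would be kept, which is consistent with the growing number of agents per period reported above.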
>> Assuming variable names
>>
>> agent  period  score
>>
>> it seems that you want something like
>>
>> bysort agent (period) : gen first3 = _n < 4
>>
>> egen max_first3 = max(score / first3), by(agent)
>>
>> egen min_rest = min(score / !first3), by(agent)
>>
>> keep if max_first3 > 0.9 & min_rest > 0.9
>>
>> For the division trick in the -egen- call see e.g.
>>
>> http://www.stata.com/statalist/archive/2013-03/msg00917.html
>>
>> (reference included therein).
>>
>> Nick
>> njcoxstata@gmail.com
>>
>>
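To see what the division trick does, the sketch below runs the suggested code on the four example agents given later in the thread, assuming exactly five periods per agent (the original rows end in "..."). Dividing by `first3` leaves `score` unchanged where `first3` is 1 and produces missing where it is 0; `egen`'s `max()` and `min()` ignore missing values, so each statistic is taken over the intended subset only.

```stata
* The four example agents from the thread, truncated to five periods (assumption)
clear
input agent period score
1 1 1
1 2 1
1 3 1
1 4 1
1 5 1
2 1 0.8
2 2 1
2 3 1
2 4 1
2 5 1
3 1 0.8
3 2 0.8
3 3 0.8
3 4 1
3 5 1
4 1 0.8
4 2 0.8
4 3 0.8
4 4 0.8
4 5 1
end

* 1 for each agent's first three observations, 0 otherwise
bysort agent (period) : gen first3 = _n < 4

* score / 0 is missing, so max() sees only the first three observations
egen max_first3 = max(score / first3), by(agent)

* !first3 is 1 from the fourth observation on, so min() sees only the rest
egen min_rest = min(score / !first3), by(agent)

keep if max_first3 > 0.9 & min_rest > 0.9

* Agents 1 and 2 remain
levelsof agent
```

On these values the rule keeps agents 1 and 2 only. Agent 3 first exceeds 0.9 in period 4, so it would be kept only if the window were widened, for instance to `_n < 5`. Note also that an agent with three or fewer observations has a missing `min_rest`, and Stata evaluates missing `> 0.9` as true.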
>> On 22 May 2013 15:03, Miguel Angel Duran Munoz <maduran@uma.es> wrote:
>>> Nick, thanks for your help. I hope you can help me with another question.
>>> For a similar analysis to that of my first message, assume I want to keep
>>> those agents that have exceeded the threshold before a certain period
>>> and then have stayed above it for the rest of the sample period.
>>>
>>> To illustrate the idea, consider the following (data refer to
>>> consecutive periods and the threshold is, eg, 0.9):
>>>
>>> Agent 1: 1    1    1    1    1...
>>> Agent 2: 0.8  1    1    1    1...
>>> Agent 3: 0.8  0.8  0.8  1    1...
>>> Agent 4: 0.8  0.8  0.8  0.8  1...
>>>
>>> I want to keep the first three agents because they have exceeded the
>>> threshold before period 4 and then have been over the threshold in the
>>> rest of the sample period, but I do not want to keep agent 4.
>>>
>>> Thanks in advance.
>>>
>>> Miguel.
>>>
>>>
>>>
>>>> Correct on -keep-. Sorry about that.
>>>>
>>>> The -sort- order
>>>>
>>>> bysort entity (const_a) :
>>>>
>>>> ensures that -const_a[1]- is the lowest for each agent, not the first.
>>>> If the lowest value for each agent is above the threshold, then all
>>>> the observations for that agent are above.
>>>> Nick
>>>> njcoxstata@gmail.com
>>>>
>>>>
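A minimal sketch of the sort-order argument above, using -keep- as corrected in the thread and the threshold from the original question. The toy values are hypothetical; entity 3 has no period-1 observation, which is the case the original question worried about.

```stata
* Toy data (hypothetical values, not the poster's dataset)
clear
input entity date2 const_a
1 1 0.10
1 2 0.12
1 3 0.11
2 1 0.10
2 2 0.05
2 3 0.12
3 2 0.11
3 3 0.10
end

* Within each entity, sort by const_a so that const_a[1] is the lowest value
bysort entity (const_a) : keep if const_a[1] > 0.097116

* Entities 1 and 3 remain; entity 2 dips to 0.05 in one period and is dropped
sort entity date2
list, sepby(entity)
```

Because the test uses only each entity's lowest observed value, it does not matter in which period an entity enters the sample. Missing values sort last in Stata, so they never become `const_a[1]` unless every value for an entity is missing.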
>>>> On 21 May 2013 23:16, Miguel Angel Duran Munoz <maduran@uma.es> wrote:
>>>>> Thanks, Nick. I guess you mean -keep- instead of -drop-. Nevertheless,
>>>>> the command that you suggest would not guarantee that I keep the agents
>>>>> that have been above the threshold for the whole sample period (ie, I
>>>>> would be including agents that were above the threshold in the first
>>>>> period and then might have been above or below it).
>>>>>
>>>>>> Sounds like
>>>>>>
>>>>>> bysort entity (const_a) : drop if const_a[1] > 0.097116
>>>>>>
>>>>>> Nick
>>>>>> njcoxstata@gmail.com
>>>>>>
>>>>>> On 21 May 2013 23:01, Miguel Angel Duran Munoz <maduran@uma.es> wrote:
>>>>>>> Hi, Statalisters. I want to focus on agents in my dataset that have a
>>>>>>> particular feature; specifically, for those agents, and for each and
>>>>>>> every period (out of 64), the value of a variable (const_a) is larger
>>>>>>> than a particular threshold (0.097116). I have done what I show below.
>>>>>>> Nevertheless, I have realized that some of my agents are not in the
>>>>>>> sample from the first period, so what I am doing would mistakenly
>>>>>>> eliminate them. Can anyone help to solve this problem? Thanks in advance.
>>>>>>>
>>>>>>> bysort entity (date2): gen obs=_n
>>>>>>> drop if const_a<0.097116
>>>>>>> by entity: drop if obs[_N]<64