Nick, thanks for your help. I will try to be clearer.

There is no fallacy in your logical argument, but that is not the problem. In addition, what I am showing is a simplified version of the relevant part of my dataset (the whole dataset has 178,410 observations and about 40 variables), just to illustrate what I mean.

These are the commands I am using. In this simplified version:

-codebook id if mean_var1 != 11- counts both agents (408 in my dataset).

-codebook id if mean_var1 != 11 & (var1 == 10/var2 | var1 == 11/var2)- counts 1 agent (id1) (397 in my dataset).

But -codebook id if mean_var1 != 11 & !(var1 == 10/var2 | var1 == 11/var2)- also counts both agents.

The reason is that -(var1 == 10/var2 | var1 == 11/var2)- focuses on any value of var1 equal to 10 or 11 if var2 == 1 (i.e., if startdate == date). Nevertheless, -!(var1 == 10/var2 | var1 == 11/var2)- refers to any observation where var1 and var2 are not equal to 10 or 11, regardless of the value of var2. Therefore, observations 1, 2 and 4 for id1 and 5-8 for id2 are taken into account, i.e., both agents are counted.

What I want to count is agents (i) whose mean_var1 is not equal to 11 and (ii) that have no observation on the startdate (e.g., for id2, startdate = 192, but there is no observation for that date). Please note that the latter requirement is not a missing value when startdate == date, but the absence of any observation for that date.

obs  id  startdate  date  var1  var2  mean_var1
  1   1        189   187    10     .      10.75
  2   1        189   188    11     .      10.75
  3   1        189   189    11     1      10.75
  4   1        189   190    11     .      10.75
  5   2        192   189    10     .      10.5
  6   2        192   190    10     .      10.5
  7   2        192   191    11     .      10.5
  8   2        192   193    11     .      10.5

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
Sent: Thursday, 16 January 2014 12:31
To: [email protected]
Subject: Re: st: Counting firms in a panel dataset

Sorry, but I am lost here.
Clearly I don't have your data and you don't even show your code, nor do I understand in what sense any code you used doesn't work.

As I understand it, you want to identify the 11 observations that appear when 408 are selected but do not appear when 397 are selected. I am waving general logic at you, namely that the complement of A & B in A is A & !B, and you don't identify a fallacy in that.

What are you showing us? It's not 11 observations.

Nick
[email protected]

On 16 January 2014 11:10, Miguel A. Duran <[email protected]> wrote:
> Yes, Nick, I tried something quite similar, and I have just tried what
> you propose. If I am not mistaken the reason why it doesn't work is
> because -!(var1 == 10/var2 | var1 == 11/var2)- includes observations
> 1, 2 and 4 for id1 and all observations of id2. Therefore, both agents
> are taken into account under -codebook id if...-
>
> obs  id  startdate  date  var1  var2  mean_var1
>   1   1        189   187    10     .      10.75
>   2   1        189   188    11     .      10.75
>   3   1        189   189    11     1      10.75
>   4   1        189   190    11     .      10.75
>   5   2        192   189    10     .      10.5
>   6   2        192   190    10     .      10.5
>   7   2        192   191    11     .      10.5
>   8   2        192   193    11     .      10.5
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
> Sent: Thursday, 16 January 2014 11:51
> To: [email protected]
> Subject: Re: st: Counting firms in a panel dataset
>
> Did you try it? As I understand it, the complement of
>
> A & B
>
> in A is
>
> A & !B
>
> Nick
> [email protected]
>
> On 16 January 2014 10:36, Miguel A. Duran <[email protected]> wrote:
>> Thanks, Nick, for your answer. I thought of something similar to what
>> you propose, but if I am not mistaken it has a problem: I would be
>> counting both id1 and id2, i.e., I would get again 408 (what I get
>> just using -codebook id if mean_var1 != 11-).
>>
>> id  startdate  date  var1  var2  mean_var1
>>  1        189   187    10     .      10.75
>>  1        189   188    11     .      10.75
>>  1        189   189    11     1      10.75
>>  1        189   190    11     .      10.75
>>  2        192   189    10     .      10.5
>>  2        192   190    10     .      10.5
>>  2        192   191    11     .      10.5
>>  2        192   193    11     .      10.5
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
>> Sent: Wednesday, 15 January 2014 20:28
>> To: [email protected]
>> Subject: Re: st: Counting firms in a panel dataset
>>
>> I'd look at data that satisfy
>>
>> if mean_var1 != 11 & !(var1 == 10/var2 | var1 == 11/var2)
>>
>> i.e. negating the second condition. Note that if -var1- and -var2-
>> are both missing, then the second condition
>>
>> (var1 == 10/var2 | var1 == 11/var2)
>>
>> reduces to
>>
>> . == .
>>
>> which is always true.
>> Nick
>> [email protected]
>>
>> On 15 January 2014 19:18, Miguel A. Duran <[email protected]> wrote:
>>> Hi, Statalisters. I am using -codebook- to count the number of agents
>>> in a panel dataset under different criteria. Under one criterion I get
>>> 408 agents and under another one I get 397. I have an intuition
>>> about the cause of this difference and I would like to check it out,
>>> but I do not know how to do it.
>>> To help make my point clear, (the relevant part of) my dataset looks
>>> similar to this:
>>>
>>> id  startdate  date  var1  var2  mean_var1
>>>  1        189   187    10     .      10.75
>>>  1        189   188    11     .      10.75
>>>  1        189   189    11     1      10.75
>>>  1        189   190    11     .      10.75
>>>  2        192   189    10     .      10.5
>>>  2        192   190    10     .      10.5
>>>  2        192   191    11     .      10.5
>>>  2        192   193    11     .      10.5
>>>
>>> Using the command -codebook id if mean_var1 != 11- I get 408 agents,
>>> but using the command -codebook id if mean_var1 != 11 & (var1 ==
>>> 10/var2 | var1 == 11/var2)- I get 397 agents. My intuition is that
>>> this happens because there are agents (like agent 2) that do not
>>> have the observation corresponding to the startdate. If I am right,
>>> adding this requirement to the command -codebook id if mean_var1 !=
>>> 11- should count 11 agents, but I do not know how to include that
>>> requirement. Will anyone please help with this?
>>> Thanks in advance.
>>>
>>> Miguel.
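
For reference, the count asked for in this thread (agents whose mean_var1 is not equal to 11 and who have no observation with date equal to startdate) could be approached by flagging, for each id, whether any observation falls on the startdate, and then tagging one observation per remaining agent. What follows is only a minimal sketch, not a solution checked against the full dataset: it assumes the variable names from the example data, and -hasstart- and -tag- are hypothetical helper variables introduced purely for illustration.

```stata
* Sketch only: variable names follow the example data in this thread;
* hasstart and tag are hypothetical helper variables.
clear
input id startdate date var1 var2 mean_var1
1 189 187 10 . 10.75
1 189 188 11 . 10.75
1 189 189 11 1 10.75
1 189 190 11 . 10.75
2 192 189 10 . 10.5
2 192 190 10 . 10.5
2 192 191 11 . 10.5
2 192 193 11 . 10.5
end

* hasstart is 1 on every observation of an agent that has at least
* one row with date == startdate, and 0 otherwise
bysort id: egen hasstart = max(date == startdate)

* tag exactly one observation per agent among those with
* mean_var1 != 11 and no observation on the startdate;
* observations excluded by -if- are set to 0
egen tag = tag(id) if mean_var1 != 11 & !hasstart

* number of such agents, and which ones they are
count if tag
list id if tag
```

With the example data only id 2 should be tagged, since id 1 has an observation on date 189, its startdate. Working at the agent level in this way avoids the problem discussed above, where negating an observation-level condition still picks up other observations of the same agent.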

