
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: Counting firms in a panel dataset


From   "Miguel A. Duran" <[email protected]>
To   <[email protected]>
Subject   RE: st: Counting firms in a panel dataset
Date   Thu, 16 Jan 2014 13:10:02 +0100

Nick, thanks for your help. I will try to be clearer. There is no fallacy in
your logical argument, but that is not the problem. Also, what I am
showing is a simplified version of the relevant part of my dataset (the
whole dataset has 178,410 observations and about 40 variables), just to
illustrate what I mean.
These are the commands I am using, with the results they give in this
simplified version:
-codebook id if mean_var1 != 11- counts both agents (408 in my dataset).
-codebook id if mean_var1 != 11 & (var1 == 10/var2 | var1 == 11/var2)-
counts 1 agent (id1) (397 in my dataset).
But -codebook id if mean_var1 != 11 & !(var1 == 10/var2 | var1 == 11/var2)-
also counts both agents. The reason is that -(var1 == 10/var2 | var1 ==
11/var2)- is true only for observations where var1 is 10 or 11 and
var2 == 1 (i.e., startdate == date). Its negation, -!(var1 == 10/var2 |
var1 == 11/var2)-, is therefore true for every observation where var2 is
missing, whatever the value of var1 is (10 or 11). So observations 1, 2 and 4
for id1 and 5-8 for id2 are selected, i.e., both agents are counted.
What I want to count is agents (i) whose mean_var1 is not equal to 11 and
(ii) that have no observation on the startdate (e.g., for id2, startdate
= 192, but there is no observation for that date). Please note that the
latter requirement is not that there is a missing value when
startdate == date, but that there is no observation at all.

obs  id  startdate  date  var1  var2  mean_var1
  1   1        189   187    10     .      10.75
  2   1        189   188    11     .      10.75
  3   1        189   189    11     1      10.75
  4   1        189   190    11     .      10.75
  5   2        192   189    10     .       10.5
  6   2        192   190    10     .       10.5
  7   2        192   191    11     .       10.5
  8   2        192   193    11     .       10.5
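
One way to express requirement (ii) is to work at the agent level rather than negating an observation-level condition. What follows is a minimal sketch on the eight example observations, not a tested solution for the full dataset. The variable names has_start and tag are hypothetical, and the sketch assumes that "no observation on the startdate" means no row with date == startdate for that id.

```stata
clear
input id startdate date var1 var2 mean_var1
1 189 187 10 . 10.75
1 189 188 11 . 10.75
1 189 189 11 1 10.75
1 189 190 11 . 10.75
2 192 189 10 . 10.5
2 192 190 10 . 10.5
2 192 191 11 . 10.5
2 192 193 11 . 10.5
end

* has_start (hypothetical name) is 1 on every observation of an agent that
* has a row with date == startdate, and 0 on every observation otherwise
egen byte has_start = max(date == startdate), by(id)

* Tag exactly one observation per agent so that each agent is counted once
egen byte tag = tag(id)

* Agents whose mean_var1 is not 11 and that lack an observation on startdate
count if tag & mean_var1 != 11 & !has_start
list id startdate if tag & mean_var1 != 11 & !has_start
```

On the example data this should select id 2 only, because id 1 has an observation with date 189. Building the flag once per agent avoids the problem that a negated observation-level condition is true for most rows of every agent.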

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On behalf of Nick Cox
Sent: Thursday, 16 January 2014 12:31
To: [email protected]
Subject: Re: st: Counting firms in a panel dataset

Sorry, but I am lost here. Clearly I don't have your data and you don't even
show your code, nor do I understand in what sense any code you used doesn't
work.

As I understand it, you want to identify the 11 observations that appear
when 408 are selected but do not appear when 397 are selected.
I am waving general logic at you, namely that

the complement of A & B in A is A & !B

and you don't identify a fallacy in that.

 What are you showing us? It's not 11 observations.
Nick
[email protected]
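
The identity above holds observation by observation, whereas -codebook id if ...- reports the number of distinct ids among the selected observations, so the same agent can be counted in both A & B and A & !B. A minimal sketch on the eight example observations illustrates the difference; the variable names B, tagB and tagNotB are hypothetical and not from the thread.

```stata
clear
input id startdate date var1 var2 mean_var1
1 189 187 10 . 10.75
1 189 188 11 . 10.75
1 189 189 11 1 10.75
1 189 190 11 . 10.75
2 192 189 10 . 10.5
2 192 190 10 . 10.5
2 192 191 11 . 10.5
2 192 193 11 . 10.5
end

* When var2 is missing, 10/var2 and 11/var2 are missing, so a nonmissing
* var1 cannot equal them and B is 0 on those observations
generate byte B = (var1 == 10/var2 | var1 == 11/var2)

* Observation level: A & B and A & !B are disjoint and together make up A
count if mean_var1 != 11 & B
count if mean_var1 != 11 & !B

* Agent level: tag one observation per id within each subset
egen byte tagB    = tag(id) if mean_var1 != 11 & B
egen byte tagNotB = tag(id) if mean_var1 != 11 & !B
count if tagB == 1
count if tagNotB == 1
```

On the example data the observation-level counts should be 1 and 7, and the agent-level counts 1 and 2: id 1 has observations in both subsets, so the two agent counts do not sum to the number of agents in A.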


On 16 January 2014 11:10, Miguel A. Duran <[email protected]> wrote:
> Yes, Nick, I tried something quite similar, and I have just tried what
> you propose. If I am not mistaken, the reason why it doesn't work is
> that -!(var1 == 10/var2 | var1 == 11/var2)- includes observations 1, 2
> and 4 for id1 and all observations of id2. Therefore, both agents are
> taken into account under -codebook id if...-
>
> obs  id  startdate  date  var1  var2  mean_var1
>   1   1        189   187    10     .      10.75
>   2   1        189   188    11     .      10.75
>   3   1        189   189    11     1      10.75
>   4   1        189   190    11     .      10.75
>   5   2        192   189    10     .       10.5
>   6   2        192   190    10     .       10.5
>   7   2        192   191    11     .       10.5
>   8   2        192   193    11     .       10.5
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On behalf of Nick Cox
> Sent: Thursday, 16 January 2014 11:51
> To: [email protected]
> Subject: Re: st: Counting firms in a panel dataset
>
> Did you try it? As I understand it, the complement of
>
> A & B
>
> in A is
>
> A & !B
>
> Nick
> [email protected]
>
>
> On 16 January 2014 10:36, Miguel A. Duran <[email protected]> wrote:
>> Thanks, Nick, for your answer. I thought of something similar to what
>> you propose, but if I am not mistaken it has a problem: I would be
>> counting both id1 and id2, i.e., I would again get 408 (what I get just
>> using -codebook id if mean_var1 != 11-).
>>
>> id  startdate  date  var1  var2  mean_var1
>>  1        189   187    10     .      10.75
>>  1        189   188    11     .      10.75
>>  1        189   189    11     1      10.75
>>  1        189   190    11     .      10.75
>>  2        192   189    10     .       10.5
>>  2        192   190    10     .       10.5
>>  2        192   191    11     .       10.5
>>  2        192   193    11     .       10.5
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On behalf of Nick Cox
>> Sent: Wednesday, 15 January 2014 20:28
>> To: [email protected]
>> Subject: Re: st: Counting firms in a panel dataset
>>
>> I'd look at data that satisfy
>>
>> if mean_var1 != 11 & !(var1 == 10/var2 | var1 == 11/var2)
>>
>> i.e. negating the second condition. Note that if -var1- and -var2- 
>> are both missing, then the second condition
>>
>> (var1 == 10/var2 | var1 == 11/var2)
>>
>> reduces to
>>
>> . == .
>>
>> which is always true.
>> Nick
>> [email protected]
>>
>>
>> On 15 January 2014 19:18, Miguel A. Duran <[email protected]> wrote:
>>> Hi, Statalisters. I am using -codebook- to count the number of agents
>>> in a panel dataset under different criteria. Under one criterion I get
>>> 408 agents and under another I get 397. I have an intuition
>>> about the cause of this difference and I would like to check it,
>>> but I do not know how to do it.
>>> To help make my point clear, (the relevant part of) my dataset looks
>>> similar to this:
>>>
>>> id  startdate  date  var1  var2  mean_var1
>>>  1        189   187    10     .      10.75
>>>  1        189   188    11     .      10.75
>>>  1        189   189    11     1      10.75
>>>  1        189   190    11     .      10.75
>>>  2        192   189    10     .       10.5
>>>  2        192   190    10     .       10.5
>>>  2        192   191    11     .       10.5
>>>  2        192   193    11     .       10.5
>>>
>>> Using the command -codebook id if mean_var1 != 11- I get 408 agents,
>>> but using the command -codebook id if mean_var1 != 11 & (var1 ==
>>> 10/var2 | var1 == 11/var2)- I get 397 agents. My intuition is that
>>> this happens because there are agents (like agent 2) that do not
>>> have the observation corresponding to the startdate. If I am right,
>>> adding this requirement to the command -codebook id if mean_var1 !=
>>> 11- should count 11 agents, but I do not know how to include that
>>> requirement.
>>> Will anyone please help with this?
>>> Thanks in advance.
>>>
>>> Miguel.
>>>
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/

