
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: counting the cases for which the obervations of two different variables are equal
Date: Wed, 26 Mar 2008 20:29:51 -0000

Expressions of the form a == b == c are not necessarily illegal, but they won't reliably do what you want here. You want a == b == c to mean (a == b) & (b == c), but to Stata == is a binary operator and this interpretation does not hold. Otherwise put, two == within a trio of arguments don't define a ternary operator.

Variables aside, the main issue can be seen by considering

    . di 1 == 1 == 1
    1

    . di 0 == 0 == 0
    0

The first behaves as you are hoping, but not the second. Why? Let's guess that Stata evaluates left to right here. Even if that's the wrong way round, the examples will come out the same. Then 1 == 1 == 1 is treated as (1 == 1) == 1, which is 1 == 1, which is 1. But 0 == 0 == 0 is treated as (0 == 0) == 0, which is 1 == 0, which is 0.

There are various ways forward that I can suggest.

One is that you have to spell out all compound true and false statements in terms of atomic binary comparisons using (e.g.) & as well as ==.

Another is to use some quite different approach. For example, consistency of x within groups of y is explored by

    . bysort y (x) : gen same_x = x[1] == x[_N]

Another is to tag duplicates using -duplicates- and then look at the others.

There is more at

How do I list observations in a group that differ on a variable?
http://www.stata.com/support/faqs/data/diff.html

How do I compute the number of distinct observations?
http://www.stata.com/support/faqs/data/distinct.html

and also various other data management FAQs.

Nick
n.j.cox@durham.ac.uk

minimus

I would like to count the number of cases for which the observations in two different variables are equal. That is: suppose you have a panel dataset where the variable 'respondent number' repeats itself for some years (because the same person is observed for several years), and therefore the variable 'sex' repeats itself too. So if the respondent number column reads, for two observations,

    1011
    1011

then the corresponding sex column reads

    male
    male
OK, now I would like to check whether the 'sex' variable is consistent in the data. To do this I use the following command:

    count if respnr[_n] == respnr[_n+1] & aa001[_n] == aa001[_n+1]

and it returns a number, which is fine. This command returns the number of cases where a respondent is observed for any two years and the sex of the respondent was the same for those two years.

Now I also ask the same question for 3 consecutive years and use the following command:

    count if respnr[_n] == respnr[_n+1] == respnr[_n+2] & aa001[_n] == aa001[_n+1] == aa001[_n+2]

Although I can see in the data browser that this condition holds for many observations, Stata returns 0 cases. That is, although I can identify cases where an individual is observed for three years and his sex is male for those years, Stata does not see that and returns 0. Why?
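Applied to the variables in the question, the two suggestions in the reply might look like the sketch below. It is only an illustration under some assumptions: the data hold one row per respondent and year, `year` is a hypothetical variable name (the post does not give one), and `sex_differs` and `tagged` are names made up here. The `///` line continuation means it should be run from a do-file rather than typed interactively.

```stata
* Assumption: one row per respondent-year; -year- is a hypothetical name.
sort respnr year

* Spell out each pairwise comparison and join the pieces with &.
count if respnr == respnr[_n+1] & respnr == respnr[_n+2] ///
    & aa001 == aa001[_n+1] & aa001 == aa001[_n+2]

* A different approach: flag respondents whose sex is not constant.
* After sorting on aa001 within respnr, the first and last values differ
* exactly when more than one distinct value occurs
* (a missing value counts as a distinct value here).
bysort respnr (aa001) : gen byte sex_differs = aa001[1] != aa001[_N]

* Count each flagged respondent once rather than once per year.
egen byte tagged = tag(respnr)
count if tagged & sex_differs
```

The first count answers the question as posed (three consecutive rows with the same respondent and the same sex). The second is usually closer to the underlying aim, since it reports how many respondents have any inconsistency in `aa001`, however many years they are observed.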

**References**: **st: counting the cases for which the obervations of two different variables are equal** *From:* minimus <t24680@hotmail.com>
