
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: What's the added value of having -in- subset the data before -if- does?
Date: Wed, 4 Feb 2009 18:02:33 -0000

I've never used SAS and in any case would bow to your understanding of it. But I think your opening is the wrong way to think about how Stata works.

In Stata -if- and -in- are orthogonal; there is no logical sense in which one has priority over the other. (Whatever happens precisely at the implementation level, it doesn't bite. Conversely, if it did bite, the priority would need to be documented.) It's just as if you ask for the intersection of two sets A and B; the answer does not depend on which set you look at first. Thus -in 1/10- refers to absolute observation numbers regardless of what values are in them, and -if foreign == 1- refers to values, regardless of what observations they are in. (Clearly, both sets could be null, even the first, if you had no observations, not to say their intersection could be null.)

But that said, it's clear what you want. Flippantly put, you want -in 1/10- to mean the first 10 you care about (as otherwise specified) to occur in the data. Fair enough. As your examples show, one way to achieve that is to sort them to the front; then you can pick them off. A better way is to keep track of where the observations are, as is achieved by your use of -sum()-.

Although the trickery with -sum()- is clever, I think I'd always find it faster to use -edit if foreign == 1- and look at the first whatever (or the last whatever).

I once tried mimicking -in- in a program without using it directly, and found it trickier than I wanted, because of the need to support f, F, l, L and negative observation numbers. I forget the details but they were not as easy as I hoped.

Nick
n.j.cox@durham.ac.uk

Dan Blanchette

Have you ever wanted to list a selection of observations based on a condition but only list, say, a subset of 10 obs of that condition? If so, perhaps you've been frustrated with the fact that:

    . sysuse auto
    . list if foreign == 1 in 1/10

lists no observations because in the first 52 observations foreign == 0.

The -in- subsets the data before the -if- condition subsets the data. SAS does the opposite:

    /* WHERE subsets the data before OBS subsets the data */
    PROC PRINT DATA= SASHELP.SHOES(WHERE=(STORES < 10) OBS = 10);
    RUN;

So, the above code lists the first 10 observations where (STORES < 10).

I can't think of any situation where I would want to know how many times a certain condition exists in the first X observations. Do others ever need to know that?

I figured out a solution where Stata will subset the data to the condition and then only list the range of observations I'm interested in:

    . list if sum((foreign == 1)) <= 10

The "(foreign == 1)" inside the sum() creates a value equal to 1 when the condition is true and then sum() creates a running sum of that. You can use the sum() function to subset your data for other Stata commands. You could get a range of observations as well:

    . list if inrange(sum((foreign == 1)),2,11)

I may decide always to use this since:

    . list if sum((foreign == 1)) <= 100

will also work despite the fact there aren't 100 observations in the data. I'll never again get the error message:

    Obs. nos. out of range
    r(198);

My previous solution was to:

    preserve
    keep if foreign == 1
    local nobs = 10
    if _N < `nobs' local nobs = _N
    list in 1/`nobs'
    restore
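A minimal sketch of the two points made in this exchange, assuming the auto dataset shipped with Stata (52 domestic cars followed by 22 foreign ones). The variable name seq is hypothetical and is not part of either message. The first part checks that -if- and -in- select the same observations in either order; the second stores the running count in a variable and then picks off the first 10 matches. Note that a running sum is 0 for every observation before the first match, so the sketch also repeats the condition itself to keep non-matching observations out of the selection.

```stata
sysuse auto, clear

* -if- and -in- act as an intersection: the order in which they
* are written does not change which observations are selected
count if foreign == 1 in 1/10
local n_if_first = r(N)
count in 1/10 if foreign == 1
local n_in_first = r(N)
assert `n_if_first' == `n_in_first'

* running count of matches so far (seq is a hypothetical name)
generate long seq = sum(foreign == 1)

* first 10 foreign cars; repeating the condition excludes the
* domestic cars, for which seq is still 0
list make foreign if foreign == 1 & seq <= 10
count if foreign == 1 & seq <= 10
assert r(N) == 10

* an upper bound beyond the number of matches raises no error
count if foreign == 1 & seq <= 100
assert r(N) == 22
```

Storing the count in a variable, rather than calling sum() inside the -if- qualifier, is only a design choice for the sketch: it makes the positions easy to inspect and reuse across commands.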

References:
st: What's the added value of having -in- subset the data before -if- does?
From: Dan Blanchette <dan.blanchette@duke.edu>
