
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: New package -ifwins- "if wins!" on SSC to subset data by "if" exp first then "in" range
Date: Wed, 18 Mar 2009 18:40:25 -0000

Dan was kind enough to let me see an earlier version of this privately. I encouraged him to post to get feedback. I also expressed some scepticism about his project.

I don't want to deny that occasionally -if- and -in- don't work in the way some users want, but I don't see the solution as trying to subvert the way they work. Dan's problem is that the behaviour of -if- and -in- is so deep down that no user can do more than write a wrapper like this to change their behaviour temporarily -- and a very good thing too. Just imagine the extraordinary threads likely if the behaviour of -if- or -in- were tuneable.

The Catch-22 of -ifwins- is this. With some effort, Dan can get some commands to behave the way he wants them to, and he is careful that this never changes (including messes up) your dataset. But of course the rest of Stata, including anything that might change your data, is unaffected. (Otherwise put, any changes you make under -ifwins- are not permanent.)

Positively, this gives some flexibility to those who want it. Negatively, two ways that -if- and -in- behave is one more than some people will want, especially if they have to keep changing their view. (I'd want to keep all learners under my wing firmly ignorant of -ifwins-.)

As has often been pointed out, intellectual skill grows according to what you can do without thinking much about it. With some experience -if- and -in- just become intuitive so that you are rarely really surprised by what they do. The experienced Stata user just knows that

. list if foreign == 1 in 1/10

does not necessarily mean "show me the first 10 foreign cars" -- or, at worst, if they temporarily forget, the same experienced Stata user can quickly think of several ways to get that output.

In fact, although Dan does not mention it,

. browse if foreign == 1

is a pretty direct solution to his leading example problem. It does not have exactly the same consequence, but you can look at what you want and then close the window.
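As an illustration of one of the "several ways" mentioned above (a sketch, not something taken from the thread): flag the first 10 foreign cars with a running sum and list on the flag. The variable name first10 is made up for this example, and the code assumes the auto data are in their shipped sort order.

```stata
sysuse auto, clear

* sum() is a running sum, so sum(foreign == 1) counts the foreign cars
* seen so far in the current sort order.
* first10 is a hypothetical name chosen only for this sketch.
generate byte first10 = foreign == 1 & sum(foreign == 1) <= 10

* A plain -if- now does the job; no -in- is needed.
list make turn weight if first10

* Remove the helper variable so the dataset is left as it was.
drop first10
```

Unlike -preserve- and -restore-, this adds a variable temporarily, so it suits interactive work where a scratch variable is no great burden.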
In fact, to many users browsing like that is likely to seem much more intuitive than learning -ifwins-. A price of any language is that even some simple things may take a few lines -- the only way to avoid this is a language with thousands and thousands of commands that would be unattractive and unlearnable.

Even StataCorp has learnt this the hard way. Long-time users will remember the old -for- from a few versions back. In essence, -forvalues- and -foreach-, although they typically imply longer code than did -for-, give users the flexibility they really need without the extraordinary bugs and misunderstandings that bit -for- users. (Note to those who joined in Stata 9 or 10: this -for- was nothing to do with Mata's -for-, and not like it.)

By the way, I think it does no harm to think that in Stata -in- subsets the dataset before -if-, but it's wrong in principle. -if- and -in- are orthogonal. What you get with both is an intersection of sets, possibly empty, and as with intersections there is no sense in which the intersection of A and B is _in principle_ a matter of identifying one set, say A, before another, say B. This is a fine distinction, but I think it's the correct one.

Nick
n.j.cox@durham.ac.uk

Dan Blanchette

Thanks to Kit Baum, a new package -ifwins- is now available for download on SSC.

Description

-ifwins- is a prefix command that runs most any Stata command that does not modify the dataset in memory (that is, not commands such as generate or replace). -ifwins- will have "if" subset the dataset before "in" subsets the dataset. This is the opposite of what happens when both "if" and "in" are used in the same Stata command. For example, the following code will first subset the dataset to the first 10 observations and then subset the dataset to the specified condition:

. sysuse auto
. list if foreign == 1 in 1/10

Since the auto.dta dataset is sorted by the variable foreign, the above code will not list any observations, because foreign == 0 in each of the first 10 observations.
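The empty result, and the intersection reading of -if- and -in- together, can be checked directly with -count-. This is only a sketch for illustration; it assumes auto.dta is in its shipped sort order (sorted by foreign).

```stata
sysuse auto, clear

* Observations satisfying the -if- condition
count if foreign == 1

* Observations falling in the -in- range
count in 1/10

* Observations satisfying both: the intersection of the two sets
count if foreign == 1 in 1/10
```

With the shipped data the three counts are 22, 10 and 0: both sets are non-empty, but no observation belongs to both.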
So, "if" loses and "in" wins when "battling" over which one subsets the dataset before the other one does. If you want to run a Stata command on a certain number of observations when a certain condition exists, you would have to:

. preserve
. keep if foreign == 1
. list make turn weight in 1/10
. restore

or use -ifwins- as a prefix to the desired Stata command:

. ifwins if foreign == 1 in 1/10 : list make turn weight

The above will list the first 10 observations for which the variable foreign is equal to 1 (one). So now "if" wins!...but "in" is still helpful.

To install -ifwins-:

. ssc install ifwins

Let me know if you have any questions.

Thanks to Roy Wada for helping me better document -ifwins-.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

References:
st: New package -ifwins- "if wins!" on SSC to subset data by "if" exp first then "in" range
From: Dan Blanchette <dan.blanchette@duke.edu>
