Statalist The Stata Listserver



RE: st: dropping even observations


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: dropping even observations
Date   Fri, 6 Oct 2006 14:03:35 +0100

Interesting question. 

Stata does process in observation order, but 
in effect _n is a pseudovariable 
that is not re-calculated until the -drop- 
is completed. Otherwise the advice given would 
have been incorrect. (I did emphasise that no new 
variable was needed.) 

An even simpler example is 

drop if _n == 24 | _n == 42 

If _n were re-evaluated once observation 24 had 
been dropped, then the second observation -drop-ped 
would be the original observation 43, not 42. 

Although there would be an understandable logic
to what you fear, here Stata behaves as you would 
hope. 
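
The behaviour described above can be checked directly. What follows is only a minimal sketch, assuming the auto dataset shipped with Stata (74 observations); the variable name origobs is made up for this illustration and is needed only for the check, not for the -drop- itself.

```stata
* Sketch: confirm that -drop- evaluates _n against the original order.
* "origobs" is a hypothetical helper variable used only for checking.
sysuse auto, clear
gen long origobs = _n

drop if _n == 24 | _n == 42

* Both targeted observations should be gone ...
count if origobs == 24 | origobs == 42
assert r(N) == 0

* ... and observation 43 should still be present.
count if origobs == 43
assert r(N) == 1

* Exactly two observations were removed from the original 74.
assert _N == 72
```

If _n were renumbered during the -drop-, the second -assert- would fail, because the original observation 43 would have moved into position 42 and been dropped.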

Nick 
[email protected] 

Jeph Herrin wrote:
 
> One related point, not readily apparent to even regular users,
> is whether in
> 
>   drop if [exp]
> 
> expression [exp] is evaluated sequentially for each obs, or
> all at once as a vector. My first thought in reading the
> question was that the user was asking whether
> 
>   drop if !mod(_n,2)
> 
> would reset _n after each dropped observation; that is, after
> dropping _n==2, is the next observation going to be _n==3 or
> _n==2?
> 
> Obviously, it's not a big deal to
> 
>   gen byte even = !mod(_n,2)
>   drop if even
> 
> but it's still worth knowing whether one needs the extra step.

Nick Cox wrote:

> > The following question and reply arose privately. 
> > I have been asked this various times before, so the 
> > discussion should be of some wider interest. 
> > 
> > Nick
> > [email protected] 
> > 
> > Question: Do you have an expression or do file I could 
> > use or adapt to drop even rows (_n = 2,4,6 etc) or 
> > dropping every other row?  
> > 
> > Reply: 
> > 
> > Sure. The remainder on dividing integers 
> > by 2 is either 1 or 0 depending on whether 
> > those integers are odd or even. In Stata 
> > with observation numbers _n this remainder
> > is simply 
> > 
> > mod(_n,2) 
> > 
> > Logical negation ! flips 0 and 1 the 
> > other way round. 
> > 
> > Thus try 
> > 
> > sysuse auto, clear 
> > list mpg if mod(_n,2) 
> > list mpg if !mod(_n,2)
> > 
> > and so forth. 
> > 
> > Note that 
> > 
> > 1. you do not need to create any extra 
> > variables. 
> > 
> > 2. the technique generalises easily
> > to related problems: e.g. every 5th 
> > observation is selected by 
> > 
> > if !mod(_n,5) 
> > 
> > Your query adds support to my longstanding
> > view that there are useful functions that
> > people persistently overlook, although 
> > their usefulness can be blindingly obvious
> > once pointed out. Sooner or later I will
> > write a Stata Journal Tip on -mod()-.
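
As a rough illustration of the quoted reply, the sketch below drops the even-numbered observations and then, after reloading the data, keeps every 5th observation. It assumes the auto dataset (74 observations); origobs is again a hypothetical variable created only so that the result can be verified.

```stata
* Sketch: drop even-numbered observations without a permanent extra variable.
* "origobs" is a hypothetical helper, created here only to verify the result.
sysuse auto, clear
gen long origobs = _n

drop if !mod(_n, 2)

* Only odd-numbered original observations should remain: 37 of 74.
assert mod(origobs, 2) == 1
assert _N == 37

* Same idea for every 5th observation, starting from fresh data.
sysuse auto, clear
gen long origobs = _n

keep if !mod(_n, 5)

* Observations 5, 10, ..., 70 remain: 14 in all.
assert mod(origobs, 5) == 0
assert _N == 14
```

In ordinary use the -gen- and -assert- lines would be omitted; the single -drop- or -keep- line is all that is needed.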



