Re: st: processing time
At 12:55 PM 3/22/2007, Jon Schwabish wrote:
Which is more efficient (in terms of processing time)?
drop if a==.
drop if b==.
  OR
drop if a==. | b==.
I would think that the latter is more efficient, especially with
large datasets. You incur the cost of parsing and executing a command
once, rather than twice (the expression is more complex, but I
don't suppose that matters much). Furthermore, the latter may be
even more efficient if there are many cases with b==. that do
not have a==. The reason is that when you drop observations, there
is, I suppose, a moving of records to close up the holes. With the
two-command method, some records will be moved twice, rather than once.
I suppose it makes little difference for small datasets.
You can also -set rmsg on-, and run some experiments.
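One way such an experiment might look is sketched below. The variable
names a and b come from the question; the dataset size, the 10% rate
of missing values, the seed, and the file name timing_test are all
arbitrary assumptions made for illustration. It also assumes a Stata
version that provides runiform() and rnormal(); older versions would
need uniform() and invnorm(uniform()) instead.

```stata
* Build a hypothetical test dataset (size and missingness are assumptions)
clear
set seed 12345
set obs 1000000
generate a = cond(runiform() < 0.1, ., rnormal())
generate b = cond(runiform() < 0.1, ., rnormal())
save timing_test, replace      // hypothetical file name

* Report the elapsed time after every command
set rmsg on

* Method 1: two separate -drop- commands
use timing_test, clear
drop if a==.
drop if b==.

* Method 2: a single -drop- with a compound condition
use timing_test, clear
drop if a==. | b==.

set rmsg off
```

Compare the times reported after the two -drop- commands of method 1
(added together) with the time reported after the single -drop- of
method 2. Running each method several times, and with different
proportions of missing values, should give a fairer picture than a
single run.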
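Finally, be aware that a==. is not the general way to test for
missing values; it tests for equality with one specific missing
value, the system missing value (.), and will not catch the extended
missing values .a through .z. The way to test for missing values in
general is mi(a), which is short for missing(a), or a>=. for numeric
variables. The mi(a) method is the more general of the two in that it
also works for string types.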
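The difference between these tests can be seen in a small made-up
example. The four data values and the string variable s are
hypothetical, chosen only to include the system missing value and two
extended missing values.

```stata
* Hypothetical data: one nonmissing value, the system missing value,
* and two extended missing values
clear
input a
1
.
.a
.z
end

count if a == .        // 1: matches only the system missing value
count if a >= .        // 3: matches ., .a, and .z
count if missing(a)    // 3: same observations as a >= .

* missing() also accepts string variables; "" counts as missing
generate str1 s = ""
count if missing(s)    // 4: every observation has an empty string
```

The a>=. form relies on Stata ordering all missing values above every
nonmissing number, with . being the smallest of the missing values.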
HTH
--David
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/