Statalist



Re: st: Does Blasnik's Law apply to -use-?


From   "Michael Blasnik" <michael.blasnik@verizon.net>
To   <statalist@hsphsun2.harvard.edu>
Subject   Re: st: Does Blasnik's Law apply to -use-?
Date   Thu, 13 Sep 2007 12:51:41 -0400

These results differ from mine and do not directly address the
question. You compare opening the entire file with opening part of the
file using -in-. But the goal is to select only a subset of
observations. For that, you would need either a second command after
opening the entire file or the -use if _n>xxx & _n<yyy- construct. I
find that the -if- approach takes more time than either using -in- or
simply opening the file.

By the way, you can time individual commands more accurately using
-set rmsg on- rather than simply displaying the time.

M Blasnik
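
A minimal sketch of the comparison described above, not code from either poster. It assumes the dataset name and the 1/1000 range from the quoted message below, uses the documented -use ... using- form for loading a subset, and takes -keep in- as one possible "second command" after a full load:

```stata
* Sketch only: dss_data_05_06 and the 1/1000 range are taken from the
* quoted test; -keep in- is an assumed choice of second command.
set rmsg on                                    // report elapsed time after each command

* 1. Subset with -in- while loading
use in 1/1000 using dss_data_05_06, clear

* 2. Subset with -if- on _n while loading
use if _n >= 1 & _n <= 1000 using dss_data_05_06, clear

* 3. Load everything, then subset with a second command
use dss_data_05_06, clear
keep in 1/1000

set rmsg off
```

Each command is followed by its own return message with the elapsed time, so the three approaches can be compared directly. Running the block more than once may help separate disk-caching effects from the cost of the commands themselves.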

----- Original Message ----- From: "David Elliott" <dcelliott@gmail.com>
To: <statalist@hsphsun2.harvard.edu>
Sent: Thursday, September 13, 2007 12:28 PM
Subject: Re: st: Does Blasnik's Law apply to -use-?



I was alerted offlist by a member that the mailer had truncated my
previous reply in this thread - here it is again:

Having used -parmby- recently and having some understanding of what
Roger is discussing, I'd like to offer the following.

From my interpretation of how Stata stores data, the ability to -use
in ##/##- would require the record indexes to be created by completely
loading the data.  I am currently working on a 4 million record
dataset and was able to run a quick test with a little program:

n di "Begin: " _n c(current_date) " " c(current_time) _n
use dss_data_05_06 in 1/1000, clear
n di "Load using in 1/1000" _n c(current_date) " " c(current_time) _n
use dss_data_05_06, clear
n di "Ordinary load" _n c(current_date) " " c(current_time)

Output:

Begin:
12 Sep 2007 15:02:46

Load using in 1/1000
12 Sep 2007 15:02:56

Ordinary load
12 Sep 2007 15:03:06

I switched the loading order and, regardless, each load took 10
seconds.  I don't think you can use this optimization.

DC Elliott