Re: st: -save- a varlist

From   daniel klein
Subject   Re: st: -save- a varlist
Date   Tue, 30 Oct 2012 15:24:19 +0100


Note that you may -use- a subset of the dataset by specifying only a few
variables (-help use-). This should be a lot faster than -use-ing the
whole dataset, backing it up with -preserve- several times, and
-keep-ing only some variables, just to reload the entire dataset
(correct me if I am wrong here).

Instead, try looping through the variable lists you want to keep in the
smaller files, and load only those variables. The code could look
something like

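loc i 0
foreach vlist in "foo1-foo42" "bar1-bar42" {
    * load only the variables in this list from the large file
    u `vlist' using huge_file, clear
    * save them as subset1.dta, subset2.dta, ...
    sa subset`++i'.dta
}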

To get a list of all variables in the dataset without loading it, see
-help describe-.
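
As a rough sketch of that last point, assuming the file is still called huge_file as in the loop above: -describe using- with the -varlist- option reads only the file header and leaves the variable names in r(varlist). The local macro names allvars and first are only illustrative.

```stata
* Read the variable names from the file without loading the data
describe using huge_file, varlist

* r(varlist) now holds every variable name in huge_file
local allvars `r(varlist)'
display "`allvars'"

* Quick check (illustrative): load only the first variable in the list
local first : word 1 of `allvars'
use `first' using huge_file, clear
```

The names in `allvars' can then be used to build the variable lists passed to -use ... using-.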

I have a very large dataset (20 GB) that I must access remotely, so that
-use- and -save- each take about 30 minutes. When I have this file open,
I would like to create some secondary files that contain only 1-3 variables.

If the file were smaller, I would typically use -preserve-, -keep-,
-save-, -restore- to do this. However, this takes a couple of hours. So
I am using

. export excel var1 var2 var3 using file.xls, replace

Then later using -import excel- to read them back in. This is much
faster, but has the obvious drawback that variable labels and other
attributes are lost. It is also aesthetically unsatisfying.

Can anyone suggest an alternative?
