Re: st: -save- a varlist

From   daniel klein
Subject   Re: st: -save- a varlist
Date   Tue, 30 Oct 2012 15:24:19 +0100


Note that you can -use- a subset of the dataset by specifying only a few
variables (see -help use-). This should be a lot faster than -use-ing the
whole dataset, backing it up with -preserve- several times, -keep-ing
only some variables, and then reloading the entire dataset with
-restore- (correct me if I am wrong here).

Instead, try looping through the variable lists you want to keep in the
smaller files, and load only those variables. The code could look
something like

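loc i 0
foreach vlist in "foo1-foo42" "bar1-bar42" {
    * load only the variables in this list from the file on disk
    u `vlist' using huge_file, clear
    * save them as subset1.dta, subset2.dta, ...
    sa subset`++i'.dta
}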

To get a list of all variables in the dataset without loading it, see
-help describe-.
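
As a rough sketch of that last point: -describe using- with the varlist option should leave the variable names in r(varlist) without loading any data. The example below is only an illustration under assumptions. It uses the auto dataset shipped with Stata, saved to a temporary file, as a stand-in for the huge file, and the macro names (big, allvars, v1, v2) are made up for the example.

```stata
* Stand-in for the large file (assumption: auto.dta plays the role of huge_file)
sysuse auto, clear
tempfile big
save `big'
clear

* Read the variable names from the file on disk without loading the data
describe using `big', varlist
local allvars `r(varlist)'
display "`allvars'"

* Pick the first two names from the list and load only those variables
local v1 : word 1 of `allvars'
local v2 : word 2 of `allvars'
use `v1' `v2' using `big', clear
describe
```

With the real file, the names in r(varlist) could be used to build the variable lists for the loop above.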

I have a very large dataset (20 GB) that I must access remotely, so that
-use- and -save- each take about 30 minutes. When I have this file open,
I would like to create some secondary files that contain only 1-3 variables.

If the file were smaller, I would typically use -preserve-, -keep-,
-save-, -restore- to do this. However, this takes a couple of hours. So
I am using

. export excel var1 var2 var3 using file.xls, replace

and later use -import excel- to read them back in. This is much
faster, but has the obvious drawback that variable labels and other
attributes are lost. It is also aesthetically unsatisfying.

Can anyone suggest an alternative?
