st: Partitioning a main file into several with a specified number of variables

From   Nuno Soares
Subject   st: Partitioning a main file into several with a specified number of variables
Date   Sun, 7 Apr 2013 21:32:16 +0100

Hi everyone,

I'm having a problem with some files that are taking a long time for
Stata to process. These are based on imported csv files and have
about 5000 variables (named v1 - v5000). Don't worry, these are
not actual values, but only the way a given database provides data,
which I then have to treat in Stata.

The treatment procedure is working fine, but Stata has some problems
dealing with such a large number of variables. I noticed that, if I
only have about 1000 variables, it takes Stata about one hour to
process each file. However, if all 5000 variables are used, it either
hangs or takes almost 12 hours to do the same work.

So, to speed up the process, the solution is to break the main files
into files with 1000 variables (or fewer). The problem is that I don't
know how to write Stata code that does this. If the files always had
5000 variables, I would just drop/keep the variables as:

keep v1 v2-v1000

keep v1 v1001-v2000

and so on (v1 must always be kept)

The problem is with those files that have more or fewer than 5000
variables, which I cannot know without opening each file.

Does anyone know a way to automate this?

Best wishes,
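
One possible way to automate the split described above, offered only as a minimal sketch and not as a tested solution: it assumes the file is already loaded in memory, that the first variable in the dataset (v1 here) is the one to keep in every piece, and that each piece should hold that variable plus at most 999 others. The output names part1, part2, ... and the chunk size are hypothetical choices. Because the variable list is read from the data itself, the number of variables does not need to be known in advance.

```stata
* Sketch: split the dataset in memory into files of at most 1000 variables.
* Assumption: the first variable is the key (v1) and goes into every file.
unab allvars : _all                       // expand the full variable list
local first : word 1 of `allvars'         // the key variable
local rest  : list allvars - first        // every other variable
local n     : word count `rest'
local chunk 999                           // 999 others + key = 1000 per file
local part  0

forvalues i = 1(`chunk')`n' {
    local ++part
    local last = min(`i' + `chunk' - 1, `n')

    * Collect the names of variables i through last
    local these
    forvalues j = `i'/`last' {
        local w : word `j' of `rest'
        local these `these' `w'
    }

    preserve
    keep `first' `these'
    save "part`part'", replace            // hypothetical output file name
    restore
}
```

With 5000 variables this would write six files (five with 1000 variables and one with 5); with fewer or more variables the number of files adjusts accordingly. Note that preserve and restore copy the data on each pass, which may itself be slow for very large files.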
