From: Nick Cox <njcoxstata@gmail.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Partitioning a main file into several with a specified number of variables
Date: Sun, 7 Apr 2013 23:35:26 +0100
You can get a varlist from outside a dataset by using -describe using-. Note particularly its -varlist- option. (A sketch putting this to work is given after the quoted message below.)

Nick
njcoxstata@gmail.com

On 7 April 2013 21:32, Nuno Soares <liststata@gmail.com> wrote:
> Hi everyone,
>
> I'm having a problem with some files that are taking a long time for
> Stata to process. These are based on the import of csv files and have
> about 5000 variables (labelled as v1-v5000). Don't worry, these are
> not actual values, but only the way a given database provides data,
> which I then have to treat in Stata.
>
> The treatment procedure is working fine, but Stata has some problems
> dealing with such a large number of variables. I noticed that, if I
> only have about 1000 variables, it takes Stata about one hour to
> process each file. However, if the 5000 variables are used, it just
> hangs up or takes almost 12 hours to do the same stuff.
>
> So, to speed up the process, the solution is to break the main files
> into files with 1000 variables (or fewer). The problem is that I don't
> know how to write code in Stata that does this. If the files always
> had the 5000 variables, I would just drop/keep the variables as:
>
> preserve
> keep v1 v2-v1000
> save
> restore
>
> preserve
> keep v1 v1001-v2000
> save
> restore
>
> and so on (v1 must always be kept).
>
> The problem is with files that have more or fewer than 5000
> variables, and I cannot know how many without opening each file.
>
> Does anyone know a way to automate this?
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
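Putting -describe using- to work on the splitting task, a minimal sketch follows. It is written under stated assumptions rather than as Nick's own code: the source file is called mydata.dta and the parts are saved as part1.dta, part2.dta, ... (both placeholder names), v1 is assumed to be the first variable in the dataset, and each part holds at most 1000 variables including v1.

* Minimal sketch: split mydata.dta into parts of at most 1000 variables,
* keeping v1 in every part. File names are placeholders.
describe using mydata.dta, varlist        // read variable names without loading the data
local allvars `r(varlist)'                // full variable list; v1 assumed first
local nvars : word count `allvars'

local part 1
local start 2                             // word 1 is v1, which goes into every part
while `start' <= `nvars' {
    local stop = min(`start' + 998, `nvars')   // up to 999 variables + v1 = 1000
    local chunk
    forvalues i = `start'/`stop' {
        local chunk `chunk' `: word `i' of `allvars''
    }
    use v1 `chunk' using mydata.dta, clear     // load only the wanted variables
    save part`part'.dta, replace
    local ++part
    local start = `stop' + 1
}

Because -use varlist using- reads only the named variables straight from disk, the full 5000-variable dataset never has to be held in memory at once, which is the point of the exercise.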