Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

Re: st: A problem while dealing with massive amount of data

From   Neil Shephard <>
Subject   Re: st: A problem while dealing with massive amount of data
Date   Tue, 28 Jun 2011 09:45:26 +0100

On 28 June 2011 09:19, Mayank Mishra <> wrote:
> Hello all,
> I have around two thousand .csv files in a folder which I need to clean
> and save as Stata .dta files. For this I am running a loop in which the
> -insheet- command reads a file, which then gets cleaned and saved.
> There is a variable named "option_typ" which is used twice in the loop
> while cleaning. The problem is that in some files this variable is named
> "optiontype". For those files, the do-file gives an error and the loop
> stops, as it cannot find a variable named "option_typ". What makes it
> worse is that I don't know which files have a different variable name
> from the one used in the do-file. So, please tell me what I can do in
> this situation.

You don't state which operating system you're working on, but if you're on
a *NIX-based system you could easily use 'grep' to search all your
files and list just the files that match (using the '-l' switch) or
those that don't match (using the '-L' switch), for example...

$ grep -l 'option_typ' *.csv > files_matching_option_typ.txt
$ grep -L 'option_typ' *.csv > files_not_matching_option_typ.txt

...will give you two files, whose names should be self-explanatory.
You can then use these lists to loop over specific files appropriately
depending on their contents.
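
One caveat with searching whole files is that 'option_typ' could in principle also appear in a data row. A minimal sketch of a stricter variant, assuming the variable names sit on the first line of each .csv file, is given below. The function name classify_csv_headers is hypothetical, and the output file names simply reuse the ones above.

```shell
#!/bin/sh
# Sketch only: classify .csv files by whether their header row (line 1)
# contains 'option_typ'. Assumes variable names are on the first line.
# classify_csv_headers is an illustrative name, not an existing tool.

classify_csv_headers() {
    dir=$1
    # Start with empty list files (truncate any previous results).
    : > "$dir/files_matching_option_typ.txt"
    : > "$dir/files_not_matching_option_typ.txt"
    for f in "$dir"/*.csv; do
        [ -e "$f" ] || continue   # skip if the glob matched nothing
        # Look only at the header line, not at the data rows.
        if head -n 1 "$f" | grep -q 'option_typ'; then
            printf '%s\n' "$f" >> "$dir/files_matching_option_typ.txt"
        else
            printf '%s\n' "$f" >> "$dir/files_not_matching_option_typ.txt"
        fi
    done
}
```

Calling it as `classify_csv_headers .` from the folder holding the .csv files would write the two lists there. Note that the pattern would also match a header named 'option_type', so it may need tightening if such names occur.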

If you're not on a *NIX system you could achieve this under M$-Windows
by installing the UNIX-like shell Cygwin (see


“Truth in science can be defined as the working hypothesis best suited
to open the way to the next better one.” - Konrad Lorenz
