Re: st: A problem while dealing with massive amount of data

From   Nick Cox
Subject   Re: st: A problem while dealing with massive amount of data
Date   Tue, 28 Jun 2011 10:00:30 +0100

Regardless of operating system, -filefilter- could be used to
standardise on one or other variable name.
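
A minimal sketch of how that might look, assuming the raw .csv files sit in the current working directory and that a subfolder called "fixed" (a hypothetical name) already exists to receive the filtered copies. Note that -filefilter- is case-sensitive and replaces every occurrence of the pattern in the file, not only the one in the header line, so this is safe only if "optiontype" never appears in the data themselves.

```stata
* Sketch only: "fixed" is a hypothetical, pre-existing output folder.
* List every .csv file in the current directory.
local files : dir "." files "*.csv"

foreach f of local files {
    * Copy each file into fixed/, changing "optiontype" to "option_typ".
    * Files that do not contain the pattern are copied unchanged.
    filefilter "`f'" "fixed/`f'", from("optiontype") to("option_typ") replace
}
```

The cleaning loop could then -insheet- the copies in "fixed", where the variable should always arrive as option_typ.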


On Tue, Jun 28, 2011 at 9:45 AM, Neil Shephard wrote:
> On 28 June 2011 09:19, Mayank Mishra wrote:

>> I have around two thousand .csv files in a folder which I need to clean
>> and save as Stata .dta files. For this I am running a loop in which the
>> -insheet- command reads in a file, which then gets cleaned and saved.
>> There is a variable named "option_typ" which is used twice in the loop
>> while cleaning. The problem is that in some files this variable is named
>> "optiontype". For those files, the do-file gives an error and the loop
>> stops, as it cannot find a variable named "option_typ". What makes it
>> worse is that I don't know which files have a different variable name
>> from the one used in the do-file. So, please tell me what I can do in
>> this situation.
> You don't state which operating system you're working on, but if you're on
> a *NIX-based system you could easily use 'grep' to search all your
> files and tell you just which files match (using the '-l' switch) or
> those that don't match (using the '-L' switch), for example...
> $ grep -l 'option_typ' *.csv > files_matching_option_typ.txt
> $ grep -L 'option_typ' *.csv > files_not_matching_option_typ.txt
> ...will give you two files, whose names should be self-explanatory.
> You can then use these lists to loop over specific files appropriately
> depending on their contents.
> If you're not on a *NIX system you could achieve this under M$-Windows
> by installing the UNIX-like shell Cygwin (see