



Re: st: A problem while dealing with massive amount of data


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: A problem while dealing with massive amount of data
Date   Tue, 28 Jun 2011 10:00:30 +0100

Regardless of operating system, -filefilter- could be used to
standardise on one or other variable name.

Nick

On Tue, Jun 28, 2011 at 9:45 AM, Neil Shephard <nshephard@gmail.com> wrote:
> On 28 June 2011 09:19, Mayank Mishra <mayankm16@gmail.com> wrote:

>> I have around two thousand .csv files in a folder which I need to clean
>> and save as Stata .dta files. For this I am running a loop in which
>> the -insheet- command reads a file, which then gets cleaned and saved.
>> There is a variable named "option_typ" which is used twice in the loop
>> while cleaning. The problem is that in some files this variable is named
>> "optiontype". For those files, the do-file gives an error and the loop
>> stops, as it cannot find a variable named "option_typ". What makes it
>> worse is that I don't know which files have a different variable name
>> from the one used in the do-file. So, please tell me what I can do in
>> this situation.
>
> You don't state which operating system you're working on, but if you're on
> a *NIX-based system you could easily use 'grep' to search all your
> files and list just the files that match (using the '-l' switch) or
> those that don't match (using the '-L' switch), for example...
>
> $ grep -l 'option_typ' *.csv > files_matching_option_typ.txt
> $ grep -L 'option_typ' *.csv > files_not_matching_option_typ.txt
>
> ...will give you two files, whose names should be self-explanatory.
> You can then use these lists to loop over specific files appropriately
> depending on their contents.
>
> If you're not on a *NIX system you could achieve this under M$-Windows
> by installing the UNIX-like shell Cygwin (see http://x.cygwin.com/).
>
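The two suggestions above can be combined. What follows is only a rough sketch, not anything posted in the thread: it takes the list of non-matching files produced by 'grep -L' and writes corrected copies in which "optiontype" on the header line becomes "option_typ". The function name fix_listed_files and the output folder "fixed" are made up for illustration, and the sketch assumes the variable name occurs only in the first line of each .csv file.

```shell
#!/bin/sh
# fix_listed_files LIST OUT_DIR   (hypothetical helper, not from the thread)
# LIST is a text file with one .csv path per line, for example the
# output of:  grep -L 'option_typ' *.csv > files_not_matching_option_typ.txt
# Corrected copies go to OUT_DIR; the original files are left untouched.
# Assumption: the variable name appears only in the header (line 1).
fix_listed_files() {
    list=$1
    out_dir=$2
    mkdir -p "$out_dir"
    while IFS= read -r f; do
        [ -f "$f" ] || continue     # skip blank lines and missing files
        # rename optiontype to option_typ on the first line only
        sed '1s/optiontype/option_typ/' "$f" > "$out_dir/$(basename "$f")"
    done < "$list"
}

# Example call (commented out so that sourcing this file changes nothing):
#   fix_listed_files files_not_matching_option_typ.txt fixed
```

The copies in "fixed" could then be read by the existing -insheet- loop in place of the originals, while the files that already use "option_typ" are processed as before. Writing copies rather than editing in place keeps the raw data available in case the header assumption turns out to be wrong for some files.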

