



Re: st: Conditional infile statements,


From   Gordon Hughes <G.A.Hughes@ed.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Conditional infile statements,
Date   Thu, 24 Nov 2011 10:10:39 +0000

I would like to thank those who responded to my original posting with suggestions for reading data with multiple line formats. My slowness in replying does not reflect lack of appreciation but the exigencies of other business.

I have experimented with the various suggestions and have eventually returned to the option of reading each line into a string variable and extracting the variables of interest from the string via sub-string operations. I described this as "clumsy", though "inelegant" might have been a better word. On the other hand, I found that a brute force approach gave me a better understanding of what I was doing, so that it was easier to correct mistakes. Further, this method is pretty much what Stata would do if it permitted conditional infile statements, except that Stata would process the dataset line by line rather than reading the full dataset and then processing each line. The one disadvantage is that the initial read step may require a lot of memory if the maximum line length is large (my data has ~50 million records), so I had to break up the data into manageable chunks.
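
For readers who want to see the shape of this approach, here is a minimal sketch in Python rather than Stata. It is only an illustration under stated assumptions: the record-type codes ("H", "P"), the column positions, and the function names are all hypothetical and are not taken from my actual data. Each raw line is held as a string, the fields are cut out by position according to the record type, and the input is consumed in chunks to limit memory use.

```python
from itertools import islice

# Hypothetical layouts (assumption): the first character is a record-type
# code, and each field occupies fixed columns given as (start, end) offsets.
LAYOUTS = {
    "H": {"household": (1, 6), "region": (6, 8)},
    "P": {"household": (1, 6), "person": (6, 8), "age": (8, 11)},
}

def parse_line(line):
    """Extract fields from one raw line according to its record type."""
    layout = LAYOUTS.get(line[:1])
    if layout is None:
        return None  # unknown record type: skip rather than guess
    record = {"type": line[0]}
    for name, (start, end) in layout.items():
        record[name] = line[start:end].strip()
    return record

def read_in_chunks(lines, chunk_size):
    """Yield lists of parsed records, reading at most chunk_size raw lines at a time."""
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            break
        parsed = [parse_line(raw.rstrip("\n")) for raw in chunk]
        yield [rec for rec in parsed if rec is not None]

# Made-up sample lines; a real run would pass an open file object instead.
sample = [
    "H0000107\n",
    "P0000101 34\n",
    "P0000102  8\n",
    "X garbage\n",
]
chunks = list(read_in_chunks(sample, chunk_size=2))
```

Because `read_in_chunks` accepts any iterable of lines, an open file can be passed directly, so only one chunk of raw strings is held in memory at a time.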

Finally, may I offer best wishes for a happy Thanksgiving holiday to both StataCorp and other readers in the US.

Gordon Hughes
g.a.hughes@ed.ac.uk

