Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: Conditional infile statements,

From	Gordon Hughes <[email protected]>
To	[email protected]
Subject	Re: st: Conditional infile statements,
Date	Thu, 24 Nov 2011 10:10:39 +0000

I would like to thank those who responded to my original posting with suggestions for reading data with multiple line formats. My slowness in replying does not reflect lack of appreciation but the exigencies of other business.

I have experimented with the various suggestions and have eventually returned to the option of reading each line into a string variable and extracting the variables of interest from the string via sub-string operations. I described this as "clumsy", though "inelegant" might have been a better word. On the other hand, I found that a brute force approach gave me a better understanding of what I was doing, so that it was easier to correct mistakes. Further, this method is pretty much what Stata would do if it permitted conditional infile statements, though Stata would process the dataset line by line rather than reading the full dataset and then processing each line. The one disadvantage is that the initial read step may require a lot of memory if the maximum line length is large (my data has ~50 million records), so I had to break up the data into manageable chunks.
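
As an illustration of the string-based approach described above, here is a minimal sketch in Python rather than Stata. It assumes the first character of each line identifies the record type and that fields sit at fixed column positions. The record types, field names, column positions, chunk size, and sample lines are all hypothetical and are not taken from the actual data.

```python
from itertools import islice

# Hypothetical fixed-width layouts: the record type is in column 1, and
# each (field name, start, end) slice differs by record type.
LAYOUTS = {
    "A": [("id", 1, 6), ("year", 6, 10)],
    "B": [("id", 1, 6), ("amount", 6, 14)],
}


def parse_line(line):
    """Pick the layout from the first character and slice out the fields."""
    layout = LAYOUTS.get(line[:1])
    if layout is None:
        return None  # unknown record type: skip rather than guess
    record = {"rectype": line[0]}
    for name, start, end in layout:
        record[name] = line[start:end].strip()
    return record


def read_in_chunks(lines, chunk_size):
    """Yield lists of parsed records, reading at most chunk_size lines at a time."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield [rec for rec in map(parse_line, chunk) if rec is not None]


# Made-up sample lines standing in for a file with mixed line formats.
sample = [
    "A000012011",
    "B00001  123.45",
    "X ignored",
    "A000022010",
]
chunks = list(read_in_chunks(sample, chunk_size=2))
```

Passing an open file object instead of the sample list would keep only one chunk of raw lines in memory at a time, which is the point of splitting a very large file into manageable pieces.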

Finally, may I offer best wishes for a happy Thanksgiving holiday to both StataCorp and other readers in the US.


Gordon Hughes

[email protected]

