I would like to thank those who responded to my original posting with
suggestions for reading data with multiple line formats. My slowness
in replying does not reflect lack of appreciation but the exigencies
of other business.
I have experimented with the various suggestions and have eventually
returned to the option of reading each line into a string variable and
extracting the variables of interest from the string via sub-string
operations. I described this as "clumsy", though "inelegant" might
have been a better word. On the other hand, I found that a brute
force approach gave me a better understanding of what I was doing, so
that it was easier to correct mistakes. Further, this method is
pretty much what Stata would do if it permitted conditional infile
statements, though it would process the dataset line by line rather
than reading the full dataset and then processing each line. The one
disadvantage is that the initial read step may require a lot of
memory if the maximum line length is large and there are many
records (my data have ~50 million), so I had to break up the data
into manageable chunks.
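
For anyone wanting to try the same approach, here is a minimal sketch of
the string-and-substring method. It is an illustration only, not my actual
code: the file name sample.txt, the two record types ("H" and "D"), the
column positions, and the variable names are all assumptions made up for
the example.

```stata
* Sketch only: file name, record layout, and variable names are assumptions.
* Write a three-line sample file so that the example runs on its own.
tempname fh
file open `fh' using "sample.txt", write text replace
file write `fh' "H00000001Smith" _n
file write `fh' "D00000001   12.50" _n
file write `fh' "D00000001    7.25" _n
file close `fh'

* Read every line whole into a single string variable.
* For a large file, an -in- range (for example, in 1/1000000)
* can be added before the comma to read one chunk at a time.
infix str line 1-80 using "sample.txt", clear

* Column 1 holds the record type; the remaining layout depends on it.
generate str1 rectype = substr(line, 1, 1)
generate long id = real(substr(line, 2, 8))
generate str20 name = trim(substr(line, 10, 20)) if rectype == "H"
generate double amount = real(trim(substr(line, 10, 8))) if rectype == "D"

drop line
list
```

Each generate statement applies only to the record type it belongs to, so
a mistake in one layout is easy to trace to a single line. The width given
to infix (80 here) has to cover the longest line in the file, which is
what drives the memory requirement.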
Finally, may I offer best wishes for a happy Thanksgiving holiday to
both StataCorp and other readers in the US.