
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: insheet and dropping cases


From   "Ben Hoen" <bhoen@lbl.gov>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: insheet and dropping cases
Date   Thu, 20 Feb 2014 15:27:18 -0500

Thanks Phil!  By god, I believe you have solved the mystery!  

This was a recurring problem with the files from the same supplier a few
years ago, and we now have them shipped with the vertical bar (|) as the
delimiter.

Is there no way to sleuth out the occurrences of double quotes and change
them to single quotes or something else?

I just tried: . filefilter IL.txt IL2.txt, from(") to() replace

And got this response:

unmatched quote
r(198);
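
The error most likely arises because a bare double quote inside from() is
parsed as the start of an unterminated string. A minimal sketch of one
workaround, assuming the escape codes listed in -help filefilter- (\Q for a
double quote, \RQ for a right single quote) and assuming, per the note above,
that the file is vertical-bar delimited:

```stata
* Replace every double quote in IL.txt with a single quote; write IL2.txt
filefilter IL.txt IL2.txt, from(\Q) to(\RQ) replace

* Read the cleaned file (the "|" delimiter is an assumption about this file)
insheet using IL2.txt, delimiter("|") clear
```

-filefilter- also leaves the number of replacements in r(occurrences), which
gives a quick count of how many double quotes the file contained.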

Ben

Ben Hoen
LBNL
Office: 845-758-1896
Cell: 718-812-7589


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Phil Schumm
Sent: Thursday, February 20, 2014 3:15 PM
To: Statalist Statalist
Subject: Re: st: insheet and dropping cases

On Feb 20, 2014, at 9:39 AM, Ben Hoen <bhoen@lbl.gov> wrote:
> Below is an example of four records that span such a scenario in that file
> (without the non-printable characters)
>
> ***************************************************************************
> The last line it reads
> ***************************************************************************

<snip>

> N 89 DEG 46'47" E 1308.19 FT TO WATSON RD; N/LY ALONG THE CURVE OF WATSON RD


It's the double-quote character here that is causing the problem; -insheet-
sees it, and continues reading over multiple lines until it finds a matching
close (double) quote or until EOF.  Prior to Stata 13, this would quickly
exhaust the length limit of a string variable, but in Stata 13, that is no
longer a limitation.

IIRC, -insheet- was very limited in how it dealt with double quotes.
Basically, it wanted to see them in pairs enclosing a single text value, but
couldn't handle them otherwise (those who recall differently, please correct
me if I'm wrong).

In Stata 13, -import delimited- is much more flexible WRT how double quotes
are handled, providing the -bindquotes()- and -stripquotes()- options (see
http://www.stata.com/manuals13/dimportdelimited.pdf for more information).
I just tried your example in Stata 13, and it works fine.
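
As a rough sketch of what such a call could look like (the filename IL.txt
and the "|" delimiter are assumptions carried over from earlier in the
thread, not the exact command used for the test above):

```stata
* Requires Stata 13 or later. Treat double quotes as ordinary characters
* rather than as string delimiters, and keep them in the imported values.
import delimited using "IL.txt", delimiters("|") bindquotes(nobind) stripquotes(no) clear
```

With bindquotes(nobind), a stray double quote such as the one in 46'47"
should no longer cause the reader to run on across lines.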

Thus, unless someone knows something undocumented about -insheet- that I
don't (which is certainly possible), I'm afraid your only option is to
upgrade (of course, there are a million other reasons to do so too).


-- Phil


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


