
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: insheet and dropping cases


From   "Ben Hoen" <bhoen@lbl.gov>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: insheet and dropping cases
Date   Thu, 20 Feb 2014 10:39:16 -0500

OK. I downloaded Notepad++ (free and seems to have a lot of functionality,
not a Stata program of course).

I can see the end-of-line characters on both the last line read and the
line after it, and I can't see anything suspicious, but frankly I am not
sure what I should be looking for.
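
As a starting point for "what to look for", here is a minimal Python sketch that flags two things that often trip up delimited-text readers: lines with an odd number of double-quote characters, and lines containing control characters. Treating these two conditions as suspicious is an assumption about readers in general, not a confirmed diagnosis of what insheet is doing with this file. The function name `flag_suspicious_lines` and the sample lines are illustrative only.

```python
import re

# Control characters other than tab, carriage return, and line feed.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def flag_suspicious_lines(lines):
    """Return (line_number, reasons) for lines that may confuse a delimited-text reader."""
    flagged = []
    for number, line in enumerate(lines, start=1):
        reasons = []
        # An odd count means at least one double quote has no partner on this line.
        if line.count('"') % 2 == 1:
            reasons.append("odd number of double quotes")
        if CONTROL_CHARS.search(line):
            reasons.append("control character")
        if reasons:
            flagged.append((number, reasons))
    return flagged


# Small made-up sample; the second line mimics a legal description with one '"'.
sample = [
    "ID_1|FL|Gadsden|OR 435 P 184||\n",
    "ID_2|FL|Gadsden|N 89 DEG 46'47\" E 1308.19 FT||\n",
]
for number, reasons in flag_suspicious_lines(sample):
    print(number, ", ".join(reasons))
```

To run it on a real file, pass an open file handle instead of `sample`, for example `open("FL.txt", encoding="latin-1", newline="")`; the encoding here is a guess and may need adjusting.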

I ran the same procedure on a few other files. Normally there is a single
last record read (i.e., no records are read after it), but one particular
file (FL.txt) is different: records are read, then some are skipped, and
then reading continues.

Below is an example of four records that span such a scenario in that file
(without the non-printable characters)

***************************************************************************
The last line it reads 
***************************************************************************
FL_FEO_1803|FL|Gadsden|324 WATSON
RD|QUNICY|32351|7138|020701|2006|12039|3092N4W0000002340100|001|3-09-2N-4W-0
000-00234-0100|3092N4W0000002340100||3-09-2N-4W-0000-00234-0100|1152320000||
||0207012006|||04W|02N|09||112908124
0|560|6000||||70||007|||||30586377|08464396Q||324||||WATSON|RD|||QUINCY|FL|3
23517138|R010||SCHELL|LLOYD G|SCHELL|LAURA
E|O|||Y|||||324||||WATSON|RD|||QUINCY|FL|323517138|R010||00000296822|0000016
5322|00000131500|M|M|M|00000158044|00000000000|00000000000|00000296822|00000
165322|00000131500|00000000000|00000000000|00000000000|00000197121|2013|2013
|007|19900101|12736|||000442001481|G|00000000|19960100|00000186000|||1|00000
||Y|00000000000|00000000||||00000|00000000|000000000|||00000000000||||||||00
000000|00000000|00000000000|||00000000000|0000000|0000000|0000000882900|0038
45912||000002716|B|000002716|0002232|0000000|0000000|0002450|0000000|0000000
|1970|1980|00003|00000|0000200|0000200|00002|00000|00000|00000|00000|ACE||RS
0||||BRI||000||||||FA0|00000||||QAV|161|||00100|||000||00000|||||OR 435 P
184, OR 489 P 715 BEGIN AT THE SWC OF THE NW 1/4 OF SECTION 9-2N-4W AND RUN
N 89 DEG 46'47" E 1308.19 FT TO WATSON RD; N/LY ALONG THE CURVE OF WATSON RD
FOR AN ARC SEE TAX ROLL FOR ADDITIONAL LEGAL||

***************************************************************************
Line after the last line it reads - meaning this line is not read nor are
~400 lines after it 
***************************************************************************
FL_FEO_1642|FL|Gadsden|64 E LAKE
LANE|QUINCY|32351|7334|020400|1036|12039|3122N6W0000001230600|001|3-12-2N-6W
-0000-00123-0600|3122N6W0000001230600||3-12-2N-6W-0000-00123-0600|1177120000
||||0204001036|||06W|02N|12||24531223
0|137|0200||Y||10||007|||||30590978|08479148P||64|||E|LAKE|LN|||QUINCY|FL|32
3517334|R001||NUNAMAKER|DOUGLAS
W|NUNAMAKER|THERESA|O|||Y|||||64|||E|LAKE|LN|||QUINCY|FL|323517334|R001||000
00053289|00000038960|00000014329|M|M|M|00000040686|00000000000|00000000000|0
0000053289|00000038960|00000014329|00000000000|00000000000|00000000000|00000
025878|2013|2013|007|19900101|14812|||000336001518|G|00000000|19870300|00000
013800|||1|00000||N|00000000000|00000000||||00000|00000000|000000000|||00000
000000||||||||00000000|00000000|00000000000|||00000000000|0000000|0000000|00
00000097400|000424274||000001848|L|000002525|0001848|0000000|0000000|0002098
|0000000|0000000|1974|1974|00005|00000|0000300|0000300|00003|00000|00000|000
00|00000|ACE||RM0||||PWP||000||||||FA0|00000||||QAV|001|||00100|||000||00000
|||||OR 372 P 1323 COMM AT SEC OF W1/2 OF NE1/4, N 00 DEG 50 MIN 54 SEC W
431. 78 FT, N 47 DEG 30 MIN W 522.2 6 FT, N/LY ALONG CURVE FOR AN ARC DIST
OF 598.28 FT, N 32 SEE TAX ROLL FOR ADDITIONAL LEGAL||

***************************************************************************
Line before the first line it starts reading again - meaning this line is
not read but lines after it are read 
***************************************************************************
FL_FEO_1390|FL|Pasco|13730 PLAINVIEW
RD|ODESSA|33556|4043|031602|1007|12101|1726340000004000160|001|17-26-34-000.
0-004.00-016.0|||17-26-34-000.0-004.00-016.0|||38|P-19|0316021007|||17E|26S|
34||11921964
0|163|01|||R1|10||||7|6||28173897|08258795P||13730||||PLAINVIEW|RD|||ODESSA|
FL|335564043|R005||PIKE|TIMOTHY B|PIKE|BARBARA
K|O|||Y|||||13730||||PLAINVIEW|RD|||ODESSA|FL|335564043|R005|Y|00000184062|0
0000056475|00000127587|M|M|M|00000139420|00000000000|00000000000|00000184062
|00000056475|00000127587|00000000000|00000000000|00000000000|00000177008|201
3|2013|9100|||||||00000000|00000000|00000000000||||00000|||00000000000|00000
000||||00000|00000000|000000000|||00000000000||||||||00000000|00000000|00000
000000|||00000000000|0000000|0000000|0000000012200|000053143|0110|000002104|
L|000003232|0002104|0002104|0000000|0002520|0000000|0000528|1964|1970|00005|
00000|0000200|0000200|00002|00000|00000|00000|00000|ACE||||||001|Y|001|001||
A00||450|FA0|00000|450|||QVV|155|H00||00100|||000||00000|||||COM AT SW COR
OF SE1/4 OF SE1/4 OF SEC TH ALG SOUTH LN OF SEC WEST 50.35 FT FOR POB TH
N00DG 17' 30"W 294.70 FT TH N88DG 03' 00"E 176.70 FT TH S01DG 13' 00"E
300.71 FT TH ALG SOUTH LN OF SEC WEST 176.35 FT TO POB SUBJECT TO EASEMENT
FOR INGRESS & EGR|ESS OVER & ACROSS NORTH 15 FT THEREOF OR 1836 PG 689|

***************************************************************************
This is the line where it starts to read again.  (It skips a few more times
in the file, but I am only showing this occurrence.)
***************************************************************************
FL_FEO_1321|FL|Pasco|17131 WISHINGWELL
LN|SPRINGHILL|34610|4049|031809|1022|12101|1824200000004000090|001|18-24-20-
000.0-004.00-009.0|20    24   18 000000400
0090||18-24-20-000.0-004.00-009.0|||8|T-5|0318091022|||18E|24S|20||59469521
0|163|01|||AR|10||||7|6||28379241|08253216K||17131||||WISHINGWELL|LN|||SPRIN
G HILL|FL|346104049|R144||TAYLOR|MICHAEL
A|||O|||Y|||MM||17131||||WISHINGWELL|LN|||SPRING
HILL|FL|346104049|R144||00000448514|00000057285|00000391229|M|M|M|0000044851
4|00000000000|00000000000|00000448514|00000057285|00000391229|00000000000|00
000000000|00000000000|00000696748|2013|2013|4200|20040127|00097||00000000908
5|005696001300|G|20040116|20040116|00000057000||BROWN PASCO P & SANDRA
S|1|04220|CAPSTONE TITLE SVCS
LLC|Y|00000000000|00000000||||00000|00000000|000000000|||00000000000|||19100
103|21901||000492000619|G|00000000|19690000|00000000000|||00000000000|000000
0|0000000|0000000050000|000217800|0110|000005363|L|000008245|0005363|0005363
|0000000|0006365|0000000|0001426|2009|2009|00000|00000|0000400|0000400|00003
|00001|00000|00000|00000|ACE||||||CBS||000|||||450|FA0|00000|450|Y|001|QAV|0
15|H00||00100|||000||00000|||||W1/2 OF SE1/4 OF NW1/4 OF SW1/4:SUBJ TO ROAD
R/W OVER SOUTH 20 FT OR 5696 PG 1300-1301||

The only striking thing I notice is that the line it does not read (#186) is
3015 characters long.  Is there a line-length limit in insheet, say 3000
characters?  There ARE other lines longer than 3000 characters in the file,
but this one is the first of that size.
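
One way to test the line-length idea is to tabulate the length and field count of every line and see whether the skipped lines stand out. The Python sketch below does that for a pipe-delimited file. The 3000-character threshold is only the guess made above, not a documented insheet limit, and the function names `line_stats` and `unusual_lines` are illustrative.

```python
from collections import Counter


def line_stats(lines, delimiter="|"):
    """Return (line_number, length, field_count) per line, ignoring the line ending."""
    stats = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        stats.append((number, len(text), text.count(delimiter) + 1))
    return stats


def unusual_lines(stats, max_length=3000):
    """Return rows longer than max_length or whose field count is not the most common one."""
    if not stats:
        return []
    # Treat the most frequent field count as the expected record layout.
    expected = Counter(fields for _, _, fields in stats).most_common(1)[0][0]
    return [row for row in stats if row[1] > max_length or row[2] != expected]


# Small made-up sample: the third line is missing a field.
sample = ["a|b|c\n", "d|e|f\n", "g|h\n"]
for number, length, fields in unusual_lines(line_stats(sample)):
    print(number, length, fields)
```

If lines longer than the threshold turn up among the records that were read, a simple length limit becomes a less likely explanation.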

Any ideas?

I am about ready to punt, as I have a workable workaround, but it WOULD be
nice to know what is happening.  

As always, thanks in advance.

Ben

Ben Hoen
LBNL
Office: 845-758-1896
Cell: 718-812-7589


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Phil Schumm
Sent: Thursday, February 20, 2014 9:38 AM
To: Statalist Statalist
Subject: Re: st: insheet and dropping cases

On Feb 20, 2014, at 8:28 AM, Ben Hoen <bhoen@lbl.gov> wrote:
> Hexdump I had never used.  This is what it returned:

<snip>

> Do you see anything suspicious here?  (I replaced all the commas with "_",
> using filefilter - another great suggestion - wondering if that was causing
> any issues and insheet still returned 184 observations.)


I don't see anything obvious -- you'll need to look at the file directly.
Is Stata reading the first 184 observations, or are the 184 observations
from different places in the file?  Check that first, and if you are getting
the first 184 observations, then look at lines 184-6 (depending on whether
the file has a header line).  Something has to be going on there.
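
The check described above (first 184 records versus records scattered through the file) can be sketched in Python as follows. It assumes each record has a unique identifier, such as the first pipe-delimited field, and that the identifiers Stata actually read have been exported to a list. The function name `skipped_runs` and the sample identifiers are hypothetical.

```python
def skipped_runs(file_ids, read_ids):
    """Return (first_position, last_position, count) for each run of
    consecutive records in file_ids that do not appear in read_ids."""
    read = set(read_ids)
    runs = []
    start = None
    for position, record_id in enumerate(file_ids, start=1):
        if record_id not in read:
            if start is None:
                start = position  # a new run of skipped records begins here
        elif start is not None:
            runs.append((start, position - 1, position - start))
            start = None
    if start is not None:
        # The file ends while records are still being skipped.
        runs.append((start, len(file_ids), len(file_ids) - start + 1))
    return runs


# Made-up identifiers: records 3-4 and record 6 were not read.
file_ids = ["id1", "id2", "id3", "id4", "id5", "id6"]
read_ids = ["id1", "id2", "id5"]
print(skipped_runs(file_ids, read_ids))
```

A single run that extends to the end of the file would match the "last record read" pattern; several separate runs would match the read, skip, read pattern. Positions count data records, so add one if the file has a header line.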


-- Phil


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


