st: Problems reading in comma separated files


From   Orian Brook <ob11@st-andrews.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Problems reading in comma separated files
Date   Fri, 10 Jun 2011 15:01:26 +0100

Hi there

I have over 100 comma-separated data files that I need to amalgamate.
However, I am having a problem reading them into Stata and Stat/Transfer.

Using Stat/Transfer on one file as an example (they all have the same
problem), the correct columns are read in, but only one row (the column
headings).

In Stata 10/SE using insheet, the file is correctly identified as having 20
variables, but the column headings are not read, and if you browse the data
it shows all cells as blank except the first, which reads "ÿþt". Moreover, it
identifies about 23K rows, whereas the file in fact has about 11K. The
hexdump output is below:

  Line-end characters                        Line length (tab=1)
    \r\n         (Windows)              0      minimum                        1
    \r by itself (Mac)             11,741      maximum                      405
    \n by itself (Unix)            11,741
  Space/separator characters                 Number of lines             23,483
    [blank]                        58,688      EOL at EOF?                   no
    [tab]                               0
    [comma] (,)                   223,079    Length of first 5 lines
  Control characters                           Line 1                       405
    binary 0                    1,317,072      Line 2                         1
    CTL excl. \r, \n, \t                0      Line 3                       240
    DEL                                 0      Line 4                         1
    Extended (128-159,255)              1      Line 5                       240
  ASCII printable
    A-Z                           118,268
    a-z                            98,746    File format                 BINARY
    0-9                           724,323
    Special (!@#$ etc.)            70,486
    Extended (160-254)                  1
                          ---------------
  Total                         2,634,146

  Observed were:
     \0 \n \r blank , - . / 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O
     P Q R S T U V W X Y Z _ a b c d e f g h i k l m n o p r s t u v x y þ
     255

It seems to have the right number of \r characters, so I don't know why it
is finding double that number of rows. I also suspect that the extended
character is causing problems. Any ideas as to how to solve the problem? My
only other solution is to read all the files into Access, which doesn't have
a problem reading them.
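
One possible reading of the hexdump figures above, offered as an assumption
rather than a confirmed diagnosis: binary 0 accounts for almost exactly half
of the 2,634,146 bytes, and the two extended characters (ÿ and þ, bytes 255
and 254) match a UTF-16 little-endian byte-order mark. In that encoding each
\r and \n is followed by a zero byte, so the two are never adjacent and would
be counted as separate line ends, which would roughly double the row count.
A minimal Python sketch of re-encoding such a file follows; the function name
and file names are hypothetical, and it assumes every file starts with a
UTF-16 byte-order mark.

```python
import codecs
from pathlib import Path


def convert_utf16_csv(src, dst, out_encoding="utf-8"):
    """Rewrite a UTF-16 text file (with BOM) in a one-byte-per-ASCII-char encoding.

    Hypothetical helper: returns the number of \n characters written.
    """
    raw = Path(src).read_bytes()
    # Assumption: the file begins with a UTF-16 byte-order mark (FF FE or FE FF).
    if not raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        raise ValueError(f"{src}: no UTF-16 byte-order mark found")
    # The "utf-16" codec uses the BOM to pick the byte order and strips it.
    text = raw.decode("utf-16")
    # newline="" writes the original line ends unchanged.
    with open(dst, "w", encoding=out_encoding, newline="") as fh:
        fh.write(text)
    return text.count("\n")
```

For example, convert_utf16_csv("example.csv", "example_utf8.csv") would write
a copy that insheet might then read as ordinary comma-separated text; looping
over all the files with pathlib's glob would cover the whole set.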

Thanks for any help

Orian 


