Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

st: Problems reading in comma separated files

From	Orian Brook <[email protected]>
To	<[email protected]>
Subject	st: Problems reading in comma separated files
Date	Fri, 10 Jun 2011 15:01:26 +0100

Hi there

I have over 100 comma-separated data files which I need to amalgamate.
However, I am having a problem reading them into Stata and Stata Transfer.

Using Stata Transfer (I'm using one file as an example, but they all
have the same problem), the correct columns are read in, but only one row
(the column headings).

In Stata 10/SE using insheet, the file is correctly identified as having 20
variables, but the column headings are not read, and if you browse the data
it shows all cells as blank except the first, which reads "ÿþt". Moreover, it
identifies c. 23K rows, where in fact the file has c. 11K. The hexdump is below:

  Line-end characters                        Line length (tab=1)
    \r\n         (Windows)              0      minimum                        1
    \r by itself (Mac)             11,741      maximum                      405
    \n by itself (Unix)            11,741
  Space/separator characters                 Number of lines             23,483
    [blank]                        58,688      EOL at EOF?                   no
    [tab]                               0
    [comma] (,)                   223,079    Length of first 5 lines
  Control characters                           Line 1                       405
    binary 0                    1,317,072      Line 2                         1
    CTL excl. \r, \n, \t                0      Line 3                       240
    DEL                                 0      Line 4                         1
    Extended (128-159,255)              1      Line 5                       240
  ASCII printable
    A-Z                           118,268
    a-z                            98,746    File format                 BINARY
    0-9                           724,323
    Special (!@#$ etc.)            70,486
    Extended (160-254)                  1
                          ---------------
  Total                         2,634,146

  Observed were:
     \0 \n \r blank , - . / 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O
     P Q R S T U V W X Y Z _ a b c d e f g h i k l m n o p r s t u v x y þ
     255

It seems to have the right number of \r, but I don't know why it is finding
double that number of rows. I also suspect that the extended character is
causing problems. Any ideas as to how to solve the problem? My only other
solution is to read all the files into Access, which doesn't have a problem
reading them.

Thanks for any help

Orian 
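
A possible reading of the hexdump above, offered as an assumption rather
than a confirmed diagnosis: the leading "ÿþ" is the byte pair FF FE, which
is the UTF-16 little-endian byte-order mark, and the binary-0 count fits
every character being stored in two bytes (2 x 1,317,072 + 2 = 2,634,146,
the file total). With a zero byte sitting between each \r and \n, hexdump
would count them as separate line ends, which would account for 23,483
lines (11,741 + 11,741 + 1). If that is right, re-encoding the files before
running insheet may help. The sketch below is in Python, not Stata, and the
function names are hypothetical.

```python
import codecs
from pathlib import Path


def looks_like_utf16(raw: bytes) -> bool:
    """Return True if the bytes start with a UTF-16 byte-order mark."""
    return raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE)


def convert_to_utf8(src: Path, dst: Path) -> int:
    """Re-encode one UTF-16 file as UTF-8 and return its number of text lines.

    Assumption: the file carries a byte-order mark, as "ÿþ" suggests.
    Line endings are left exactly as they were in the source.
    """
    raw = src.read_bytes()
    if not looks_like_utf16(raw):
        raise ValueError(f"{src} has no UTF-16 byte-order mark")
    # The "utf-16" codec consumes the BOM and uses it to pick the byte order.
    text = raw.decode("utf-16")
    # For text that is pure ASCII, UTF-8 output is byte-for-byte plain ASCII.
    dst.write_bytes(text.encode("utf-8"))
    return len(text.splitlines())
```

For example, convert_to_utf8(Path("raw/file001.csv"), Path("clean/file001.csv"))
(hypothetical paths, with the output folder already created) would write a
copy to try with insheet. The returned line count can be compared with the
expected c. 11K rows as a quick check.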


Follow-Ups:
- Re: st: Problems reading in comma separated files
  - From: Ronan Conroy <[email protected]>
