
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: RE: why messy when importing a csv file?


From   Jessie Grace <[email protected]>
To   <[email protected]>
Subject   RE: st: RE: why messy when importing a csv file?
Date   Thu, 6 May 2010 18:34:00 +0000

Sarah Edgington,
thank you for your help.
If I open the file in notepad, the result is
 
"Stkcd,Accper,A001101000"
"000002,""1999-06-30"",468010960.13"
"000002,""2002-09-30"",1166858479.70"
"000002,""2000-01-01"",772831829.15"
"000002,""2000-06-30"",911966043.54"
"000002,""2000-12-31"",995745160.05"
"000002,""2009-03-31"",26921921879.80"
"000002,""1997-06-30"",0"
"000002,""1991-12-31"",88628783.34"
"000002,""1992-12-31"",204653478.04"
"000003,""1998-12-31"",120946052.36"
 
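In the Notepad listing above, each row sits inside one pair of outer double quotes and the quotes around the date are doubled, so a CSV reader treats the whole row as a single field. The following is a minimal sketch in Python (not Stata) of undoing that wrapping, assuming the file has already been decoded to ordinary text. The helper name `unwrap_rows` and the sample rows are illustrative only, not something from the database or from Stata.

```python
import csv
import io

def unwrap_rows(text):
    """Split rows that were each exported as one quoted CSV field."""
    rows = []
    for outer in csv.reader(io.StringIO(text)):
        if not outer:
            continue  # skip blank lines
        # First pass: the whole physical line arrives as a single field,
        # with the doubled quotes already collapsed to single quotes.
        # Second pass: parse that field as a normal comma-separated row.
        inner = next(csv.reader([outer[0]]))
        rows.append(inner)
    return rows

# Sample rows copied from the Notepad listing (assumed already decoded).
sample = (
    '"Stkcd,Accper,A001101000"\n'
    '"000002,""1999-06-30"",468010960.13"\n'
    '"000002,""1997-06-30"",0"\n'
)
rows = unwrap_rows(sample)
```

Writing `rows` back out with `csv.writer` should give a file with three real columns per row.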
If I enter " type firms.csv " in Stata, the result is as follows, which is full of strange characters.

. type firms.csv
..".S.t.k.c.d.,.A.c.c.p.e.r.,.A.0.0.1.1.0.1.0.0.0.".
.
.".0.0.0.0.0.2.,.".".1.9.9.9.-.0.6.-.3.0.".".,.4.6.8.0.1.0.9.6.0...1.3.".
.
.".0.0.0.0.0.2.,.".".2.0.0.2.-.0.9.-.3.0.".".,.1.1.6.6.8.5.8.4.7.9...7.0.".
.
.".0.0.0.0.0.2.,.".".2.0.0.0.-.0.1.-.0.1.".".,.7.7.2.8.3.1.8.2.9...1.5.".
.
.".0.0.0.0.0.2.,.".".2.0.0.0.-.0.6.-.3.0.".".,.9.1.1.9.6.6.0.4.3...5.4.".
.
.".0.0.0.0.0.2.,.".".2.0.0.0.-.1.2.-.3.1.".".,.9.9.5.7.4.5.1.6.0...0.5.".
.
.".0.0.0.0.0.2.,.".".2.0.0.9.-.0.3.-.3.1.".".,.2.6.9.2.1.9.2.1.8.7.9...8.0.".
.
.".0.0.0.0.0.2.,.".".1.9.9.7.-.0.6.-.3.0.".".,.0.".
.
.".0.0.0.0.0.2.,.".".1.9.9.1.-.1.2.-.3.1.".".,.8.8.6.2.8.7.8.3...3.4.".
.
.".0.0.0.0.0.2.,.".".1.9.9.2.-.1.2.-.3.1.".".,.2.0.4.6.5.3.4.7.8...0.4.".
.
.".0.0.0.0.0.3.,.".".1.9.9.8.-.1.2.-.3.1.".".,.1.2.0.9.4.6.0.5.2...3.6.".
 
The problem seems to be that the file is not plain text (ASCII), which -insheet- requires.
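The extra character after every letter in the -type- output, plus the two stray characters at the very start, would be consistent with a two-byte encoding such as UTF-16 with a byte-order mark, though that is a guess from the display alone. A minimal sketch in Python (not Stata) of checking the first bytes and re-encoding to ASCII follows. The function names `sniff_encoding` and `to_ascii_bytes` are hypothetical, and the sample bytes are constructed here rather than read from the real firms.csv.

```python
import codecs

def sniff_encoding(raw):
    """Guess the encoding from a byte-order mark; fall back to ASCII."""
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        # The "utf-16" codec reads the BOM itself to choose the byte order.
        return "utf-16"
    return "ascii"

def to_ascii_bytes(raw):
    """Decode with the guessed encoding and re-encode as plain ASCII.

    Raises UnicodeEncodeError if the text holds non-ASCII characters.
    """
    text = raw.decode(sniff_encoding(raw))
    return text.encode("ascii")

# Constructed sample: the header row as a UTF-16 file would store it.
sample = '"Stkcd,Accper,A001101000"\r\n'.encode("utf-16")
guess = sniff_encoding(sample)
converted = to_ascii_bytes(sample)
```

For a real file, the bytes would come from `open("firms.csv", "rb").read()`, and the converted bytes would be written to a new file before trying -insheet- again.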
 
Thank you for all your help.
 
Grace.
 
 

----------------------------------------
> From: [email protected]
> To: [email protected]
> Subject: RE: st: RE: why messy when importing a csv file?
> Date: Thu, 6 May 2010 11:04:37 -0700
>
> to save a new one. Depending on how big the data set is the solution of
> simply copying the contents of the file to the editor window and saving a
> stata dataset might be the easiest. Otherwise you need to make sure that
> you're saving a csv file that doesn't have extraneous information in it that
> Stata can't use.
>
> You say "The characteristic of the file is the contents of each row are in
> the same cell." What does this mean? Are you referring to the fact that
> the value of the first variable is repeated? If so, that isn't a problem.
> If you mean something else, particularly something having to do with the way
> the end of the line is treated in the file then you have a problem. Are you
> saying that if you open the csv file in a spreadsheet program you get all 25
> lines of data in a single row of the spreadsheet? If so, that's likely
> going to cause issues. What does the csv file look like in a really basic
> text editor (for example on a windows machine what does it look like if you
> open it in notepad, not wordpad or word, but notepad)? Or alternatively
> what do you get if you enter " type firms.csv " in Stata?
>
> -Sarah Edgington
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Jessie Grace
> Sent: Thursday, May 06, 2010 10:19 AM
> To: [email protected]
> Subject: RE: st: RE: why messy when importing a csv file?
>
> Nick, thank you for your reply.
> Additionally, the csv file is downloaded from a certain database. If I copy
> the contents of the file to Stata's editor window, everything goes well.
>
> . list
>      +-------------------------------+
>      | stkcd       accper   a00110~0 |
>      |-------------------------------|
>   1. |     2   1999-06-30   4.68e+08 |
>   2. |     2   2002-09-30   1.17e+09 |
>   3. |     2   2000-01-01   7.73e+08 |
>   4. |     2   2000-06-30   9.12e+08 |
>   5. |     2   2000-12-31   9.96e+08 |
>      |-------------------------------|
>   6. |     2   2009-03-31   2.69e+10 |
>   7. |     2   1997-06-30          0 |
>   8. |     2   1991-12-31   8.86e+07 |
>   9. |     2   1992-12-31   2.05e+08 |
>  10. |     3   1998-12-31   1.21e+08 |
>      +-------------------------------+
>
> If I copy the contents to a new csv file and type "insheet using firms.csv",
> the results are as follows.
>
> . list
>      +-------------------------+
>      |                      v1 |
>      |-------------------------|
>   1. | Stkcd,Accper,A001101000 |
>   2. |           ,468010960.13 |
>   3. |          ,1166858479.70 |
>   4. |           ,772831829.15 |
>   5. |           ,911966043.54 |
>      |-------------------------|
>   6. |           ,995745160.05 |
>   7. |         ,26921921879.80 |
>   8. |                      ,0 |
>   9. |            ,88628783.34 |
>  10. |           ,204653478.04 |
>      |-------------------------|
>  11. |           ,120946052.36 |
>      +-------------------------+
>
> I think the key issues are that "the contents of each row are in the same
> cell" and the double quotes around the second variable in my csv file.
>
> Thank you for any help.
>
> Grace
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/ 		 	   		  

