



RE: st: RE: why messy when importing a csv file?


From   "Sarah Edgington" <sedging@ucla.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: why messy when importing a csv file?
Date   Thu, 6 May 2010 11:04:37 -0700

to save a new one.  Depending on how big the data set is, simply copying the
contents of the file to the editor window and saving a Stata dataset might
be the easiest solution.  Otherwise you need to make sure that the csv file
you save doesn't have extraneous information in it that Stata can't use.

You say "The characteristic of the file is the contents of each row are in
the same cell."  What does this mean?  Are you referring to the fact that
the value of the first variable is repeated?  If so, that isn't a problem.
If you mean something else, particularly something having to do with the way
the end of the line is treated in the file, then you have a problem.  Are you
saying that if you open the csv file in a spreadsheet program you get all 25
lines of data in a single row of the spreadsheet?  If so, that's likely
going to cause issues.  What does the csv file look like in a really basic
text editor (for example, on a Windows machine, what does it look like if you
open it in Notepad, not WordPad or Word, but Notepad)?  Or alternatively,
what do you get if you enter "type firms.csv" in Stata?
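
If a plain text editor doesn't make the line endings obvious, one way to look
at the raw bytes outside Stata is a few lines of Python.  This is only a
rough sketch, not a fix: the function names summarize_raw_bytes and
show_first_lines are made up for illustration, and the idea that stray
carriage returns or quotes are what confuses insheet is an assumption until
you actually see them in the file.

```python
def summarize_raw_bytes(data: bytes) -> dict:
    """Count line-ending styles and double quotes in raw file contents.

    Hypothetical helper for illustration; it only reports what is in the
    bytes and does not change the file.
    """
    crlf = data.count(b"\r\n")            # Windows-style line endings
    cr_only = data.count(b"\r") - crlf    # carriage returns not followed by \n
    lf_only = data.count(b"\n") - crlf    # line feeds not preceded by \r
    return {
        "crlf": crlf,
        "cr_only": cr_only,
        "lf_only": lf_only,
        "double_quotes": data.count(b'"'),
        "starts_with_utf8_bom": data.startswith(b"\xef\xbb\xbf"),
    }


def show_first_lines(data: bytes, n: int = 3) -> list:
    """Return the first n lines as repr strings so hidden characters show up."""
    return [repr(line) for line in data.splitlines(keepends=True)[:n]]


# Usage (assumes firms.csv is in the current working directory):
#   with open("firms.csv", "rb") as fh:
#       raw = fh.read()
#   print(summarize_raw_bytes(raw))
#   for line in show_first_lines(raw):
#       print(line)
```

A mix of line-ending styles, or a nonzero cr_only count, would be a hint
that the end-of-line treatment is worth a closer look before importing.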

-Sarah Edgington

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Jessie Grace
Sent: Thursday, May 06, 2010 10:19 AM
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: RE: why messy when importing a csv file?

Nick, thank you for your reply.
Additionally, the csv file is downloaded from a certain database. If I copy
the contents of the file to Stata's editor window, everything goes well.
 
. list
     +-------------------------------+
     | stkcd       accper   a00110~0 |
     |-------------------------------|
  1. |     2   1999-06-30   4.68e+08 |
  2. |     2   2002-09-30   1.17e+09 |
  3. |     2   2000-01-01   7.73e+08 |
  4. |     2   2000-06-30   9.12e+08 |
  5. |     2   2000-12-31   9.96e+08 |
     |-------------------------------|
  6. |     2   2009-03-31   2.69e+10 |
  7. |     2   1997-06-30          0 |
  8. |     2   1991-12-31   8.86e+07 |
  9. |     2   1992-12-31   2.05e+08 |
 10. |     3   1998-12-31   1.21e+08 |
     +-------------------------------+

If I copy the contents to a new csv file and type "insheet using firms.csv",
the results are as follows.

. list
     +-------------------------+
     |                      v1 |
     |-------------------------|
  1. | Stkcd,Accper,A001101000 |
  2. |           ,468010960.13 |
  3. |          ,1166858479.70 |
  4. |           ,772831829.15 |
  5. |           ,911966043.54 |
     |-------------------------|
  6. |           ,995745160.05 |
  7. |         ,26921921879.80 |
  8. |                      ,0 |
  9. |            ,88628783.34 |
 10. |           ,204653478.04 |
     |-------------------------|
 11. |           ,120946052.36 |
     +-------------------------+

I think the issues are that "the contents of each row are in the same cell"
and that the second variable is enclosed in double quotes in my csv file.
 
Thank you for any help.
 
Grace
