Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Sarah Edgington" <sedging@ucla.edu> |
To | <statalist@hsphsun2.harvard.edu> |
Subject | RE: st: RE: why messy when importing a csv file? |
Date | Thu, 6 May 2010 11:04:37 -0700 |
to save a new one. Depending on how big the data set is, simply copying the contents of the file to the editor window and saving a Stata dataset might be the easiest solution. Otherwise you need to make sure that you're saving a csv file that doesn't have extraneous information in it that Stata can't use.

You say "The characteristic of the file is the contents of each row are in the same cell." What does this mean? Are you referring to the fact that the value of the first variable is repeated? If so, that isn't a problem. If you mean something else, particularly something having to do with the way the end of the line is treated in the file, then you have a problem. Are you saying that if you open the csv file in a spreadsheet program you get all 25 lines of data in a single row of the spreadsheet? If so, that's likely going to cause issues.

What does the csv file look like in a really basic text editor (for example, on a Windows machine, what does it look like if you open it in Notepad; not WordPad or Word, but Notepad)? Or, alternatively, what do you get if you enter "type firms.csv" in Stata?

-Sarah Edgington

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Jessie Grace
Sent: Thursday, May 06, 2010 10:19 AM
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: RE: why messy when importing a csv file?

Nick, thank you for your reply.

Additionally, the csv file is downloaded from a certain database. If I copy the contents of the file to Stata's editor window, everything goes well.

. list

     +-------------------------------+
     | stkcd       accper   a00110~0 |
     |-------------------------------|
  1. |     2   1999-06-30   4.68e+08 |
  2. |     2   2002-09-30   1.17e+09 |
  3. |     2   2000-01-01   7.73e+08 |
  4. |     2   2000-06-30   9.12e+08 |
  5. |     2   2000-12-31   9.96e+08 |
     |-------------------------------|
  6. |     2   2009-03-31   2.69e+10 |
  7. |     2   1997-06-30          0 |
  8. |     2   1991-12-31   8.86e+07 |
  9. |     2   1992-12-31   2.05e+08 |
 10. |     3   1998-12-31   1.21e+08 |
     +-------------------------------+

If I copy the contents to a new csv file and type "insheet using firms.csv", the results are as follows.

. list

     +-------------------------+
     |                      v1 |
     |-------------------------|
  1. | Stkcd,Accper,A001101000 |
  2. |           ,468010960.13 |
  3. |          ,1166858479.70 |
  4. |           ,772831829.15 |
  5. |           ,911966043.54 |
     |-------------------------|
  6. |           ,995745160.05 |
  7. |         ,26921921879.80 |
  8. |                      ,0 |
  9. |            ,88628783.34 |
 10. |           ,204653478.04 |
     |-------------------------|
 11. |           ,120946052.36 |
     +-------------------------+

I think the issues are "the contents of each row are in the same cell" and the double quotes around the second variable in my csv file.

Thank you for any help.

Grace

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
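
The reply in this thread asks what the raw file looks like in a basic text editor and how the end of each line is treated. One way to check both outside Stata is sketched below in Python. This is only an illustration, not something proposed in the thread: the helper names `describe_line_endings` and `show_raw_lines` are hypothetical, and the sample bytes are made up from values quoted above, since the real file's bytes are not shown. Inside Stata, "type firms.csv" as suggested in the reply remains the first thing to try.

```python
from collections import Counter


def describe_line_endings(raw: bytes) -> Counter:
    """Count CRLF, lone CR, and lone LF line terminators in raw file bytes."""
    counts = Counter()
    i = 0
    while i < len(raw):
        if raw[i:i + 2] == b"\r\n":      # Windows-style terminator
            counts["CRLF"] += 1
            i += 2
        elif raw[i:i + 1] == b"\r":      # lone carriage return
            counts["CR"] += 1
            i += 1
        elif raw[i:i + 1] == b"\n":      # Unix-style terminator
            counts["LF"] += 1
            i += 1
        else:
            i += 1
    return counts


def show_raw_lines(raw: bytes, limit: int = 5) -> list:
    """Return repr() of the first few lines so quotes and control characters stay visible."""
    return [repr(line) for line in raw.splitlines(keepends=True)[:limit]]


# Made-up sample built from values quoted in the thread; the actual file may differ.
sample = b'Stkcd,Accper,A001101000\r\n2,"1999-06-30",468010960.13\r\n'
print(describe_line_endings(sample))
for line in show_raw_lines(sample):
    print(line)

# To inspect a real file, read it in binary mode first, for example:
#     with open("firms.csv", "rb") as fh:
#         raw = fh.read()
```

A file with no recognizable terminators, or with a mix of kinds, would be a hint that the line endings are what the import is tripping over. The `repr` output also shows whether the date field is wrapped in double quotes, which is the other feature mentioned in the question.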