Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Nick Cox" <n.j.cox@durham.ac.uk> |
To | <statalist@hsphsun2.harvard.edu> |
Subject | RE: st: RE: why messy when importing a csv file? |
Date | Thu, 6 May 2010 18:26:02 +0100 |
We have now two stories about what happens when you type
-insheet using firms.csv-, the one here and that sent 15 minutes
earlier at

<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1005/date/article-297.html>

So, are you saying that your results are not even consistent?

Nick
n.j.cox@durham.ac.uk

Jessie Grace

Nick, thank you for your reply. Additionally, the csv file is downloaded
from a certain database. If I copy the contents of the file to Stata's
editor window, everything goes well.

. list

     +-------------------------------+
     | stkcd       accper   a00110~0 |
     |-------------------------------|
  1. |     2   1999-06-30   4.68e+08 |
  2. |     2   2002-09-30   1.17e+09 |
  3. |     2   2000-01-01   7.73e+08 |
  4. |     2   2000-06-30   9.12e+08 |
  5. |     2   2000-12-31   9.96e+08 |
     |-------------------------------|
  6. |     2   2009-03-31   2.69e+10 |
  7. |     2   1997-06-30          0 |
  8. |     2   1991-12-31   8.86e+07 |
  9. |     2   1992-12-31   2.05e+08 |
 10. |     3   1998-12-31   1.21e+08 |
     +-------------------------------+

If I copy the contents to a new csv file and type "insheet using
firms.csv", the results are as follows.

. list

     +-------------------------+
     |                      v1 |
     |-------------------------|
  1. | Stkcd,Accper,A001101000 |
  2. |           ,468010960.13 |
  3. |          ,1166858479.70 |
  4. |           ,772831829.15 |
  5. |           ,911966043.54 |
     |-------------------------|
  6. |           ,995745160.05 |
  7. |         ,26921921879.80 |
  8. |                      ,0 |
  9. |            ,88628783.34 |
 10. |           ,204653478.04 |
     |-------------------------|
 11. |           ,120946052.36 |
     +-------------------------+

I think the key points are that "the contents of each row are in the
same cell" and that the second variable in my csv file is enclosed in
double quotes.

> From: n.j.cox@durham.ac.uk
>
> No definition of "messy" here.
>
> My guess: By default your third variable will be -float- type and will
> be assigned a format %8.0g. That wouldn't look exactly like the original
> without resetting the format.
>
> Otherwise put: without specifying more information, you are in effect
> _asking_ for Stata's default treatment in terms of storage types and
> formats.
>
> So, the results shouldn't seem surprising.

Jessie Grace

> I have a .csv file, which consists of the following.
>
> Stkcd,Accper,A001101000
> 000002,"1999-06-30",468010960.13
> 000002,"2002-09-30",1166858479.70
> 000002,"2000-01-01",772831829.15
> 000002,"2000-06-30",911966043.54
> 000002,"2000-12-31",995745160.05
> 000002,"2009-03-31",26921921879.80
> 000002,"1997-06-30",0
> 000002,"1991-12-31",88628783.34
> 000002,"1992-12-31",204653478.04
> 000003,"1998-12-31",120946052.36
>
> The first row contains variable names. The characteristic of the file
> is that the contents of each row are in the same cell.
> Whether I type "insheet using firms.csv" or "insheet using
> firms.csv,comma", the imported results are messy.
> Could anyone tell me why, and how to solve this?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
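
The point in the quoted reply about default storage types and formats can be sketched as follows. This is only a sketch under assumptions: it supposes that firms.csv is in the current working directory and that -insheet- does split each line into three variables named stkcd, accper and a001101000 (lowercased, as in the first listing above). The %15.2f format is an illustrative choice, not something suggested in the thread. It does not address the case where every row ends up in a single variable v1.

```stata
* Sketch only: assumes firms.csv is in the current working directory
* and that the file is read as three comma-separated variables.
insheet using firms.csv, comma names double clear

* Inspect the storage types and display formats that were assigned.
describe

* Illustrative format: show the third variable with two decimals
* instead of the default general format.
format a001101000 %15.2f
list
```

With -double- storage and an explicit fixed format, values such as 468010960.13 would be expected to list closer to how they appear in the file than under the default treatment.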