Statalist The Stata Listserver



st: RE: precision: -insheet- & -outsheet-


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: precision: -insheet- & -outsheet-
Date   Tue, 4 Apr 2006 23:47:26 +0100

I am afraid that some of the answer is "Don't do 
this" (e.g. import from Excel) but no doubt that is not a
practical answer. 

Also, the reference to the "actual values" of the variables
is at best disingenuous. Only if the values are integers, and 
in a few other cases, are such values completely unequivocal. 
Only just recently Bill Gould explained at considerable length
how in practice we are usually talking about the best binary 
approximations. 
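
To make the point about binary approximations concrete, here is a small sketch. It is in Python rather than Stata, purely as an illustration; the same IEEE 754 double-precision storage is assumed on both sides. The names tenth and big are made up for the example.

```python
from decimal import Decimal

# Decimal(x) displays the exact value of the binary double closest to x.
# For 0.1 that exact value is close to, but not equal to, one tenth.
tenth = Decimal(0.1)
print(tenth)
print(tenth == Decimal("0.1"))

# Integers of moderate size are held exactly. 2**53 is the point at which
# consecutive integers stop being distinguishable as doubles.
big = 2.0 ** 53
print(big + 1 == big)

# A few non-integers are exact too: those that are sums of powers of 2,
# such as 0.375 = 1/4 + 1/8.
print(Decimal(0.375) == Decimal("0.375"))
```

So a decimal value typed into a spreadsheet is, in most cases, already an approximation before any import or export takes place.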

More positively, there is some advice at
http://www.stata.com/support/faqs/data/newexcel.html

Although listed as author, I compiled that document in
self-defence as I was getting questions I couldn't answer 
from colleagues and students who use Excel. (Except for sets 
of measure zero, I don't.) Most of the real details came from the people
also named in the FAQ. 

Nick 
n.j.cox@durham.ac.uk 

clinton.thompson@summitllc.us

> I am using Stata S/E 9.1 for Macintosh.
> 
> This question concerns the issue of retaining data precision, as it
> were, when -insheeting- data into Stata from a .csv file and,
> conversely, when exporting data from Stata via -outsheet-.
> 
> Suppose I have an Excel file that contains large numbers that are
> presented as scientific notation.  Further suppose that I save said
> Excel file as a comma-delimited file (.csv) then import into Stata via
> -insheet-, along w/ the double option.  After the insheet, an
> examination of the data indicates that some degree of precision has
> been lost -- rounding -- even after the variables have been
> reformatted to reflect the length & desired format (e.g. %20.2fc).
> One clumsy workaround for this problem involves changing the numeric
> variable to string via the insertion of commas into the values,
> -insheet-ing, then -destring-ing the variable back into a numeric
> variable.  As noted, this seems like a terribly cumbersome and clumsy
> solution -- any suggested alternatives (e.g. using an alternative to
> -insheet-)??
> 
> On the flip side, i.e. -outsheet-ing the data back into a .csv file,
> is it necessary to format any numeric variables into non-scientific
> notation format so as to retain the actual values of the variables?
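
One possible reading of the rounding described above, offered as an assumption rather than a diagnosis: if the .csv file holds each number as text in short scientific notation, the discarded digits are no longer in the file, and no import option or display format can bring them back. The sketch below uses Python, not Stata, and a made-up value; it only illustrates how a double behaves when written to text and read back.

```python
# Hypothetical large value, standing in for a spreadsheet cell.
value = 123456789012.345

# Written with 6 significant digits in scientific notation, as a
# spreadsheet might display it (an assumption about the .csv contents).
sci_text = f"{value:.5E}"
recovered = float(sci_text)
print(sci_text, recovered == value)

# Written with 17 significant digits, still in scientific notation.
# 17 significant decimal digits are enough to identify any double.
full_text = f"{value:.16e}"
print(full_text, float(full_text) == value)
```

On this view, what matters is the number of significant digits written to the text file, not whether the notation is scientific.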
 



