Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: Precision in outsheet and outfile


From   "Schaffer, Mark E" <[email protected]>
To   <[email protected]>
Subject   st: Precision in outsheet and outfile
Date   Tue, 11 Jan 2011 13:39:34 -0000

Hi all.  I think I've just been bitten by an (almost) undocumented
"feature" of outsheet (also shared by outfile): the precision of the
values written to the file is determined by the display format.

I'm using Stata 11.1 for Windows.  Stata 10.1 for Windows behaves the
same way.

For example, in my original data the default display format is %9.0g.
If I change the format of the relevant variable to %12.0f, and then
outsheet and insheet, everything is fine:

. format GDP %12.0f

. 
. desc GDP

              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------
GDP             double %12.0f                 

. 
. list

     +-----------------+
     | Year        GDP |
     |-----------------|
  1. | 1995    9963191 |
  2. | 1996   10335489 |
     +-----------------+

. 
. outsheet using testoutfile.csv, replace comma

. 
. insheet using testoutfile.csv, clear case
(2 vars, 2 obs)

. 
. list

     +-----------------+
     | Year        GDP |
     |-----------------|
  1. | 1995    9963191 |
  2. | 1996   10335489 |
     +-----------------+

But if I don't change the display format, numbers too large to fit the
%9.0g format (here the 8-digit 1996 value) lose all but 3 (!!) digits of
precision:

. desc GDP

              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------
GDP             double %9.0g                  

. 
. list

     +-----------------+
     | Year        GDP |
     |-----------------|
  1. | 1995    9963191 |
  2. | 1996   1.03e+07 |
     +-----------------+

. 
. outsheet using testoutfile.csv, replace comma

. 
. insheet using testoutfile.csv, clear case
(2 vars, 2 obs)

. 
. list

     +-----------------+
     | Year        GDP |
     |-----------------|
  1. | 1995    9963191 |
  2. | 1996   10300000 |
     +-----------------+


This behaviour seems to be shared by outfile, even though I'm using
Stata's dictionary to specify the datatype:

. desc GDP

              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------
GDP             double %9.0g                  

. 
. list

     +-----------------+
     | Year        GDP |
     |-----------------|
  1. | 1995    9963191 |
  2. | 1996   1.03e+07 |
     +-----------------+

. 
. outfile using testoutfile.csv, replace dict

. 
. infile using testoutfile.csv, clear

dictionary {
        int    Year              `"Year"'
        double GDP
}

(2 observations read)

. 
. list

     +-----------------+
     | Year        GDP |
     |-----------------|
  1. | 1995    9963191 |
  2. | 1996   10300000 |
     +-----------------+

So even though Stata's dictionary format notes that GDP is a double, all
but 3 digits of precision are lost.

What's happening is that with the default display width of 9
characters, Stata switches to exponential notation once a number no
longer fits (in this example, at 8 digits; the 7-digit 1995 value is
unaffected), so it writes the 1996 value above as 1.03e+07.
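
A minimal sketch of the mechanism, using only -display- with an explicit format and the two GDP values from the listings above. Nothing here touches outsheet or outfile; it only shows the text each format produces and what that text becomes when read back as a number.

```stata
* The 7-digit value fits in %9.0g and is shown in full
display %9.0g 9963191

* The 8-digit value does not fit; per the listings above it appears as 1.03e+07
display %9.0g 10335489

* The same value under the wider fixed format used earlier keeps every digit
display %12.0f 10335489

* Reading the exponential text back as a number gives the rounded value,
* which is what insheet/infile end up storing
display %12.0f real("1.03e+07")
```

The last line is the round trip in miniature: once the text written to the file is 1.03e+07, the missing digits cannot be recovered on import.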

There's no direct mention of this limitation in the documentation for
outsheet.  There is something about this in the manual documentation for
outfile, but I had to read between the lines to work out the
implications:

"Numeric variables are output right-justified in the field width
specified by their display format."

The implications for precision follow from this, but I think I can be
forgiven for missing it.

I'm posting to the list because I think it's important enough to bring
to people's attention.  If others feel similarly, perhaps StataCorp can
update the online documentation and manual to point this out, or even
add options to outsheet and outfile to control precision independently
of formatting.
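
In the meantime, one possible workaround is to widen the display format before exporting. The sketch below assumes the double variables hold integer values small enough to fit in %12.0f, as GDP does here; data with decimals or larger magnitudes would need a different format. The local macro name dblvars is just illustrative.

```stata
* Collect the names of all variables stored as double
ds, has(type double)
local dblvars `r(varlist)'

* Assumption: %12.0f is wide enough and the values are integers,
* as with GDP in the example above
foreach v of local dblvars {
    format `v' %12.0f
}

* Export as before; values are now written using the wider format
outsheet using testoutfile.csv, replace comma
```

Looping over a local macro rather than passing `r(varlist)' straight to -format- means the block does nothing, instead of stopping with an error, when the dataset has no double variables.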

--Mark


