
st: RE: tostring double wishlist

From   "Nick Cox" <>
To   <>
Subject   st: RE: tostring double wishlist
Date   Tue, 4 Nov 2003 18:42:45 -0000

This is a question for Stata Corp, but 
for historical reasons I will comment, 
even though I can't see your "wishlist". 
It seems that -tostring- gave you precisely 
what you asked for: the consequence 
of using -force- is that you might 
mangle your variable. 

Here are some pertinent extracts from the help: 

=================== from help for tostring 

Either generate() or replace must be specified.  
If converting any numeric variable to
string would result in loss of information, no 
variable will be produced unless force is
specified.  For more details, see under force 

force specifies that conversions entailing loss 
of information may be forced.  Loss of
information means one of two circumstances:  
(1) The result of real(string(varname,
"format")) is not equal to varname, i.e. the 
conversion is not reversible without loss
of information.  (2) replace was specified 
but a variable has associated value labels.
In circumstance (1), it is usually best to 
specify usedisplayformat or format().  In
circumstance (2), value labels will be ignored 
in a forced conversion.  

format() specifies the use of a numeric format 
as an argument to the string() function,
which controls the conversion of the numeric 
variable to string.  For example, a
format of %7.2f specifies that numbers are 
to be rounded to 2 decimal places before
conversion to string.  See Remarks below and 
help on functions and format.

Conversion of numeric data to string equivalents 
can be problematic.  Stata, like most
software, holds numeric data to finite precision 
and in binary form.  See the discussion
in [U] 16.10 Precision and problems therein.  
If no format() is specified, tostring uses
the format %12.0g.  This format is, in particular, 
sufficient to convert integers held as
bytes, ints, or longs to string equivalent 
without loss of precision.

However, users will in many cases need to 
specify a format themselves, especially when the
numeric data have fractional parts and for some 
reason a conversion to string is required.
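
As a rough illustration of circumstance (1) in the extract, the reversibility test can be run by hand with -real()- and -string()-. This is only a sketch on a made-up one-observation dataset: the value is the one from the question, and the variable names -ok_default- and -ok_fixed- are invented for the example.

```stata
clear
set obs 1
* hypothetical test data: the 12-digit identifier from the question, held as a double
generate double ssuidn = 19003754630

* 1 if the string converts back to the same number, 0 if information is lost
generate byte ok_default = real(string(ssuidn, "%12.0g")) == ssuidn
generate byte ok_fixed   = real(string(ssuidn, "%012.0f")) == ssuidn

list ssuidn ok_default ok_fixed
```

A 0 in -ok_default- should correspond to the case in which -tostring- produces nothing unless -force- is specified.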


In your case I think this all boils down to one 
issue: that the advertised default conversion 
format %12.0g is not appropriate for your 
variable (which, as clearly stated, is a 
double). Forget -tostring- and look 
at the fundamentals: 

. di string(019003754630, "%12.0g")
1.90038e+10

. di string(019003754630, "%012.0f")
019003754630

I therefore recommend 

tostring ssuidn, gen(ssuidc) format(%012.0f) 
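
One way to check that recommendation before applying it to real data is on a toy dataset. The sketch below assumes a single made-up observation holding the value from the question; the -assert- lines are my addition, not part of -tostring-.

```stata
clear
set obs 1
* hypothetical test data: the identifier from the question, held as a double
generate double ssuidn = 19003754630

* no -force- should be needed here, because the conversion is reversible
tostring ssuidn, generate(ssuidc) format(%012.0f)

* stop with an error if the string does not convert back to the same number
assert real(ssuidc) == ssuidn
* the identifier should keep its leading zero and all 12 characters
assert length(ssuidc) == 12

list ssuidc
```

If either -assert- fails, the format is not right for the variable and is worth revisiting before reaching for -force-.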

More generally, note that no default
format can be appropriate for all possibilities. 
It could be argued that -tostring- should 
look at the data and make an intelligent 
guess at what the conversion format should be, but 
personally I'd like to hear the argument for that 
in detail. The program authors evidently 
took the view that a single default format
and the possibility of user over-ride provided
enough flexibility (and scrutable behaviour).


> I had a bit more trouble than I anticipated converting
> a 12-digit double to a string.  The string and number
> are below:
>      +--------------------------+
>      |        ssuid      ssuidn |
>      |--------------------------|
>   1. | 019003754630   1.900e+10 |
>      +--------------------------+
> A simple tostring command gives the results below:
> . tostring ssuidn, gen(ssuidc) force
> ssuidc generated as str11
> ssuidc was forced to string; some loss of information
> . list ssuidc in 1/1
>      +-------------+
>      |      ssuidc |
>      |-------------|
>   1. | 1.90038e+10 |
>      +-------------+
> expressed in scientific notation rather than like
> '019003754630' above.
