st: RE: tostring double wishlist
This is a question for Stata Corp, but
for historical reasons I will comment,
even though I can't see your "wishlist".
It seems that -tostring- gave you precisely
what you asked for: the consequence
of using -force- is that you might
mangle your variable.
Here are some pertinent extracts from the help:
=================== from help for tostring

Either generate() or replace must be specified.

If converting any numeric variable to
string would result in loss of information, no
variable will be produced unless force is
specified. For more details, see under force.

force specifies that conversions entailing loss
of information may be forced. Loss of
information means one of two circumstances:
(1) The result of real(string(varname,
"format")) is not equal to varname, i.e. the
conversion is not reversible without loss
of information. (2) replace was specified
but a variable has associated value labels.
In circumstance (1), it is usually best to
specify usedisplayformat or format(). In
circumstance (2), value labels will be ignored
in a forced conversion.

format() specifies the use of a numeric format
as an argument to the string() function,
which controls the conversion of the numeric
variable to string. For example, a
format of %7.2f specifies that numbers are
to be rounded to 2 decimal places before
conversion to string. See Remarks below and
help on functions and format.

Conversion of numeric data to string equivalents
can be problematic. Stata, like most
software, holds numeric data to finite precision
and in binary form. See the discussion
in [U] 16.10 Precision and problems therein.

If no format() is specified, tostring uses
the format %12.0g. This format is, in particular,
sufficient to convert integers held as
bytes, ints, or longs to string equivalent
without loss of precision.

However, users will in many cases need to
specify a format themselves, especially when the
numeric data have fractional parts and for some
reason a conversion to string is required.

===================

In your case I think this all boils down to one
issue: that the advertised default conversion
format %12.0g is not appropriate for your
variable (which, as clearly stated, is a
double). Forget -tostring- and look
at the fundamentals:
. di string(019003754630, "%12.0g")
. di string(019003754630, "%012.0f")
I therefore recommend
tostring ssuidn, gen(ssuidc) format(%012.0f)
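
To check this for yourself, here is a minimal sketch. It assumes a
made-up one-observation dataset holding the value from your listing;
only the names ssuidn and ssuidc come from your post. It applies the
reversibility test quoted from the help, real(string(varname,
"format")) equal to varname, and then does the conversion with an
explicit format, so that -force- should not be needed.

```stata
* Hypothetical one-observation dataset mirroring the poster's variable
clear
set obs 1
generate double ssuidn = 19003754630

* Reversibility test from the help: real(string(x, fmt)) == x
assert real(string(ssuidn, "%012.0f")) == ssuidn

* Convert with an explicit format rather than the default
tostring ssuidn, generate(ssuidc) format(%012.0f)

* The string should carry the leading zero of the original identifier
assert ssuidc == "019003754630"
list ssuidn ssuidc
```

If either -assert- fails on your real data, some values do not survive
the round trip under that format, and you should look at those
observations before reaching for -force-.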
More generally, note that no default
format can be appropriate for all possibilities.
It could be argued that -tostring- should
look at the data and make an intelligent
guess at what the conversion format should be, but
personally I'd like to hear the argument for that
in detail. The program authors evidently
took the view that a single default format
and the possibility of user over-ride provided
enough flexibility (and scrutable behaviour).
> I had a bit more trouble than I anticipated converting
> a 12-digit double to a string. The string and number
> are below:
> | ssuid ssuidn |
> 1. | 019003754630 1.900e+10 |
> A simple tostring command gives the results below:
> . tostring ssuidn, gen(ssuidc) force
> ssuidc generated as str11
> ssuidc was forced to string; some loss of information
> . list ssuidc in 1/1
> | ssuidc |
> 1. | 1.90038e+10 |
> expressed in scientific notation rather than like
> '019003754630' above.