Statalist


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

st: Data loss loading integers via ODBC


From   Dave Ewart <davee@ceu.ox.ac.uk>
To   StataList <statalist@hsphsun2.harvard.edu>
Subject   st: Data loss loading integers via ODBC
Date   Mon, 9 Jul 2007 14:02:24 +0100

In PostgreSQL and MySQL, there exists a smallint type: -32768 to +32767

However, Stata's 'int' type is -32,767 to 32,740

Attempting to load data over ODBC causes data loss when values in the
range 32741 to 32767 are present.

Here's an example for a smallint column such as the following in
PostgreSQL:

     f   
  -------
   32738
   32739
   32740
   32741
   32742
   32743
   32744
   32745
   32746
   32747
  (10 rows)

An ODBC load results in data loss without any error being raised:

  . odbc load [...]

  . list in 1/10

       +-------+
       |     f |
       |-------|
    1. | 32738 |
    2. | 32739 |
    3. | 32740 |
    4. |     . |
    5. |     . |
       |-------|
    6. |     . |
    7. |     . |
    8. |     . |
    9. |     . |
   10. |     . |
       +-------+

  . desc
  [...]
  --------------------------------------------------------------------------
                storage  display     value
  variable name   type   format      label      variable label
  --------------------------------------------------------------------------
  f               int    %8.0g                  
  --------------------------------------------------------------------------

So Stata/ODBC is incorrectly assuming that its 'int' type matches the
upstream 'smallint' type, which it does not.  This sounds like a very
serious bug in the Stata/ODBC handling.

In the above situation, I would expect Stata to give an error indicating
that it cannot store the values given; or possibly that it should be
smart about the situation and automatically switch to using 'long' type
for importing values of this nature.

The above problem happens for both MySQL and PostgreSQL backends.  This
version of Stata is Stata/MP 9.2 under Debian GNU/Linux with the
following ODBC libraries installed:

libiodbc2                          3.52.2-3     
libmyodbc                          3.51.11-6    
odbc-postgresql                    08.01.0200-2 
odbcinst1debian1                   2.2.11-13

The error also occurs for MySQL 'tinyint' values between 101 and 127
(which import as missing values, of type 'byte') inclusive and for
'long' integers in the range 2147483620 to 2147483647.

Dave.
-- 
Dave Ewart
davee@ceu.ox.ac.uk
Computing Manager, Cancer Epidemiology Unit
Cancer Research UK / Oxford University
PGP: CC70 1883 BD92 E665 B840 118B 6E94 2CFD 694D E370
Get key from http://www.ceu.ox.ac.uk/~davee/davee-ceu-ox-ac-uk.asc
N 51.7518, W 1.2016
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index