Statalist The Stata Listserver



st: dataset containing duplicate variables names


From   "Emma Slaymaker" <[email protected]>
To   <[email protected]>
Subject   st: dataset containing duplicate variables names
Date   Thu, 05 Apr 2007 14:06:53 +0100

Hello,

I realise that this is not supposed to happen, but I have a dataset
that has several variables with the same name.  Some students of mine
inadvertently created a dataset like this, and I have replicated it.
Does anyone know how this can happen?

The problem arises when you export data with long variable names from
EpiData to Stata (using EpiData's export function) and set Stata 6 as
the output version.  Why Stata 6?  Well, this is the default on the
EpiData installation used by our students.  If you change the version to
7 or higher, the problem doesn't occur.

EpiData apparently knows that Stata 6 variable names must be 8
characters or fewer and truncates the names of any variables that exceed
this limit, but it doesn't then check that all names are unique.
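
The truncation described above can be sketched in a few lines of Python.
This is an illustration only, not EpiData's actual export code: the
function names (truncate_names, find_collisions, make_unique) are
hypothetical, and the suffix scheme in make_unique is just one assumed
way an exporter could keep truncated names unique.

```python
from collections import Counter

# Assumption: the 8-character limit on Stata 6 names mentioned in the post.
STATA6_MAX_NAME_LEN = 8


def truncate_names(names, max_len=STATA6_MAX_NAME_LEN):
    """Naive truncation with no uniqueness check (the behaviour described)."""
    return [name[:max_len] for name in names]


def find_collisions(names):
    """Return the names that occur more than once, sorted."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


def make_unique(names, max_len=STATA6_MAX_NAME_LEN):
    """Truncate, replacing the tail with a numeric suffix on a clash."""
    seen = set()
    result = []
    for name in names:
        candidate = name[:max_len]
        counter = 1
        while candidate in seen:
            suffix = str(counter)
            # Shorten the stem so stem + suffix still fits in max_len.
            candidate = name[:max_len - len(suffix)] + suffix
            counter += 1
        seen.add(candidate)
        result.append(candidate)
    return result


original = ["id", "longname1", "longname2", "longname3", "longname4"]
truncated = truncate_names(original)
print(truncated)
print(find_collisions(truncated))
print(make_unique(original))
```

Running find_collisions on the truncated names before writing the file
would be enough to flag the problem, even without renaming anything.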

I can replicate the problem with a dataset that, in EpiData, has
variables called longname1, longname2, longname3 and longname4.  Once
exported to Stata, all the variables are called longname yet still
contain their original data.  Although I can see the contents of all four
variables in list or browse, I can only summon the first variable for use
in a command (see output below).

What surprised me is that Stata will open the dataset at all.  I assume
that the variable names we see and use are not what Stata uses internally
to refer to the variables, but the mapping between my names and Stata's
seems to have gone very wrong!

Cheers,

Emma




. use dataepi_export_tests2,clear
(Data file created by EpiData based on dataepi_export_tests.rec)

. desc

Contains data from dataepi_export_tests2.dta
  obs:            10                          Data file created by EpiData
                                                based on dataepi_export_tests.rec
 vars:             5                          05 Apr 2007 11:45
 size:           190 (99.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              int    %4.0f                  ID
longname        int    %4.0f                  LONGNAME4
longname        int    %4.0f                  LONGNAME4
longname        str1   %1s                    LONGNAME4
longname        double %16.0f                 LONGNAME4
-------------------------------------------------------------------------------
Sorted by:  

. list

     +------------------------------------------------+
     | id   longname   longname   longname   longname |
     |------------------------------------------------|
  1. |  1          1          0          a          1 |
  2. |  2          0          1          b          6 |
  3. |  3          1          0          c          3 |
  4. |  4          0          1          d          2 |
  5. |  5          1          0          e          5 |
     |------------------------------------------------|
  6. |  6          0          1          f          7 |
  7. |  7          1          0          g         10 |
  8. |  8          0          1          h          4 |
  9. |  9          1          0          i         10 |
 10. | 10          0          1          j         11 |
     +------------------------------------------------+

. summ

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          id |        10         5.5     3.02765          1         10
    longname |        10          .5    .5270463          0          1
    longname |        10          .5    .5270463          0          1
    longname |         0
    longname |        10    5.843498    3.586145    1.22553    11.0888

. tab longname longname

           |       LONGNAME4
 LONGNAME4 |         0          1 |     Total
-----------+----------------------+----------
         0 |         5          0 |         5 
         1 |         0          5 |         5 
-----------+----------------------+----------
     Total |         5          5 |        10 



