Statalist The Stata Listserver


st: Importing foreign files - mild suggestion

From   "Allan Reese (Cefas)"
Subject   st: Importing foreign files - mild suggestion
Date   Thu, 6 Jul 2006 10:34:56 +0100

Peggy Chrisman originally wrote asking for advice on how to read an SPSS file into Stata:
"I have received several SPSS files. I do not have SPSS nor do I have the Stata file transfer program."

The discussion spun off in several directions, but all were apparently based on the premise that it was Peggy's role here to determine what she had been given.  E.g., my esteemed friend Nick Cox wrote (5 Iúil 2006):
> In addition, the Stata command -hexdump- lets you look
> at any kind of file and see what kind of beast you have.
and equally E F Ronán Conroy (who seems to have translated Nick's dateline to Gaelic) added:
> Many people feel that they won't understand what's inside files in  
> proprietory formats. In fact, you can figure out a lot.

With respect, you shouldn't have to, and this is not a sensible practice to advocate to students.  If you are sent files in a proprietary format with which you are not familiar, the first response should be to ask the SENDER to convert the files to a common format.  Approaching the task as an academic puzzle is (a) time-wasting and (b) error-prone.  We (the IT/data handling community) ought to be promoting a culture where people understand the implications of using proprietary or local standards.

Datasets also need metadata.  While a .DTA (Stata) or .SAV (SPSS) file provides a mechanism for including much of the metadata in the single file, how often is that mechanism actually used?

This has been a hobbyhorse of mine for many years, sparked in particular by a senior manager who naively complained to the computing service where I worked that he didn't want to understand how to send emails; he just wanted to type in Word and "press a button" so that the recipient would see exactly the same on their screen (assuming they had Word and the same character set).  Another senior manager "saved us money" by ordering a PC and then exploded because it didn't have Word: he'd never thought of software as a cost item.  A UK national committee of librarians in the 1990s described a scenario where an undergraduate would not go to lectures: "she was in her study bedroom and typed 'Gunter Grass' into the search engine to receive all the information she needed."  Whether the German sources distinguished Gunter with/without an umlaut was not considered.

Peggy, can you go back to your source and ask them to provide the data as standard ASCII, in fixed-format or delimited fields (see the Stata or SPSS documentation), with a full codebook and metadata?  If they can't, ask them how they know what values they are working with anyway (but prepare to dodge missiles!).

Allan Reese

