Stata The Stata listserver

Re: st: Converting from EBCDIC to ASCII


From   Daniel Feenberg <feenberg@nber.org>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Converting from EBCDIC to ASCII
Date   Thu, 14 Oct 2004 16:39:19 -0400 (EDT)

On Thu, 14 Oct 2004, Daniel Egan wrote:

> Hello list, 
> 
> I have a huge dataset which is in EBCDIC that I need to get into
> Stata. I have been told that I can do this using SAS as an
> intermediary, translating from EBCDIC to ASCII.

We use SAS often for this, especially where the EBCDIC data includes
packed decimal or zoned decimal data. In that case, SAS is the only
package we have available (Bill Gould, please take note). A disadvantage
of SAS is the 200 character limit on character variables, which means you
have to divide the record up into chunks and translate each separately.
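If SAS is not available, a packed decimal field can also be unpacked by hand. Here is a minimal sketch in Python, assuming the usual IBM layout: two decimal digits per byte, with the sign in the low nibble of the last byte (0xB or 0xD negative, anything else positive). The function name and the scale argument are my own, for illustration only, and you would need the record layout to know where each field starts and how many implied decimal places it has.

```python
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode one IBM packed decimal (COMP-3) field.

    raw   -- the bytes of the field, exactly as they sit in the EBCDIC file
    scale -- number of implied decimal places (taken from the record layout)
    """
    if not raw:
        raise ValueError("empty packed decimal field")

    # Every byte holds two digits, except the last one, which holds
    # one digit in its high nibble and the sign in its low nibble.
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)
    sign_nibble = raw[-1] & 0x0F

    if any(d > 9 for d in digits):
        raise ValueError("invalid digit nibble in packed decimal field")

    value = int("".join(str(d) for d in digits))
    if sign_nibble in (0x0B, 0x0D):  # negative sign codes
        value = -value

    # Shift the decimal point left by the implied number of places.
    return Decimal(value).scaleb(-scale)
```

For example, the three bytes 0x12 0x34 0x5C decode to 12345, and the two bytes 0x12 0x3D with scale=2 decode to -1.23. Packed fields have to be pulled out this way before any character translation, since translating them as text destroys them.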

> Does anyone have any experience with this, or can point me towards a
> generic translator? I just want to see if there is any easier way than
> SAS.

The dd command in Unix can do this conversion for most situations. The
chief problem we have noticed is that input bytes that are not valid
EBCDIC characters are dropped, with no placeholder substituted. If your
EBCDIC dataset is fixed format and (for example) missing fields are padded
with nulls (very common), the records come out shorter, the fixed-column
layout is lost, and the dataset is unusable.
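One way around the record-length problem is to translate record by record and substitute a blank for anything unprintable. A rough sketch in Python follows, assuming the file is in code page 037 (US/Canada EBCDIC; yours may use another, such as cp500) and that the records are fixed length. The function names and the record_length argument are hypothetical. This applies only to character fields; packed or binary fields will be blanked out.

```python
def ebcdic_record_to_ascii(record: bytes, codepage: str = "cp037") -> str:
    """Translate one EBCDIC record to ASCII without changing its length."""
    # The code page is single-byte, so each input byte becomes one character.
    text = record.decode(codepage)
    # Keep printable ASCII; replace everything else (nulls, control
    # characters, non-ASCII letters) with a space so columns stay aligned.
    return "".join(ch if " " <= ch <= "~" else " " for ch in text)


def convert_file(src: str, dst: str, record_length: int,
                 codepage: str = "cp037") -> None:
    """Convert a fixed-length-record EBCDIC file to one ASCII line per record."""
    with open(src, "rb") as fin, \
            open(dst, "w", encoding="ascii", newline="\n") as fout:
        while True:
            record = fin.read(record_length)
            if not record:
                break
            fout.write(ebcdic_record_to_ascii(record, codepage) + "\n")
```

Because every input byte yields exactly one output character, the column positions in the record layout stay valid for a fixed-format read in Stata.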

The command is:

    dd conv=ascii <ebcdic.in >ascii.out

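Note that dd also accepts conv=ebcdic and conv=ibm, but both of those go
the other way, from ASCII to EBCDIC (conv=ibm uses a slightly different
EBCDIC table). For EBCDIC input, conv=ascii is the one to use.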


Daniel Feenberg
feenberg isat nber dotte org

> 
> 
> Cheers, and thanks, 
> 
> 
> Dan Egan




