The Stata listserver

RE: st: Converting from EBCDIC to ASCII


From   "McKenna, Timothy" <Timothy.McKenna@nera.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Converting from EBCDIC to ASCII
Date   Thu, 14 Oct 2004 17:52:36 -0400

You could also use the text editor VEDIT (www.vedit.com).  It lets you view a file under several different character translations of the underlying bytes.

1.  Load the file as usual (it will still look like garbage, because the default display mode is ASCII).
2.  Use the View -> Toggle Display Mode menu option (or Alt-d) about 8 times until it gets to EBCDIC mode (the current mode is displayed in the lower left of the app window).
3.  You will probably need to set the record length (the number of characters per line).  This information should be in whatever data dictionary you got; if not, it is easy enough to find by experimenting.
4.  Then save it using the WYSIWYG conversion option of the File menu.

There is a description of this whole process in the help menu.
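
For anyone without VEDIT, the same steps (pick an EBCDIC translation, set the record length, save as text) can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the code page cp037 (US/Canada EBCDIC) is an assumption, so check the data dictionary for the code page and record length that actually apply.

```python
def ebcdic_records_to_text(data: bytes, record_length: int,
                           codepage: str = "cp037") -> str:
    """Split fixed-length EBCDIC records and return one text line per record.

    codepage="cp037" is an assumption; other EBCDIC code pages exist.
    """
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    if len(data) % record_length != 0:
        # A leftover partial record usually means the record length is wrong.
        raise ValueError("data length is not a multiple of record_length")

    lines = []
    for start in range(0, len(data), record_length):
        record = data[start:start + record_length]
        lines.append(record.decode(codepage))
    return "".join(line + "\n" for line in lines)


# Usage with two 5-byte records built in memory (not real data).
sample = "HELLO".encode("cp037") + "WORLD".encode("cp037")
print(ebcdic_records_to_text(sample, record_length=5), end="")
```

The length check does the "experimenting" from step 3: a wrong record length will normally leave a partial record at the end of the file and raise an error. This only works for character data; packed decimal fields need separate handling.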

If you have a large number of similar files, I would (and do) use SAS to convert them.  VEDIT cannot handle packed decimal, so you need to use SAS there as well.

-Tim




-----Original Message-----
From: Daniel Feenberg [mailto:feenberg@nber.org]
Sent: Thursday, October 14, 2004 4:39 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Converting from EBCDIC to ASCII


On Thu, 14 Oct 2004, Daniel Egan wrote:

> Hello list, 
> 
> I have a huge dataset which is in EBCDIC that I need to get into
> Stata. I have been told that I can do this using SAS as an
> intermediary, translating from EBCDIC to ASCII.

We use SAS often for this, especially where the EBCDIC data includes
packed decimal or zoned decimal data. In that case, SAS is the only
package we have available (Bill Gould, please take note). A disadvantage
of SAS is the 200 character limit on character variables, which means you
have to divide the record up into chunks and translate each separately.
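
To show why packed and zoned decimal fields need more than a character translation, here is a minimal Python sketch of decoding both. It assumes the usual IBM layouts (packed: two digits per byte with the sign in the last half-byte; zoned: one digit per byte with the sign in the upper half of the last byte). The function names and the scale parameter are hypothetical, and the field positions and implied decimal places have to come from the data dictionary.

```python
from decimal import Decimal

NEGATIVE_SIGNS = {0xB, 0xD}           # sign nibbles conventionally meaning "minus"
VALID_SIGNS = {0xA, 0xB, 0xC, 0xD, 0xE, 0xF}


def _to_decimal(digits, negative, scale):
    if any(d > 9 for d in digits):
        raise ValueError("invalid digit nibble")
    value = int("".join(str(d) for d in digits))
    if negative:
        value = -value
    # scale = number of implied decimal places
    return Decimal(value).scaleb(-scale)


def unpack_packed_decimal(field: bytes, scale: int = 0) -> Decimal:
    """Decode a packed decimal field: two digits per byte, sign in last nibble."""
    if not field:
        raise ValueError("empty field")
    digits = []
    for byte in field[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(field[-1] >> 4)
    sign = field[-1] & 0x0F
    if sign not in VALID_SIGNS:
        raise ValueError("invalid sign nibble")
    return _to_decimal(digits, sign in NEGATIVE_SIGNS, scale)


def unpack_zoned_decimal(field: bytes, scale: int = 0) -> Decimal:
    """Decode a zoned decimal field: one digit per byte, sign in last zone nibble."""
    if not field:
        raise ValueError("empty field")
    digits = [byte & 0x0F for byte in field]
    sign = field[-1] >> 4
    if sign not in VALID_SIGNS:
        raise ValueError("invalid sign nibble")
    return _to_decimal(digits, sign in NEGATIVE_SIGNS, scale)


# Usage: 0x12 0x34 0x5C is +12345; with two implied decimals that is 123.45.
print(unpack_packed_decimal(bytes([0x12, 0x34, 0x5C]), scale=2))
print(unpack_zoned_decimal(bytes([0xF1, 0xF2, 0xD3])))
```

Running such fields through a plain EBCDIC-to-ASCII translation would scramble them, which is why they have to be cut out of each record and decoded separately from the character fields.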

> Does anyone have any experience with this, or can point me towards a
> generic translator? I just want to see if there is any easier way than
> SAS.

The dd command in Unix can do this conversion for most situations. The
chief problem we have noticed is that non-EBCDIC values in the input data
are dropped with no placeholder substituted. If your EBCDIC dataset is
fixed format, and (for example) missing fields are packed with nulls (very
common), the result will be a shorter and unusable dataset.

The command is:

    dd conv=ascii <ebcdic.in >ascii.out

or 

    dd conv=ibm   ...

I don't really know the difference between the two conversions.
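
On the difference: going by the POSIX description of dd, conv=ascii translates EBCDIC to ASCII, while conv=ebcdic and conv=ibm both translate ASCII to EBCDIC using two slightly different tables. So conv=ascii is the one that matches the direction asked about here.

If the shortened-record problem described above is a concern, one alternative is a byte-for-byte translation that writes a placeholder for anything that does not map to a printable ASCII character. The Python sketch below is a hypothetical helper, not a replacement for dd. It assumes code page cp037 and character-only data.

```python
def ebcdic_to_ascii_keep_length(data: bytes, codepage: str = "cp037",
                                placeholder: str = " ") -> bytes:
    """Translate EBCDIC bytes to ASCII, one output byte per input byte.

    Nulls and other bytes without a printable ASCII equivalent become
    `placeholder`, so fixed-format column positions are preserved.
    codepage="cp037" is an assumption; use the code page of your data.
    """
    if len(placeholder) != 1 or not placeholder.isascii():
        raise ValueError("placeholder must be a single ASCII character")

    text = data.decode(codepage, errors="replace")
    out = []
    for ch in text:
        if ch.isascii() and ch.isprintable():
            out.append(ch)
        else:
            out.append(placeholder)
    return "".join(out).encode("ascii")


# Usage: a 5-byte record whose middle field is packed with nulls.
record = "AB".encode("cp037") + b"\x00\x00" + "C".encode("cp037")
converted = ebcdic_to_ascii_keep_length(record)
print(converted)                        # b'AB  C'
print(len(converted) == len(record))    # True
```

Because every input byte yields exactly one output byte, the column positions in the data dictionary still apply to the converted file.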


Daniel Feenberg
feenberg isat nber dotte org

> 
> 
> Cheers, and thanks, 
> 
> 
> Dan Egan


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/ 
  
 



