Converting EBCDIC to ASCII in SSIS 2008 can seem like a daunting task. There is a lot of manual work involved in creating a data flow to move data from an EBCDIC system like AS400 or DB2 to an ASCII system like a SQL database. It can be accomplished, however, with no Script Tasks or Script Components: the native built-in components, with the proper setup, will convert EBCDIC to ASCII. This is true even if the file has packed decimal (Comp-3) fields.
    The first thing you need is a definition of the EBCDIC file. You will need to know where the columns are in the file and their widths. You will also need to know the type of data in each column: Packed, Regular, or Zoned. The file in this example is 1100 characters across. You will only define a few columns to simplify the example.
    Here is the column layout of the EBCDIC file:
    You will need to create a Flat File Connection in the connection manager of your package. Set the Format to Fixed Width and the Code page to the proper code page of your EBCDIC file. The most popular in the US is 1140 IBM EBCDIC US/Canada 37 + Euro.


    Click on the Columns node in the Flat File Connection Manager editor. This is the long, tedious part of the process. You have to map every field manually, which is why you need the definition of the file to tell you where the columns are and what their data types are.
    The first step is to set the Row Width at the bottom of the editor. Then you will need to click in the column screen to add the column dividers. This needs to be checked and double-checked because it cannot be altered later. If you make a mistake here and close the connection manager, you must click Reset Columns and start all over again. The editor does not let you make changes to the columns after they are saved. Hopefully this changes in a future release.

    For the Regular fields you will see the data showing normally. In the packed decimal fields you will see odd characters, and the zoned fields will show characters as well. If everything is strange characters, you might have the wrong code page selected. Go back to the General page and try changing the code page if the data is all unreadable.
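What the code page setting does can be sketched outside SSIS. Here is a minimal Python check of raw EBCDIC bytes against a candidate code page (cp037 is Python's name for IBM EBCDIC US/Canada; code page 1140 is the same layout plus the Euro sign):

```python
# Decode a few raw EBCDIC bytes with a candidate code page.
# If the wrong code page is chosen, the text comes out unreadable,
# which is the same symptom described above.
record = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])  # EBCDIC encoding of "HELLO"
print(record.decode("cp037"))  # HELLO
```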

    After setting the columns you will need to set the column types and name the columns. Click on the Advanced node in the Flat File editor and select the first column. In this example you are only defining five of the columns, so there will be unnamed columns between each column you are ignoring. For each column that is Regular or Zoned data, set the data type to Unicode String. For the packed decimals, set the data type to Byte Stream and the output width to a larger number than the input to handle all of the extra data in the packed field. Double the size should be enough, but it does not hurt to increase it a little more to be safe. The packed field width in this example is 5 and the output is 50.
    Click on the Preview node in the editor and your preview should show readable data in all the fields except the packed decimal (Comp-3) fields. In this example the Invoice Date and the Invoice Amount are packed decimal and therefore show strange characters. Click OK to save the connection manager.
    In a data flow, drag in a Flat File Source and select the newly created connection manager. In this example you uncheck the undefined columns in the Columns node of the Flat File Source. This way you are only dealing with the columns you care about in the data flow.
    You need to make one more change to the flat file source. Right click on the Flat File source and select Show Advanced Editor. Click the Input and Output Properties tab on the top right. Click the plus to expand the Flat File Source output and the Output Columns. Select the packed decimal fields and change the UseBinaryFormat property to True. Now click Ok to save the source.
    Below is the data viewer right after the Flat File Source. You can see the hexadecimal fields coming through, and they are readable. The Vendor Name is readable, and the Rate Unit is a number with a trailing character.

    The next transform will be a Data Conversion transform. The fields that are packed decimal (Comp-3) can now be converted to a Unicode string. This will create a new column for each field. These new columns will be in text format and can be edited with Derived Column transforms.
    Below is a data viewer image showing the data after the data conversion transform. Notice the hexadecimal value in the original columns and the string values in the new converted columns. These new columns can now be altered with derived columns to convert the dates and money amounts before they are written to a SQL database.

    The next transform will be a Derived Column transform. You will use this to split the data in the Unicode string columns. The packed decimal fields have a string of numbers with a single character in the last position. This character indicates the sign of the number.

C = Signed Positive
D = Signed Negative
F = Unsigned
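These sign codes live in the low nibble of the last byte of a Comp-3 field. Here is a minimal sketch in Python of how such a field unpacks (`unpack_comp3` is a hypothetical helper, not part of SSIS):

```python
def unpack_comp3(raw: bytes) -> int:
    """Unpack a packed-decimal (Comp-3) field: two digits per byte,
    with the low nibble of the last byte holding the sign
    (0xC = positive, 0xD = negative, 0xF = unsigned)."""
    nibbles = []
    for b in raw:
        nibbles.append(b >> 4)
        nibbles.append(b & 0x0F)
    sign = nibbles.pop()  # the last nibble is the sign
    value = int("".join(str(n) for n in nibbles))
    return -value if sign == 0x0D else value

# 0x12 0x34 0x5C unpacks to digits 1,2,3,4,5 with a positive sign
print(unpack_comp3(bytes([0x12, 0x34, 0x5C])))  # 12345
print(unpack_comp3(bytes([0x12, 0x34, 0x5D])))  # -12345
```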

    This first derived column is simply going to split the sign character from the numbers. The date field has this sign character, but it is not used, so all you need from that column is the number. The RateUnit column is a zoned decimal, so you will need to split the last character from that as well. Here are the expressions used to accomplish this task. You also multiply the amount field to apply the proper number of decimal places defined for the field; in this example it has 2 decimal places.

(DT_I4)(SUBSTRING(wstrInvoiceAmount,1,LEN(wstrInvoiceAmount) - 1)) * .01
SUBSTRING(REVERSE(wstrInvoiceAmount),1,1)
SUBSTRING(wstrInvoiceDate,2,LEN(wstrInvoiceDate) - 1)
SUBSTRING(RateUnit,1,LEN(RateUnit) - 1)
    From this derived column you get four new columns: the number from the amount and rate, and the sign character from the amount and rate. The date column has the leading zero and the ending character removed from the date string.
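The same split can be sketched in Python; the sample value `0012345C` is hypothetical, not taken from the example file:

```python
from decimal import Decimal

amount_raw = "0012345C"                  # digits plus a trailing sign character
number = Decimal(amount_raw[:-1]) / 100  # drop the sign, apply 2 decimal places
sign_char = amount_raw[-1]               # 'C', 'D', or 'F'
print(number, sign_char)  # 123.45 C
```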
    In the next Derived Column transform you will set the sign for the amount and rate columns and convert the string date to the date data type. Below is the code to accomplish this. In this code you check the sign character of the amount for the letter "D". If it is "D" then you multiply the amount by negative one, else you leave it positive. For the rate, which is a zoned digit, you check the last character: the letters A-I and the open curly bracket {, which indicate a positive number, and the letters J-R and the close curly bracket }, which indicate a negative number. There are other possibilities in zoned digits, but in this example you are assuming these are the only ones. You could save the letters and the bracket in a variable and use the variable instead of hard coding the letters in the expressions. This would be the best practice. The rate may also need to be multiplied by .01 to set the decimal places; in this example it is an integer.
    The date string is broken up into substrings, the hyphens are added, and then all of that is type cast to a date. You can go into the Configure Error Output and set it to ignore errors for bad dates. If you set it to ignore errors, any bad dates will be null. You could then check for nulls in another Derived Column afterwards and set a default date if you need.

wstrInvoiceSign == "D" ? numInvoiceAmount * -1 : numInvoiceAmount
(DT_DBDATE)(SUBSTRING(wstrInvoiceDate,1,4) + "-" + SUBSTRING(wstrInvoiceDate,5,2) + "-" + SUBSTRING(wstrInvoiceDate,7,2))
FINDSTRING("{ABCDEFGHI",wstrRateChar,1) > 0 ? intRate : intRate * -1
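A sketch of both sign rules in Python. The `apply_packed_sign` and `decode_zoned` helpers and the sample values are hypothetical; note that `decode_zoned` also keeps the digit encoded in the zoned sign character, as one of the comments below points out:

```python
ZONED_POSITIVE = "{ABCDEFGHI"   # final characters for +0 .. +9
ZONED_NEGATIVE = "}JKLMNOPQR"   # final characters for -0 .. -9

def apply_packed_sign(amount, sign_char):
    # Mirrors the first expression: "D" means negative, anything else positive.
    return -amount if sign_char == "D" else amount

def decode_zoned(text):
    """Decode a zoned-decimal string such as '1234E': the final
    character carries both the sign and the last digit."""
    last = text[-1]
    if last in ZONED_POSITIVE:
        return int(text[:-1] + str(ZONED_POSITIVE.index(last)))
    if last in ZONED_NEGATIVE:
        return -int(text[:-1] + str(ZONED_NEGATIVE.index(last)))
    return int(text)  # plain digits, no overpunched sign

print(apply_packed_sign(123.45, "D"))  # -123.45
print(decode_zoned("1234E"))           # 12345
print(decode_zoned("1234N"))           # -12345
```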

    Here is the final data viewer, and these columns can be mapped to a SQL table. If you want the columns to be VarChar instead of NVarChar, you will need to type cast the columns with the non-Unicode type cast (DT_STR) in the last Derived Column. You might also need to set some of the derived columns to ignore errors so bad dates or bad numbers can be passed through as nulls.
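The ignore-errors behaviour for bad dates can be sketched in Python (`parse_packed_date` is a hypothetical helper mirroring the DT_DBDATE expression above):

```python
from datetime import date

def parse_packed_date(text: str):
    """Build a date from a 'YYYYMMDD' string, returning None for
    bad dates instead of failing the row."""
    try:
        return date(int(text[0:4]), int(text[4:6]), int(text[6:8]))
    except ValueError:
        return None

print(parse_packed_date("20240229"))  # 2024-02-29 (valid leap day)
print(parse_packed_date("20240931"))  # None (September has 30 days)
```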
Let me know if you have any EBCDIC to ASCII issues. I love a challenge. If you are an EBCDIC master and you see anything I have wrong here, please let me know.

This entry was posted in SSIS, Syndication.

10 Responses to Convert EBCDIC to ASCII in SSIS

  1. Nikki says:

    Hi, I need to export data from my SQL Server database in packed Comp-3 EBCDIC format. I have the added problem that each record is not fixed, by which I mean that the basic record is a fixed 91, but a column in this defines the number of slices to follow, between 1 and 30 slice records of 452 each.
    I have successfully read the data in using a script component and converted/unpacked each field, checking the basic record for how many slices to read. Having manipulated the data I now need to pack up and export the data again … any ideas?

  2. Brandon Forest says:


    I just started a new position and one of the developers threw me an SSIS ETL to start. He’s got the Flat File Source configured with the EBCDIC 37 code page, which is giving a clean preview. The sink is an OLE DB Destination to a SQL 2008 R2 table with 4 varchar columns. The preview for this is still in EBCDIC format (gobbledegook). I’ve never done this particular conversion before, so any suggestions would be appreciated.


  3. Elroy says:

    Daunting indeed. Curly braces and letters not only indicate the sign, but also include the least significant digit. { and A thru I represent positive numbers 0 thru 9. } and J thru R represent negative numbers 0 thru 9. So simply stripping those characters is very bad.

  4. Anjan says:

    Hey Mike,

    Everything looks good in your explanation. But how are you handling the string characters, I mean names? Assume the flat file has Comp-3 data and also name fields like you have in your example. When you use the 1140 code page then the Name field data type also changes to the 1140 code page and it will be a string. But how will you load it into the SQL table which has a name field with varchar(25)?

    You are only dealing with date fields. But there might be some other fields also which need to be considered, I guess. I hope you understood.

  5. thomas pearson says:

    is there a conversion technique in SQL to convert a base64 element of an xml file that is encoded from codepage 1047 ebcdic?

  6. causacn says:

    A big challenge is how to handle the case where the packed fields in the COBOL record layout are not determined, which basically means there are a number of record layouts in the mainframe COBOL file and the fixed-width flat file approach cannot work.

    Any thought?

  7. causacn says:

    Hi Mike,

    Any experience of loading a SAS mainframe file (fixed-length) into a SQL Server table? The file is not PC SAS, it is mainframe SAS.

  8. pandi says:

    Could you please send the package with the source data file and table structure?

  9. Enaya says:


    I am facing an issue in transforming the data. My data is in EBCDIC format and I converted the data into ASCII, but the output is supposed to be in HEX. I tried it using this article but am not able to load the data into the table.

  10. Enaya says:


    I am unable to get the data into the table. My file is in EBCDIC format and the output is supposed to be in HEX format. I am not able to see the output and am not even getting any error.
