Loop Through Excel Files in SSIS

You can loop through excel files using SSIS. This will use the For Each Loop container and a data flow task.

clip_image002

First Create a variable named strExcelfile as a string variable; you can leave the value blank.

clip_image004

Next, drag in a For Each Loop. Set it to For Each File, and point it to the folder where the excel files exist and type .xls or .xlsx for the file type. In this example the excel files are in c:\test\excelfiles. There are three files named USCustomers1.xls, USCustomers2.xls, and USCustomers3.xls.

clip_image006

Next, drag in a data flow task and drop it in the loop container. Open the Data flow and drag in an Excel Source. Set the Source to one of the excel files in the folder above. This will set the column names and the metadata for the files. Each file in the loop must have the same lay out. If they have different column widths are data types then you cannot use this technique.

Now you will map the excel file to the connection. Click on the Excel connection in the connection manager. This was created when you created the Excel source in the Data Flow. Click on Expressions in the properties windows and open the expressions editor for the Excel connection manager. Select the File Path property and drag in the strExcelFile variable.

clip_image008

One Last step is to set the data flow to delay validation. This is so the data flow will not check for the excel file until after the file name is loaded into the variable.

clip_image010

Advertisements
This entry was posted in SSIS, Syndication and tagged . Bookmark the permalink.

17 Responses to Loop Through Excel Files in SSIS

  1. pure.logic says:

    after quite some time elaborating, this method did the trick,
    thank you!

  2. Jaime says:

    It’s important to note that the “Delay Validation”:True setting, has to be on the Data Flow Task itself, not on the Excel Connection Manager.

    • MikeDavisSQL says:

      Thanks for clarifying this.

    • Frank says:

      This is a VERY important detail. You will need to set the “DelayValidation” to True in Expressions for both the Data Flow Task, and any archive file system task if you have one. Click on the Task, look at properties, click on “Expressions”, and add the “DelayValidation” from the drop down menue, then type “True” without the quotes for the expression.

  3. Tyrone says:

    I have follwed your instructions and got the following error on the Excel Source:
    Information: 0x4004300A at Data Flow Task, SSIS.Pipeline: Validation phase is beginning.
    Error: 0xC0202009 at CSIDataMiningNewTesting3B, Connection manager “Excel Connection Manager 1”: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005.
    An OLE DB record is available. Source: “Microsoft JET Database Engine” Hresult: 0x80004005 Description: “The Microsoft Jet database engine cannot open the file ”. It is already opened exclusively by another user, or you need permission to view its data.”.
    Error: 0xC020801C at Data Flow Task, Excel Source CSI Forms [190]: SSIS Error Code DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER. The AcquireConnection method call to the connection manager “Excel Connection Manager 1” failed with error code 0xC0202009. There may be error messages posted before this with more information on why the AcquireConnection method call failed.
    Error: 0xC0047017 at Data Flow Task, SSIS.Pipeline: component “Excel Source CSI Forms” (190) failed validation and returned error code 0xC020801C.
    Error: 0xC004700C at Data Flow Task, SSIS.Pipeline: One or more component failed validation.
    Error: 0xC0024107 at Data Flow Task: There were errors during task validation.
    Warning: 0x80019002 at Foreach Loop Container CSI-Access Database Table Loading: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED. The Execution method succeeded, but the number of errors raised (5) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the MaximumErrorCount or fix the errors.
    Warning: 0x80019002 at CSIDataMiningNewTesting3B: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED. The Execution method succeeded, but the number of errors raised (5) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the MaximumErrorCount or fix the errors.
    SSIS package “CSIDataMiningNewTesting3B.dtsx” finished: Failure.
    The program ‘[1664] CSIDataMiningNewTesting3B.dtsx: DTS’ has exited with code 0 (0x0).

  4. Tyrone says:

    Thanks Mike, I think SQL Server 2008 R2 is installed as 64bit, will I have to uninstall and re-install in 32Bit? I have change the Debug Options to Run64BitRuntime to equal “False” but I get the same results

    • MikeDavisSQL says:

      Actually Tyrone it looks like the file is open already or a permissions issues Your error says “The Microsoft Jet database engine cannot open the file ”. It is already opened exclusively by another user, or you need permission to view its data.”.”

  5. Tyrone says:

    I thought that also Mike, so I checked to make sure I did not have it open in any other window. This has been a great pain for us here and has taken a lot of time to research. I’d like to think you so much for your replies and suggestions. In my opinion your the only one I have gotten any good information from, I’m hoping to catch up with you at an event so we can speak. I’d alos like to take any class you might be having soon…

    I will check the fact that this file might be open somewhere but I’m pretty sure that it’s not. I’ll be touching base with you very soon (tomorrow…LOL) this project is so far behind schedule because of this stumbling block… Than you so much for your help…

  6. Tyrone says:

    Mike the strange thing is there is a derived column object in the Data Flod Task which is reading the next Excel file and creating a new column, the other columns still from the first Ecel file read…

    This makes me think that your first reply may be correct there is an issue with 32Bit vs 64Bit, is the best course of action to uninstall SQL 2008 R2 SSIS and make sure the 32Bit tools are installed?

  7. nitingupta021@hotmail.com says:

    thanks Mike, but it picks same file again and again using the loop count = number of .xls files in specified folder path. is there anything missing here?

  8. nitin says:

    missed to mention in earlier post that i executed it by setting the first file name (with file path) to the variable, if i do not set that then validation error comes up.
    thanks.

  9. emmo says:

    Hi Mike, I followed this through but without success.

    If I have 5 files in the excel folder, my app just succeeds in loading the first excel file (mentioned in the connection) five times..
    So it loops through the files – the names appear in the variables.. I print the names to the Output window, but it only seems to process the first file again and again – for as many times as it finds files that match it’s pattern.

    I’m just… ????? ??? ? confused.
    Emmet

    • Gil says:

      I experienced the same issue of the same file being processed once for each file in the directory I was looping through. The solution is to set the ‘Evaluate as expression’ setting on the file name parameter to ‘FALSE’. Counter-intuitive, but it does the trick.

  10. Richard says:

    Hi Mike /Gil
    I’m currently experiencing the same problem as Emmet and have yet to find a solution.

    I have ‘Evaluate as expression’ set to ‘False’ for my file name parameter,
    I have ‘Delay validation’ set to ‘True’ for both my Data Flow Task and Connecton Manager
    and I have tried setting ‘Run64BitRuntime’ to equal ‘True’ and ‘False’.

    The Foreach Loop Container continues to read through the first file four times, returning a different file name in the Derived Column I created for each iteration… Do you have any further advice on how to fix this?

    Thanks,
    Richard

  11. Richard says:

    “Hi Mike /Gil
    I’m currently experiencing the same problem as Emmet and have yet to find a solution.

    I have ‘Evaluate as expression’ set to ‘False’ for my file name parameter,
    I have ‘Delay validation’ set to ‘True’ for both my Data Flow Task and Connecton Manager
    and I have tried setting ‘Run64BitRuntime’ to equal ‘True’ and ‘False’.

    The Foreach Loop Container continues to read through the first file four times, returning a different file name in the Derived Column I created for each iteration… Do you have any further advice on how to fix this?

    Thanks,
    Richard”
    —————————–
    Aplogies, this seems to occur when the ExcelFilePath is not set to the file name variable.

    The issue is actually, when the ExcelFilePath IS set as the file name variable, the connection manager can no longer find the file.
    DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER. The AcquireConnection method call to the connection manager “ABC” failed with error code 0xC0202009.

    Thanks again

  12. You are my saviour! Before, I had to change the connection string (which was tedious) and I had to check for the IMEX and HDR. Yours is just simple! Thanks a bunch!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s