SSIS importing Excel files xls/xlsx - excel

I have a requirement to import information from an Excel file into a database.
I have a webpage that runs an SSIS package, which picks up an Excel file and loads its data into a database. The problem lies in the different types of Excel files to be processed: either xls or xlsx. The SSIS Excel connection manager lets you specify which type of Excel file you will be connecting to, either xls or xlsx, and you cannot use one connection manager for both types. For now, this forces the user to always convert an xlsx file to xls before processing it. Is there a way to dynamically change the connection manager based on the type of Excel file,
or should I just have two different SSIS packages and call the one that matches the file type being processed?

In SSIS 2008, you can set a Connection to a 2007 Excel file (.xlsx) and then use an Expression on the Connection Manager to set the ExcelFilePath to be the value of a variable. The value of this variable can be either type, 97-2003 (.xls) or 2007 (.xlsx) and the Excel Source will work, as long as the Sheet names are the same.
I'm not sure if this is the same behaviour in SSIS 2005.
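If the package is launched from a webpage, the calling code may want to decide which kind of workbook it received before it sets the variable. The following is a minimal Python sketch of that check, not part of the original answer. It assumes the upload is available as a local file and looks at the file's leading bytes instead of trusting the extension. The function name detect_excel_format is hypothetical. Note that these signatures identify the container only: any ZIP archive starts with the same bytes as an .xlsx, and other legacy Office files share the .xls header.

```python
# Leading bytes of the two container formats.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy binary container (.xls)
ZIP_MAGIC = b"PK\x03\x04"                        # ZIP container (.xlsx)


def detect_excel_format(path):
    """Return 'xls', 'xlsx', or 'unknown' based on the file's first bytes.

    Hypothetical helper: it only recognises the container format, so a
    renamed .zip would also be reported as 'xlsx'.
    """
    with open(path, "rb") as handle:
        header = handle.read(8)
    if header.startswith(OLE2_MAGIC):
        return "xls"
    if header.startswith(ZIP_MAGIC):
        return "xlsx"
    return "unknown"
```

The result could then drive whichever mechanism you choose, for example the value assigned to the variable behind ExcelFilePath.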

If you are already running the SSIS package from code, I would imagine this should be relatively easy to do. I have been fiddling around with editing packages from code over the last week or so, and it is pretty easy to modify variables etc. I know you can also access the connections and specify a dtsConfig file:
using (var p = app.LoadFromSqlServer(config.PackageName, config.SqlServerName, config.UserName, config.Password, null))
{
    // changing variables in code
    Variables vars = p.Variables;
    vars["FromDate"].Value = criteria.From;
    vars["ToDate"].Value = criteria.To;
    // using a config file in code
    p.ImportConfigurationFile(config.ConfigurationFile);
    DTSExecResult result = p.Execute();
    if (result != DTSExecResult.Success)
    {
        throw new ApplicationException("SSIS Package did not complete successfully.");
    }
}
You could potentially have two different config files, one for xlsx and one for xls connections, and use the appropriate config file based on the uploaded Excel file's extension.
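The selection step itself is small. Here is a minimal Python sketch of it, assuming two config files exist; the file names and the function choose_config_file are hypothetical placeholders, not names from the package above.

```python
import os

# Hypothetical config file names; substitute the real .dtsConfig paths.
CONFIG_BY_EXTENSION = {
    ".xls": "ExcelImport_xls.dtsConfig",
    ".xlsx": "ExcelImport_xlsx.dtsConfig",
}


def choose_config_file(uploaded_file_name):
    """Pick the config file that matches the uploaded file's extension."""
    extension = os.path.splitext(uploaded_file_name)[1].lower()
    if extension not in CONFIG_BY_EXTENSION:
        # Fail early instead of running the package with the wrong connection.
        raise ValueError(f"Unsupported Excel file type: {extension!r}")
    return CONFIG_BY_EXTENSION[extension]
```

The returned path would take the place of config.ConfigurationFile in the call to ImportConfigurationFile.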

Related

SSIS won't execute foreach loop for dynamic xlsx filename [duplicate]

I have an xlsx file that will be dropped into a folder on a monthly basis. The filename will change every month (filename_8292019) based on the date, which I cannot change.
I want to build a foreach loop to pick up the xlsx file and manipulate it (load it into a SQL Server table, then move the file to an archive folder). I cannot figure out how to do this with a dynamic filename (where the date changes).
I was able to successfully run the package when converting the xlsx to CSV, and also when pointing directly to the xlsx filename.
[Flat File Destination [219]] Error: Cannot open the datafile "filename"
OR errors relating to file not found
The Files: entry on the Collection tab of the Foreach Loop container will accept wildcard characters.
The general pattern here is to create a variable, say, FileName. Set your Files: to something like:
Files:
BaseFileName*
or, if you want to be sure to only pick up spreadsheets, maybe:
Files:
BaseFileName*.xlsx
Select either Name and extension or Fully qualified, which will include the full file path. I usually just use Name and extension and put the file path into another variable so when Ops tells me they're moving my drop location, I can change a parameter instead of editing the package. This step tells the container to remember the name of the file it just found so you can use it later for a variable mapping.
On the Variable Mappings tab, select your variable name and assign it to Index 0.
Then, for each spreadsheet, the container will loop, pick up the name of the first file it finds that matches your pattern, and assign the full name, with the date extension (and path, if you go that way), to your variable. Pass the variable as an input parameter to the tasks inside the loop and use it to process the file. Make sure that processing includes moving the file to the archive, or you'll get yourself into an infinite loop, processing the same file(s) over and over. <--Does that sound like the voice of experience? Yeah. Been there, done that.
Edit:
Here, the FullFilePath variable is just the folder name, without a file reference. (Red variable to red entry in the Folder box).
The FileBaseName variable drives what shows up in the Files box. (Blue to blue).
Another variable picks up the actual file name, with the date extension. Later, say in a File System Task, if I need the folder & file name together, I concatenate the variables.
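Outside of SSIS, the same pattern (wildcard match, per-file variable, process, then archive) can be sketched in a few lines of Python. This is only an illustration of the control flow, not a replacement for the Foreach Loop container. The function name process_drop_folder and the load_file callback are hypothetical, and load_file stands in for whatever the Data Flow Task does.

```python
import glob
import os
import shutil


def process_drop_folder(folder, base_file_name, archive_folder, load_file):
    """Process every workbook matching BaseFileName*.xlsx, then archive it.

    folder         -- drop location (kept separate from the file name)
    base_file_name -- fixed part of the name, e.g. "filename_"
    archive_folder -- where processed files are moved
    load_file      -- hypothetical callback that loads one file
    """
    pattern = os.path.join(folder, base_file_name + "*.xlsx")
    os.makedirs(archive_folder, exist_ok=True)
    processed = []
    for full_path in sorted(glob.glob(pattern)):
        file_name = os.path.basename(full_path)  # name with the date suffix
        load_file(full_path)
        # Moving the file out of the drop folder prevents reprocessing.
        shutil.move(full_path, os.path.join(archive_folder, file_name))
        processed.append(file_name)
    return processed
```

Keeping the folder and the base file name as separate arguments mirrors the two variables described above, so a change of drop location touches only one value.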
As far as the Excel Connection Manager error you're getting, unfortunately I'm no help. I don't use it. We have SentryOne's Task Factory for SSIS which includes a much more resilient Excel connector.

Check if Excel file saved in compatibility mode without using Excel

I have recently been experimenting with Perl and some modules to read Excel files, and in particular the format of their cells.
For example, I wrote a piece of Perl code that used the module ParseExcel to read a cell's background colour. However, while testing I noticed that for certain files the colour returned by my Perl program did not match the colour reported by Excel. Eventually I found the reason: the file I was reading was a .xls file saved in compatibility mode. Basically, the creator of the file had used the functionality of Excel .xlsx type files (2007+) to colour some of the cells and then saved the file with the old .xls file extension, which did not support the colours chosen.
So my question: is there any way to tell whether a given .xls file (or any other old Excel file format) has been saved in compatibility mode without using Excel to find out? The reason I ask is that I am working in a Linux environment and can't use any Windows tools to analyse the files.
Furthermore, if one could identify that a given Excel file has, indeed, been saved in compatibility mode, is there any way of knowing how the original colours were mapped to the ones that my program is telling me?
Many thanks for any help on this.
I do not think that you can do this using Spreadsheet::ParseExcel. I tried taking an .xlsx file with a coloured cell and saving it as an .xls with 2003 compatibility, then comparing it with an empty 2003 .xls, and I do not see any difference between my files.
You can try the following code to debug it with your own files trying to find a difference that you could use:
use strict;
use warnings;
use Spreadsheet::ParseExcel;
use Data::Dumper;
use JSON;
use Test::More tests => 1;
my $file_1 = 'test_xls.xls';
my $file_2 = 'compat_xls.xls';
my @files = (
    $file_1,
    $file_2,
);
my @workbooks;
foreach my $file (@files) {
    print("\n\nReading $file\n");
    my $parser = Spreadsheet::ParseExcel->new();
    my $workbook = $parser->parse($file);
    # print Dumper($workbook->{PkgStr});
    delete $workbook->{PkgStr};
    delete $workbook->{File};
    delete $workbook->{Worksheet}->[0]->{MinRow};
    delete $workbook->{Worksheet}->[0]->{RowHeight};
    delete $workbook->{Worksheet}->[0]->{_Pos};
    delete $workbook->{Worksheet}->[0]->{MinCol};
    delete $workbook->{Worksheet}->[0]->{MaxCol};
    delete $workbook->{Worksheet}->[0]->{MaxRow};
    delete $workbook->{Worksheet}->[0]->{Cells};
    delete $workbook->{Format}->[62];
    push @workbooks, $workbook;
}
my ($ok, $stack) = is_deeply($workbooks[0], $workbooks[1]);
my $diag = explain($stack);
print(Dumper($diag));

MATLAB 2015b broken ActiveX/Excel Controls

Having an issue involving the creation of ActiveX handles using MATLAB 2015b. Before updating (from 2013a) I used to create a new Excel application handle using the following 'try catch':
global Excel
try
    Excel = actxGetRunningServer('Excel.Application');
catch
    Excel = actxserver('Excel.Application');
end
Since updating to 2015b, the code still runs through without error, but now the Excel handle created, whilst still of type Excel_Application, has no properties. Calling Excel.get returns a struct with no fields.
Apart from the update, there haven't been any other changes made to the code, and the version of MS Office hasn't changed.
Have there been any changes in the way MATLAB handles the ActiveX interface, or is there something wrong with my code?

failed to import excel to sql server using ssis package

Hi, I'm developing an SSIS package that downloads Excel files (.xlsx) from an FTP server to a local folder and then imports them into a SQL Server table. I'm using a Foreach loop with a variable mapping to the file names. The download from the FTP server to the local folder works fine, but the import from the local folder to the SQL table fails.
It seems that I have a problem in the Excel source. These are the errors:
Start SSIS package "Package.dtsx."
Information: 0x1 at Script Task, C # My Message: System.Collections.ArrayList
Information: 0x4004300A at Data Flow Task, SSIS.Pipeline: Validation phase begins.
Error: 0xC0202009 at Data Flow Task, Excel Source [1]: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E37.
Error: 0xC02020E8 to Flow Task data, Excel Source [1]: Failed to open a rowset for "Sheet1 $". Verify that the object exists in the database.
Error: 0xC004706B to Flow Task data SSIS.Pipeline: validation failed "component" Excel Source "(1)". Returned validation status "VS_ISBROKEN."
Error: 0xC004700C to Flow Task data SSIS.Pipeline: Failed to validate one or more components.
Error: 0xC0024107 at Data Flow Task: There were errors during task validation.
Warning: 0x80019002 at Foreach Loop Container: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED. The Execution method succeeded, but the number of errors (6) reached the maximum allowed (1); leading to a failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the value of MaximumErrorCount or fix the errors.
SSIS package "Package.dtsx" finished: Success.
The program '[5504] Package.dtsx: DTS' has exited with code 0 (0x0).
My configuration is as follows:
For the Excel connection manager, I set an expression: ConnectionString = @[User::variable1] + @[User::DOWNLOAD_DIRECTORY_LOCAL] + @[User::FTP_FILE_URL] + @[User::variable2]
variable1 = Provider=Microsoft.ACE.OLEDB.12.0;Data Source=
variable2 = ;Extended Properties="EXCEL 12.0;HDR=YES";
I also set the DelayValidation property to True for the data flow task, the FTP task, the Foreach task and the Excel connection.
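When debugging an expression like this, it can help to write out what the concatenation evaluates to for one concrete file. The Python sketch below simply mirrors the four-part expression; the function name and the example folder and file name in the comments are hypothetical, and it assumes FTP_FILE_URL holds only a file name.

```python
def build_excel_connection_string(download_directory, file_name):
    """Mirror variable1 + DOWNLOAD_DIRECTORY_LOCAL + FTP_FILE_URL + variable2."""
    prefix = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source="  # variable1
    suffix = ';Extended Properties="EXCEL 12.0;HDR=YES";'      # variable2
    # Plain concatenation, as in the SSIS expression: nothing adds a path
    # separator, so the directory value must already end with one.
    return prefix + download_directory + file_name + suffix


# Hypothetical values for illustration only.
print(build_excel_connection_string("C:\\drop\\", "book.xlsx"))
print(build_excel_connection_string("C:\\drop", "book.xlsx"))  # separator missing
```

Two things are worth checking in the printed result: that the folder value ends with a backslash, and that the file variable does not already contain the full path (which would happen if the Foreach Loop retrieves fully qualified names), since either case would produce a Data Source that points at a file that does not exist.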
I just wrote a package to do the very same thing myself. Things to check in this order:
In your Excel Data Connection, have you browsed to the Excel files in your local folder (once they are there) and selected one (you need to copy one in there while developing)? That way, when you go to your Excel source object inside your Data Flow Task (inside the For Each), you can select the Excel Data Connection and then see Sheet1$ under "name of the excel sheet".
Once you are sure you have done the above, have you then right-clicked on the Excel Data Connection and, in the Expressions property, added ExcelFilePath = @[User::FTP_FILE_URL]? (Note that you need to select 'Fully Qualified' under Retrieve File Name on the Collection tab of the For Each container.)
in your Excel Data Connection have you selected the right version (Excel 2007) for the .xlsx files or Excel 2003 for .xls? I noticed a small bug where when I changed the filename it defaulted back to 2007, I had to manually change it back (again) to 2003.
Check at least one workbook exists in the folder before the step runs. There is some code around here about how to add a script task to validate at least one file being in User::DOWNLOAD_DIRECTORY_LOCAL.
I got a load of errors about the driver for Microsoft.ACE.OLEDB.12.0, plus had issues with a 64-bit server and had to wrap the package in a job and check the 'use 32-bit runtime' option under execution options in the job properties. Check the driver is working OK (although it usually gives a specific driver error if you haven't got it set up right).
Um that's it offhand just quickly before I head home. Let me know if it works or is still a fail..
The question you should ask yourself when having this issue is:
Where do I run my dtsx file from?
Is it from Microsoft Visual Studio?
Is it from a SQL Agent job?
Is it from the Integration Services Package Execution Utility?
Then refine your question to find the answer on the forums.

How to update a connection string of an excel file from a script (PS)

We have an Excel file which contains a connection to a database to retrieve data (with a select statement).
We want to update the connection string of that file via a script (preferably PowerShell) so that it queries another server instead.
So, for example:
I have report.xlsx file which connects to server A.
I run update-connection.ps1
And when I open report.xlsx it now connects to server B.
Any idea how we could do that?
Thanks.
It should be fairly easy if you decide (are allowed) to store the connection (server name) in a worksheet. Your VBA code can dynamically build the connection string based on the value of a cell. (I would probably create a named range and use it in the code).
I don't know PowerShell but the code can look something like:
$workbook.Range("Server").Value2 = "PROD_01"
You can make the worksheet hidden if you wish, but hiding it is not a serious security measure.
You could try automating Excel via PowerShell, as in this article: http://kentfinkle.com/PowershellAndExcel.aspx
If you don't want to automate Excel then you could try using something like ClosedXML in your PowerShell script: http://closedxml.codeplex.com/
You can parse the connectionstring with System.Data.Common.DbConnectionStringBuilder. Check this SO thread:
Powershell regex for connectionStrings?
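Whatever tool ends up opening the workbook, the string manipulation itself is simple. As a rough illustration of what a connection string builder does, here is a minimal Python sketch that swaps the server in a key=value;key=value string. The function name replace_data_source and the sample string are hypothetical, and this naive split does not handle quoted values that contain semicolons, which a real parser such as DbConnectionStringBuilder does.

```python
def replace_data_source(connection_string, new_server):
    """Return the connection string with its server entry replaced.

    Naive sketch: splits on ';' and '=', so quoted values containing
    semicolons are not supported.
    """
    server_keys = ("data source", "server")
    parts = []
    for part in connection_string.split(";"):
        if not part.strip():
            continue  # skip empty segments such as a trailing ';'
        key, _, value = part.partition("=")
        if key.strip().lower() in server_keys:
            value = new_server
        parts.append(f"{key}={value}")
    return ";".join(parts)


# Hypothetical connection string for illustration.
old = "Provider=SQLOLEDB;Data Source=ServerA;Initial Catalog=Reports"
print(replace_data_source(old, "ServerB"))
```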

Resources