Using an Execute SQL Task in SSIS, I created an Excel file that contains multiple columns with different data types. The issue is that the INT/Money columns are displayed as text columns in Excel, even though the Excel table is created with INT/Money data types.
I have tried the double precision and CY data types, but nothing worked.
CREATE TABLE `Employer` (
`MEMBERSHIP NUMBER` VARCHAR(30),
`RETIREE #` VARCHAR(12),
`COPID` CHAR(6),
`PERSON LAST NAME` VARCHAR(150),
`FIRST NAME` VARCHAR(150),
`RETIREE PLAN` CHAR(15),
`PLAN NAME` CHAR(200),
`BILL GROUP` INT,
`BRANCH ID` CHAR(3),
`CONTRACT NUMBER` CHAR(5),
`PBP` CHAR(3),
`BRANCH NAME` VARCHAR(150),
`COVERAGE MONTH` DATE,
`DUE AMOUNT` INT
)
The expected output should keep the data types declared for each column.
OLE DB data types and Excel
Even though you can create tables in Excel using OLE DB commands, Excel files are not databases where columns have data types. The OLE DB provider reads only tabular data and needs a data type for each column. For this reason, you must specify column data types when handling Excel, even though Excel can store multiple data types within the same column.
One of the disadvantages of using the OLE DB provider to read Excel is that if a column contains mixed data types, it reads only the dominant data type and converts all values of the other types to NULL.
Importing data from Excel having Mixed Data Types in a column (SSIS)
Mixed data types in Excel column
Excel Number Format property
On the other hand, there is a cell property in Excel called Number Format, which changes the way values are shown in Excel. You can use this property to show values as currency.
Format numbers as currency
HOW TO DISPLAY NUMBERS IN EXCEL 2010 AS CURRENCY
Solution
You can use two approaches to solve the issue:
(1) Change the column Number Format using a Script Task
You can add a Script Task and change the Number Format of the entire column using the Microsoft.Office.Interop.Excel assembly.
Format an Excel column (or cell) as Text in C#?
Format Entire Excel Column Using C#
(2) Create an Excel template file
You can create an Excel file, set each column's Number Format as needed, and use this file as a template. Copy the file to the destination directory, then perform the data import phase.
Quick Start: Format numbers in a worksheet
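The template approach can be sketched as a small pre-processing step. This is only an illustration in Python, not part of the original answer: the function name and all three arguments are hypothetical, and inside a package a File System Task would typically do the same copy.

```python
import shutil
from pathlib import Path


def prepare_workbook_from_template(template_path, destination_dir, output_name):
    """Copy a pre-formatted Excel template so the export writes into a fresh copy.

    All three arguments are hypothetical placeholders for this sketch.
    """
    template = Path(template_path)
    if not template.is_file():
        raise FileNotFoundError(f"Template not found: {template}")

    destination = Path(destination_dir)
    destination.mkdir(parents=True, exist_ok=True)

    target = destination / output_name
    # copyfile overwrites an existing target, so each run starts from the clean template.
    shutil.copyfile(template, target)
    return target
```

The Execute SQL Task or data flow would then write into the copied file, whose columns already carry the desired Number Format.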
I am having an issue with a program. The macro in Access imports an Excel file, then it appends an already created table with the imported data, and runs a check after to make sure that the table was appended correctly.
I have tried for hours to figure out why my imported tables are not being appended. I know that this is a formatting issue. There are six columns in my Excel table. The first three columns have Text Data Type. The fourth has Number Data Type. These are fine since the field properties match exactly for the imported table and the already existing table.
So my issue is with the last two columns. The last two columns in the existing table in Access have Date/Time Data Type. The format of these columns as shown in Field Properties is m/d/yy;# (I'm not sure what this means and an explanation of this might be helpful)
The imported table has Date/Time Data Type with format m/d/yyyy. I have tried changing the format in Excel to m/d/yy and m/d/yy;#. Both of which got the format to show correctly in Excel, just as it does in Access for the correct table. But when I import it does not keep this formatting.
The check does not register the rows with new dates and the table does not get appended. I am looking for an easy fix in Excel to change my data's formatting as there are 17 separate macros that I have to run each month.
Thanks in advance for any help anyone can provide,
(Preface: I'm already familiar with TypeGuessRows registry tweak (I have it set to 0; XL scans the whole column to determine data type) and IMEX=1 extended properties (I use this by default))
I am starting an ongoing project for a client:
Client sent 10 xlsx files, 1 per year.
Most, if not all, files have 12 sheets...1 sheet per month.
All sheets in all files have the exact same number of columns with the exact same column headers in the exact same order.
Client most likely will periodically send new data (hopefully in the same format) over the next 3 years.
Looping through multiple XL files, then looping through multiple XL sheets is not a problem. I have done that many times in the past. My SSIS template for XL files is setup that way by default.
The issue I am having is when the data types for the columns can change from sheet to sheet. For example, on most sheets a date column:
No NULL/blank dates
All dates formatted as m/d/yyyy
XL/SSIS assigns date [DT_DATE] data type
...but, on some sheets within the same file, the same date column...
No NULL/blank dates
Most dates formatted as m/d/yyyy
Some dates formatted as general/number (Nov 15, 2002 = 37575)
XL/SSIS assigns Unicode string [DT_WSTR] data type
If I am not mistaken, when I run the SSIS package, it will throw an error when the data types change.
Is it possible to force the data type of incoming columns (Advanced Editor for OLE DB Source > Input and Output Properties sheet > Inputs and outputs pane > OLE DB Source Output > External Columns) to Unicode so the package won't error when XL/SSIS wants to change the data type? This would accommodate all current files and any future ones in case the same inconsistent formatting shows up.
Or am I forced to either:
Change all general/number formatted dates to a date format so I can import with one SSIS package
Separate all consistently formatted and inconsistently formatted sheets into 2 separate groups to be imported with 2 different SSIS packages
Once again, Thanks for any help anyone can provide,
CTB
It appears switching from one data type to another won't throw an error, just a warning...at least from [DT_DATE] to [DT_WSTR] and back.
I was not able to force the data type of the incoming column of the OLE DB Source, but I was able to set the outgoing column data type to [DT_WSTR] (Advanced Editor for OLE DB Source > Input and Output Properties sheet > Inputs and outputs pane > OLE DB Source Output > Output Columns). That way, all dates in that column were seen as unicode text in the data flow, regardless of its source.
That seemed to do the trick. I needed only one import package to import both types of sheets/files.
I hope this helps someone else in the future...
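Once the column arrives as text, it can hold both m/d/yyyy strings and serial numbers such as 37575, so a later step still has to normalize them. A minimal Python sketch of that normalization follows. It assumes Excel's 1900 date system and the function name is hypothetical. In the package itself the same logic would live in a Script Component or a Derived Column.

```python
from datetime import date, datetime, timedelta

# Excel's 1900 date system: serial numbers count days from this base date
# (valid for dates from 1 March 1900 onward).
EXCEL_EPOCH = date(1899, 12, 30)


def parse_excel_date_text(value):
    """Return a date from either an 'm/d/yyyy' string or a serial-number string."""
    text = value.strip()
    if text.replace(".", "", 1).isdigit():
        # Plain number, e.g. "37575": treat it as an Excel serial date.
        return EXCEL_EPOCH + timedelta(days=int(float(text)))
    # Otherwise expect the m/d/yyyy format described in the question.
    return datetime.strptime(text, "%m/%d/%Y").date()
```

With this, both "37575" and "11/15/2002" resolve to the same date, matching the example given in the question.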
I am stuck having to query a SQL Server database that is mimicking a SQL Server 2000 database, with no way around it.
I have a large result set of 5 fields. The last field is a memo field. The result set is so large in SSMS 2012 that I cannot select all rows with headers, so I have to save to Excel csv format. When I do, Excel interprets data in the 5th field either as a function ("-", "+", "(space)-", "(space)+", etc. at the beginning) or as multiple columns for various reasons.
So far I have
replace(ltrim(rtrim(memo)), ',', ' ') as Memo
This, of course, trims beginning and end and replaces commas with spaces. I do not want to have to build nested replaces unless I must. This is for a large audit report that is not run often so I can, if need be, use a function.
Is there a good way to make a field like this compliant with Excel so that Excel will just keep that field as one column? I would appreciate any insight.
It seems that the correct method is to append double quotes to the beginning and the end of the field value returned in the query. Because I have to right-click and output to Excel, this method works and Excel does not misinterpret the intent.
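For comparison, this is what the quoting convention looks like when a CSV writer applies it. The sketch uses Python's standard csv module and is not the query-side fix described above. The helper names are hypothetical. Note that a double quote inside the value must also be doubled, which a bare concatenation of quotes in the query would not do.

```python
import csv
import io


def rows_to_quoted_csv(rows):
    """Serialize rows to CSV text with every field wrapped in double quotes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def read_csv_text(text):
    """Parse CSV text back into a list of rows, to check the round trip."""
    return list(csv.reader(io.StringIO(text)))
```

A memo value containing commas comes back as a single field after the round trip, which is the behavior the question asks for.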
When an excel data source is used in SSIS, the data types of each individual column are derived from the data in the columns. Is it possible to override this behaviour?
Ideally we would like every column delivered from the excel source to be string data type, so that data validation can be performed on the data received from the source in a later step in the data flow.
Currently, the Error Output tab can be used to ignore conversion failures - the data in question is then null, and the package will continue to execute. However, we want to know what the original data was so that an appropriate error message can be generated for that row.
According to this blog post, the problem is that the SSIS Excel driver determines the data type for each column based on reading values of the first 8 rows:
If the top 8 records contain equal number of numeric and character types – then the priority is numeric
If the majority of top 8 records are numeric then it assigns the data type as numeric and all character values are read as NULLs
If the majority of top 8 records are of character type then it assigns the data type as string and all numeric values are read as NULLs
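The three rules above can be simulated in a few lines. This Python sketch only mimics the behavior as the blog post describes it. It is not the driver's actual code, and the function names are hypothetical.

```python
def guess_column_type(values, sample_rows=8):
    """Guess 'numeric' or 'string' from the first sample_rows values.

    This simulates the rules quoted above; it is not the driver's code.
    """
    sample = [v for v in values[:sample_rows] if v is not None]
    numeric = sum(isinstance(v, (int, float)) for v in sample)
    text = len(sample) - numeric
    # A tie goes to numeric, as stated in the first rule.
    return "numeric" if numeric >= text else "string"


def read_column(values, sample_rows=8):
    """Return the column as described above: minority-type cells become None."""
    guessed = guess_column_type(values, sample_rows)
    if guessed == "numeric":
        return [v if isinstance(v, (int, float)) else None for v in values]
    return [v if isinstance(v, str) else None for v in values]
```

Feeding it a mostly numeric column with a few text codes shows why the original text values never reach the data flow: they are already None by the time the column is read.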
The post outlines two things you can do to fix this:
First, add IMEX=1 to the end of your Excel driver connection string. This will allow Excel to read the values as Unicode. However, this is not sufficient if the data in the first 8 rows are numeric.
In the registry, change the value for HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Excel\TypeGuessRows to 0. This will ensure that the driver looks at all the rows to determine the data type for the column.
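As an illustration of where IMEX=1 goes, the following Python sketch assembles a connection string. It assumes the ACE OLE DB provider and an .xlsx workbook, whereas the blog post refers to the Jet 4.0 driver. The function name and the example path are hypothetical.

```python
def excel_connection_string(path, header_row=True, imex=True):
    """Build an ACE OLE DB connection string for an .xlsx workbook (assumed provider)."""
    extended = ["Excel 12.0 Xml", "HDR=YES" if header_row else "HDR=NO"]
    if imex:
        # IMEX=1 asks the driver to treat mixed-type columns as text.
        extended.append("IMEX=1")
    properties = ";".join(extended)
    return (
        "Provider=Microsoft.ACE.OLEDB.12.0;"
        f"Data Source={path};"
        f'Extended Properties="{properties}";'
    )
```

The Extended Properties value has to stay inside its own pair of double quotes because it contains semicolons of its own.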
Yes, you can. Just go into the output column list on the Excel source and set the type for each of the columns.
To get to the input columns list right click on the Excel source, select 'Show Advanced Editor', click the tab labeled 'Input and Output Properties'.
A potentially better solution is to use the Derived Column component, where you can build "new" columns for each column in Excel. This has the following benefits:
You have more control over what you convert to.
You can put in rules that control the change (i.e. if null give me an empty string, but if there is data then give me the data as a string)
Your data source is not tied directly to the rest of the process (i.e. you can change the source and the only place you will need to do work is in the derived column)
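The kind of rule described in the second point can be written down as follows. This is a Python sketch of the logic only. In SSIS it would be an expression in the Derived Column editor, and the function names and the "_text" suffix are hypothetical.

```python
def to_text(value):
    """Mimic a derived-column rule: null becomes '', anything else becomes a string."""
    if value is None:
        return ""
    if isinstance(value, float) and value.is_integer():
        # Avoid a trailing '.0' when a whole number arrives as a float.
        return str(int(value))
    return str(value)


def derive_text_columns(row):
    """Return a new row dict with a '<name>_text' string column for each source column."""
    return {f"{name}_text": to_text(value) for name, value in row.items()}
```

Because the derived columns get new names, downstream steps depend on those names rather than on whatever the Excel source happens to deliver.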
If your Excel file contains a number in the column in question in the first row of data, it seems that the SSIS engine will reset the type to a numeric type. It kept resetting mine. I went into my Excel file and changed the numbers to "Numbers stored as text" by placing a single quote in front of them. They are now read as text.
I also noticed that SSIS uses the first row to IGNORE what the programmer has indicated is the actual type of the data (I even told Excel to format the entire column as TEXT, but SSIS still used the data, which was a bunch of digits), and reset it. Once I fixed that by putting a single-quote in my Excel file in front of the number in the first row of data, I thought it would get it right, but no, there is additional work.
In fact, even though the SSIS External DataSource Column now has the type DT_WSTR, it will still read 43567192 as 4.35671E+007. So you have to go back into your Excel file and put single quotes in front of all the numbers.
Pretty LAME, Microsoft! But there's your solution. I have no idea what to do if the Excel file is not under your control.
I was looking for a solution to a similar issue, but didn't find anything on the internet. Although most of the solutions I found work at design time, they don't work when you want to automate your SSIS package.
I resolved the issue and made it work by changing the properties of "Excel Source". By default the AccessMode property is set to OpenRowSet. If you change it to SQL Command, you can write your own SQL to convert any column as you wish.
For me SSIS was treating the NDCCode column as float, but I needed it as a string and so I used following SQL:
Select [Site], Cstr([NDCCode]) as NDCCode From [Sheet1$]
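When many columns need the same treatment, the statement can be generated rather than typed by hand. The Python helper below is a hypothetical sketch that reproduces the pattern of the query above. The function name and parameters are not part of SSIS.

```python
def build_excel_select(sheet, columns, text_columns=()):
    """Build a SELECT for the Excel Source that wraps chosen columns in CStr()."""
    parts = []
    for name in columns:
        if name in text_columns:
            # Same pattern as the hand-written query: convert, then keep the name.
            parts.append(f"CStr([{name}]) AS [{name}]")
        else:
            parts.append(f"[{name}]")
    return f"SELECT {', '.join(parts)} FROM [{sheet}]"
```

The resulting text would be pasted into, or assigned by expression to, the SQL command of the Excel Source.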
The Excel source in SSIS behaves unpredictably. SSIS determines the data type of a particular column by reading only the first few rows (8 by default, controlled by TypeGuessRows), hence the issue. If you have a text column with null values in those rows, SSIS takes the data type as Int. After a bit of a struggle, here is a workaround:
Insert a dummy row (preferably as the first row) in the worksheet. I prefer doing this through a Script Task; you may consider using some service to preprocess the file before SSIS connects to it
With the dummy row, you are sure that the data types will be set as you need
Read the data using Excel source and filter out the dummy row before you take it for further processing.
I know it is a bit shabby, but it works :)
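The insert-then-filter idea in the steps above can be sketched like this. It is Python for illustration only: the marker value and function names are hypothetical, and in the package the filter would be a Conditional Split after the Excel source.

```python
DUMMY_MARKER = "__DUMMY__"  # hypothetical sentinel value for this sketch


def add_dummy_row(rows, columns, text_columns):
    """Return rows with a dummy first row; text columns hold a string marker."""
    dummy = {name: (DUMMY_MARKER if name in text_columns else 0) for name in columns}
    return [dummy] + list(rows)


def drop_dummy_row(rows, text_columns):
    """Filter out any row whose text columns all carry the marker."""
    if not text_columns:
        raise ValueError("text_columns must name at least one column")
    return [
        row for row in rows
        if not all(row.get(name) == DUMMY_MARKER for name in text_columns)
    ]
```

Choosing a marker that cannot occur in real data keeps the filter step from discarding genuine rows.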
I was able to fix this issue. While creating the SSIS package, I manually changed the specific column to text (open the Excel file, select the column, right-click on the column, select Format Cells, select Text in the Number tab, and save the Excel file).
Now create the SSIS package and test it. It works. Now try to use the excel file where this column was not set as text.
It worked for me and I could execute the package successfully.
This can be resolved simply: untick the box "First row has column names" and all data will be collected as the text data type. The only downside of this choice is that you have to manage the column names yourself, replacing the auto-generated names (column 1, 2, etc.), and handle the first row, which contains the column names.
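Handling that first row amounts to splitting it off and using it to name the remaining values. A minimal Python sketch of the step, with a hypothetical function name, assuming each row arrives as a list of text values:

```python
def promote_first_row_to_header(rows):
    """Split rows read without headers into (header names, data rows as dicts)."""
    if not rows:
        return [], []
    # The first row holds the real column names, read as ordinary text.
    header = [str(cell).strip() for cell in rows[0]]
    data = [dict(zip(header, row)) for row in rows[1:]]
    return header, data
```

In a data flow, the equivalent would be to skip or filter out the header row and rename the auto-named columns in a later component.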
I had trouble implementing the solution here - I could follow the instructions, but it only gave new errors.
I solved my conversion issues by using a Data Conversion entity. This can be found on the SSIS Toolbox under Data Flow Transformations. I placed the Data Conversion between my Excel Source and OLE DB Destination, linked Excel to Data C, Data C to OLE DB, double clicked Data C to bring up a list of the data columns. Gave the problem column a new Alias, and changed the Data Type column.
Lastly, in the Mappings of the OLE DB Destination, use the Alias column name, rather than the original Excel column name. Job done.
You can use a Data Conversion component to convert to the desired data types.
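If the column is kept as text until the conversion step, a failed conversion can still report the original value, which is what the question asks for. The Python sketch below shows the idea. The function name and the error wording are hypothetical, and in SSIS the failing rows would instead be redirected through the component's error output.

```python
from decimal import Decimal, InvalidOperation


def convert_amount(raw):
    """Convert text to Decimal; return (value, error) so the raw text is never lost."""
    text = (raw or "").strip()
    try:
        return Decimal(text), None
    except InvalidOperation:
        # Keep the original text in the message so the row can be reported as received.
        return None, f"Cannot convert {raw!r} to a number"
```

Returning the error alongside the value, instead of replacing the value with NULL, lets a later step build a row-level message from the data as it was received.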