Reading MS Excel file from SQL Server 2005 - excel

I need to read a Microsoft Excel 2003 file (.xls) from a query in SQL Server 2005, and then insert some of that data into some tables. Reading the file and then using its data is not a problem in itself, but I found that, for a column, sometimes I get a NULL value instead of the value that's shown in the Excel file. To be more specific: This column is always just one character long, and it can contain any one digit from 0-9, or the letter 'K'. It's when the column contains 'K' that the query gives me a NULL value. My assumption is that, since the first few rows contain numbers as the values of this column, the query assumes they will always be numbers, and when it finds a letter it just turns it into NULL.
I tried changing the format of the cells in the Excel file to text, and using CAST and CONVERT (not at the same time) on the value to try to make it a varchar, but it does nothing.

That looks like an older OLE DB driver for Excel. Not that it doesn't work--you can still "query" the spreadsheet with it. Maybe try something newer:
SELECT * FROM
OPENROWSET('Microsoft.ACE.OLEDB.12.0',
'Excel 12.0 Xml;HDR=YES;Database=C:\File.xls',
'SELECT * FROM [Sheet1$]')
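You'll need the ACE OLE DB provider installed on the SQL Server machine (it ships with the Microsoft Access Database Engine redistributable). Make sure to get the version that matches the server, 32-bit vs 64-bit.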

Related

Date format Mail Merge Word Excel

I'm using the Dutch version of Word and Excel 2016 to fill in data from an Excel table in a Word document using Mail Merge. When doing so, my dates are represented as numbers. I tried using the \# format, both in English and in Dutch, but nothing works. I checked the Excel file and the data is properly formatted as a date. So far, I have tried the following formats in my Word document, including adding and removing spaces before and after the quotation marks:
{MERGEFIELD FieldName \# "dd-mm-jjjj"}
{MERGEFIELD FieldName \# "dd-MM-yyyy"}
{MERGEFIELD FieldName .\# "dd-MM-yyyy"} (adding the dot was only mentioned on one website)
I import the data using the 'Use an Existing List' and 'Insert Merge Fields' function in Word.
Does anyone know what I should change to get a proper Date format in my Word document?
FYI, other numbering formats are working fine.
If the dates are being represented as numbers, that means you have mixed data types in the Excel column.
By default, Word 2002 & later use the OLE DB provider to get records from the data source. Because the OLE DB provider is designed to return data in a way that is compatible with databases, it requires a specific data type for each field, and every record in that field must be of that data type. When using other data sources, the OLE DB provider queries the first 8 records to determine the data type for each field (the 8 can be changed in the Windows Registry, but it’s not advisable to do so). This can lead to unexpected results with data sources such as Excel workbooks, where rows (records) in a column (field) can have different data types.
When the OLE DB provider gets data from a column with mixed data types, records that don’t conform to the data type determined for the column are liable to be handled incorrectly. The most common mail merge issues arising from this include:
numbers but not text or dates being output; and
dates being output as numbers,
for some records.
Ideally, one would ensure each field has only one data type. Workarounds include:
Inserting a dummy first record containing data in the format that is not being output correctly; or
Reordering the data so the first record has content in the format that is not otherwise being output correctly.
If you're unable to do either, see Importing Date and Time Values From Excel and Access in my Microsoft Word Date Calculation Tutorial, available at:
http://www.msofficeforums.com/word/38719-microsoft-word-date-calculation-tutorial.html
or:
http://www.gmayor.com/downloads.htm#Third_party
Do read the document's introductory material.

Excel copy-paste to SQL Server sets BIT type to null

I am copying some table data from one database to another (identical table schema, table is empty in one database, I am copying all the rows from the other database).
I'm doing this by running a select in SQL Server Management Studio, copying the results to Excel to inspect, then copying from Excel to SSMS' 'Edit Top 200 Rows'.
The problem is that some columns have BIT type. The values display as 1/0 in SSMS select and as 1/0 in Excel. But after pasting in to SSMS, the values all become null.
BIT type will display as 1/0 but will not parse 1/0 as input? Does it mean BIT type is effectively not supported for copy-paste operation?
SSMS is version 10, Excel version 2003 SP3.
Apparently the BIT values need to be True and False to parse correctly:
http://connect.microsoft.com/SQLServer/feedback/details/330293/1-and-0-not-recognised-as-boolean
It does seem not very consistent or user friendly that the values display as 1 / 0 but will only parse from True / False.
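As a rough illustration of that workaround, here is a minimal Python sketch that rewrites 1/0 as True/False in copied grid text before it is pasted into the SSMS edit grid. It assumes the copied data is tab-separated text (one row per line); the function name and the zero-based column indexes are hypothetical and not part of SSMS or Excel.

```python
def bits_to_boolean_text(tsv, bit_columns):
    """Rewrite "1"/"0" as "True"/"False" in the given zero-based columns.

    tsv is tab-separated text, one row per line (an assumption about
    how the copied grid is represented).
    """
    out_lines = []
    for line in tsv.splitlines():
        cells = line.split("\t")
        for i in bit_columns:
            if i < len(cells):
                if cells[i] == "1":
                    cells[i] = "True"
                elif cells[i] == "0":
                    cells[i] = "False"
                # Anything else (for example an empty cell) is left untouched.
        out_lines.append("\t".join(cells))
    return "\n".join(out_lines)


if __name__ == "__main__":
    copied = "1\tabc\t0\n0\txyz\t1"
    print(bits_to_boolean_text(copied, {0, 2}))
```

Only the listed columns are touched, so a genuine numeric column that happens to contain 1 or 0 is not rewritten.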

Convert string column to numeric in Excel via connection string

I'm having a big problem getting data from Excel files via connection String.
I connect to xls file and execute this query:
SELECT CDbl(COLUMN_NAME) FROM [SHEETNAME$]
When COLUMN_NAME references a string column (or a numeric column with empty cells), it fails. Is there any solution, like "ISNULL(COLUMN_NAME, 0)" or something like that?
Thanks!
Add the property IMEX=1 at the end of your connection string of the Excel connection manager.
Samples:
http://www.connectionstrings.com/excel
This will treat mixed data types as string. However, the driver only scans the first 8 rows to determine whether a column has intermixed data types. To change that, you need to modify the TypeGuessRows registry setting for the JET provider. If you set it to 0, it will scan all rows.
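For reference, a small Python sketch of how such a connection string can be assembled. The helper name is hypothetical; the provider name Microsoft.Jet.OLEDB.4.0 and the "Excel 8.0" token are assumptions that fit an .xls file read through the JET provider, so adjust them for other providers or file formats.

```python
def excel_connection_string(path, header=True, imex=True):
    """Build a JET OLE DB connection string for an .xls workbook.

    IMEX=1 asks the driver to treat mixed-type columns as text.
    """
    props = ["Excel 8.0", "HDR=YES" if header else "HDR=NO"]
    if imex:
        props.append("IMEX=1")
    extended = ";".join(props)
    # The extended properties contain semicolons, so they are wrapped in quotes.
    return (
        "Provider=Microsoft.Jet.OLEDB.4.0;"
        "Data Source=" + path + ";"
        'Extended Properties="' + extended + '";'
    )


if __name__ == "__main__":
    print(excel_connection_string(r"C:\File.xls"))
```

The quoting matters: without the double quotes, IMEX=1 would be parsed as a separate connection string keyword rather than as part of the extended properties.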
Here are more references:
http://www.sql-server-helper.com/tips/read-import-excel-file-p02.aspx
http://munishbansal.wordpress.com/2009/12/15/importing-data-from-excel-having-mixed-data-types-in-a-column-ssis/

SSIS not importing TEXT column from Excel correctly (integer results in NULL value)

I have Excel 2003 files which are imported through SSIS into SQL 2008 R2. With one of the columns I hit a big problem. The column is defined as TEXT in the Excel sheet. Out of 36 rows, 32 have values like XTZ23, and they get imported correctly. The last 4 rows, however, are just numbers like 2646672. They are imported as NULL. If I change the connection string to IMEX=1 and modify the registry to TypeGuessRows=0, these numbers end up like 2.64667e+006.
What did I miss here?
I know this is an old post, but for future searchers, just add IMEX=1 into the connectionstring of your Excel manager in the SSIS.
First solution would be to change excel column format if possible.
Second, I had this problem 2 years ago. The Excel file couldn't be changed since I was getting it from another service. I can't remember exactly, but I either called a custom code/function or used some sort of transformation inside SSIS that converted the rows of a specific column from one data type to another.

SSIS Excel Data Source - Is it possible to override column data types?

When an excel data source is used in SSIS, the data types of each individual column are derived from the data in the columns. Is it possible to override this behaviour?
Ideally we would like every column delivered from the excel source to be string data type, so that data validation can be performed on the data received from the source in a later step in the data flow.
Currently, the Error Output tab can be used to ignore conversion failures - the data in question is then null, and the package will continue to execute. However, we want to know what the original data was so that an appropriate error message can be generated for that row.
According to this blog post, the problem is that the SSIS Excel driver determines the data type for each column based on reading values of the first 8 rows:
If the top 8 records contain equal number of numeric and character types – then the priority is numeric
If the majority of top 8 records are numeric then it assigns the data type as numeric and all character values are read as NULLs
If the majority of top 8 records are of character type then it assigns the data type as string and all numeric values are read as NULLs
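The three rules above can be illustrated with a small Python simulation. This is only a sketch of the behaviour as described in the blog post, not the driver's actual code; the function names are hypothetical, and treating guess_rows=0 as "sample every row" mirrors the TypeGuessRows setting mentioned elsewhere on this page.

```python
def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def guess_column_type(values, guess_rows=8):
    """Return "numeric" or "text" by majority vote over the sampled rows."""
    sample = list(values)
    if guess_rows > 0:  # 0 is treated here as "sample every row"
        sample = sample[:guess_rows]
    numeric = sum(1 for v in sample if _is_number(v))
    text = sum(1 for v in sample if isinstance(v, str))
    # A tie goes to numeric, per the first rule.
    return "numeric" if numeric >= text else "text"


def read_column(values, guess_rows=8):
    """Values that do not match the guessed type come back as None (NULL)."""
    values = list(values)
    kind = guess_column_type(values, guess_rows)
    if kind == "numeric":
        return [v if _is_number(v) else None for v in values]
    return [v if isinstance(v, str) else None for v in values]


if __name__ == "__main__":
    column = [1, 2, 3, 4, 5, 6, 7, 8, "K", 9]
    print(read_column(column))
```

With eight digits in the sampled rows, the column is guessed as numeric and the later 'K' comes back as None, which matches the NULL seen in the first question on this page.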
The post outlines two things you can do to fix this:
First, add IMEX=1 to the end of your Excel driver connection string. This will allow Excel to read the values as Unicode. However, this is not sufficient if the data in the first 8 rows are numeric.
In the registry, change the value for HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Excel\TypeGuessRows to 0. This will ensure that the driver looks at all the rows to determine the data type for the column.
Yes, you can. Just go into the output column list on the Excel source and set the type for each of the columns.
To get to the column list, right-click on the Excel source, select 'Show Advanced Editor', and click the tab labeled 'Input and Output Properties'.
A potentially better solution is to use the derived column component where you can actually build "new" columns for each column in Excel. This has the benefits of
You have more control over what you convert to.
You can put in rules that control the change (i.e. if null give me an empty string, but if there is data then give me the data as a string)
Your data source is not tied directly to the rest of the process (i.e. you can change the source and the only place you will need to do work is in the derived column)
If your Excel file contains a number in the column in question in the first row of data, it seems that the SSIS engine will reset the type to a numeric type. It kept resetting mine. I went into my Excel file and changed the numbers to "Numbers stored as text" by placing a single quote in front of them. They are now read as text.
I also noticed that SSIS uses the first row to IGNORE what the programmer has indicated is the actual type of the data (I even told Excel to format the entire column as TEXT, but SSIS still used the data, which was a bunch of digits), and reset it. Once I fixed that by putting a single-quote in my Excel file in front of the number in the first row of data, I thought it would get it right, but no, there is additional work.
In fact, even though the SSIS External DataSource Column now has the type DT_WSTR, it will still read 43567192 as 4.35671E+007. So you have to go back into your Excel file and put single quotes in front of all the numbers.
Pretty LAME, Microsoft! But there's your solution. I have no idea what to do if the Excel file is not under your control.
I was looking for a solution to a similar issue, but didn't find anything on the internet. Although most of the solutions I found work at design time, they don't work when you want to automate your SSIS package.
I resolved the issue and made it work by changing the properties of "Excel Source". By default the AccessMode property is set to OpenRowSet. If you change it to SQL Command, you can write your own SQL to convert any column as you wish.
For me SSIS was treating the NDCCode column as float, but I needed it as a string and so I used following SQL:
Select [Site], Cstr([NDCCode]) as NDCCode From [Sheet1$]
The Excel source in SSIS behaves crazily. SSIS determines the type of data in a particular column by reading the first 10 rows, hence the issue. If you have a text column with null values in the first 10 rows, SSIS takes the data type as Int. With a bit of struggle, here is a workaround:
Insert a dummy row (preferably the first row) in the worksheet. I prefer doing this through a Script task; you may consider using some service to preprocess the file before SSIS connects to it.
With the dummy row, you are sure that the data types will be set as you need.
Read the data using Excel source and filter out the dummy row before you take it for further processing.
I know it is a bit shabby, but it works :)
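The mechanics of that workaround, adding a dummy row and filtering it out again, can be sketched in Python as follows. This only models the rows as dictionaries; it does not touch a real workbook, and the marker value and function names are hypothetical. Whether a single dummy row is enough to change the guessed type depends on the driver settings discussed in the other answers.

```python
DUMMY_MARKER = "__dummy__"  # hypothetical sentinel, not from the original post


def add_dummy_row(rows, text_columns):
    """Return a new list whose first row holds text in the given columns."""
    columns = list(rows[0].keys()) if rows else list(text_columns)
    dummy = {c: (DUMMY_MARKER if c in text_columns else None) for c in columns}
    return [dummy] + list(rows)


def drop_dummy_row(rows, text_columns):
    """Filter out any row that carries the marker in one of the text columns."""
    return [
        r for r in rows
        if not any(r.get(c) == DUMMY_MARKER for c in text_columns)
    ]


if __name__ == "__main__":
    data = [{"Site": 1, "Code": 2646672}, {"Site": 2, "Code": "XTZ23"}]
    padded = add_dummy_row(data, {"Code"})
    print(padded[0])
    print(drop_dummy_row(padded, {"Code"}) == data)
```

Using a recognisable marker makes the filter step explicit, so the dummy row cannot leak into the destination table by accident.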
I was able to fix this issue. Before creating the SSIS package, I manually changed the specific column to text (open the Excel file, select the column, right-click on it, select Format Cells, choose Text on the Number tab, and save the file).
Now create the SSIS package and test it. It works. Now try to use the excel file where this column was not set as text.
It worked for me and I could execute the package successfully.
This can be resolved simply: untick the box "First row as column names" and all data will be collected as a text data type. The only downside of this choice is that you have to manage the column names yourself, starting from the auto-generated names (column 1, 2, etc.), and handle the first row, which contains the column names.
I had trouble implementing the solution here - I could follow the instructions, but it only gave new errors.
I solved my conversion issues by using a Data Conversion entity. This can be found on the SSIS Toolbox under Data Flow Transformations. I placed the Data Conversion between my Excel Source and OLE DB Destination, linked Excel to Data C, Data C to OLE DB, double clicked Data C to bring up a list of the data columns. Gave the problem column a new Alias, and changed the Data Type column.
Lastly, in the Mappings of the OLE DB Destination, use the Alias column name, rather than the original Excel column name. Job done.
You can use a Data Conversion component to convert to the desired data types.

Resources