I have a problem with Excel (2016)/ VBA macros. I am happy to provide more detail or copies of files if required.
In summary this is what is happening:
I have a workbook with five worksheets. Three contain data: the two axes of a matrix (1 and 2) and the cells of the matrix (3). A fourth sheet brings together the data from sheets 1 to 3 and, finally, the fifth is a graph worksheet based on a range in sheet 4.
The reason I do it like this is that I want to graph parts of the data at a time otherwise it is all too confusing. A simple form is used to choose which column of the data-set to display and to also allow stepping through the list of items to display so just part of the range is in view at a time. Variables are set in cells on sheet 4 (outside the graphed range) when the buttons are clicked on the form. The formulae that populate the graphed range on that sheet use these variables to choose the required data from sheets 1 to 3.
Initially I populated the data in sheets 1 to 3 using macros to read from text files. These files had to be populated from a database before opening the Excel workbook. I decided to "Simplify" this process by changing the macros to query the database directly.
It took a while to get this going and all looked fine. I can see the data in the first four worksheets is populated correctly (including being able to click on the form buttons and change the contents of the graphing range) but the graph sees values of zero in every cell. When I right-click on the graph, select "Select Data", choose an item in the "Legend Entries (Series)" list in the left-hand box, and then click "Edit", I see that the cell ranges of both the Series name and the Series values are correctly displayed. The preview of "Name" has the correct value, but the preview of "Series" shows a string of comma-separated zeroes.
I can switch worksheets to view the "GraphData" on sheet 4 while still keeping the dialogue box open and see the actual cells that are being referred to and they are not 0, 0, 0... they are 78, 69, 44...
Where are the zeroes coming from?
I have even saved the workbook - keeping changes - so the contents of sheets 1 to 3 are kept, put a stop in the "Workbook_Open" macro so it doesn't run, and still I get zeroes when I re-open it.
Sorry to bother folk. I worked out the problem. As usual, there were interesting lessons along the way.
I started out with a query that populated a text file with data from my database. Because I wanted the file to be ingested by Excel as a csv file, the output for each row of the selected data was a single string column containing the keys and the data values with commas between them, the numbers being converted to VARCHAR within the sql so I could concatenate them into the string.
Then, when I decided to try using the query directly from Excel rather than having the file step in between, I removed the concatenation operators and the literal commas and separated the columns with commas in the SQL syntax, but I didn't remove the conversion of the numbers to VARCHAR.
With QueryTable, when you import a text file, one of the optional properties is an array of codes representing the data types of the incoming data. I assumed that with a direct query, Excel would assign a data type based on content as it does for typed values (unless you specify another). Clearly Excel and the source database exchange information, in addition to the data set itself, that tells Excel the data type of the columns - as sent.
The graphing function was seeing my data as strings so assigned values of zero. I began by putting a Value() around my lookup functions on the GraphData worksheet. That worked but then I went further back and actually changed the Sql so it sent numeric data without conversion to VARCHAR and then it worked without the Value() change.
It never occurred to me that Excel wouldn't just see a number and treat it as such - not that Sql Server would have told Excel it was a string.
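To make the symptom concrete, here is a small Python model of what is described above. It is only an illustration, not Excel's actual charting code: the function names `chart_series` and `value` are hypothetical, and the assumption is simply that text cells are plotted as zero while numeric cells are plotted as-is.

```python
def chart_series(cells):
    """Model of the symptom: numeric cells plot as-is, text cells plot as 0."""
    return [c if isinstance(c, (int, float)) else 0 for c in cells]


def value(cell):
    """Rough stand-in for wrapping a lookup in VALUE(): text -> number."""
    if isinstance(cell, (int, float)):
        return cell
    return float(cell.strip())


# Numbers that were converted to VARCHAR in the SQL arrive as text.
as_sent = ["78", "69", "44"]

print(chart_series(as_sent))                      # [0, 0, 0]
print(chart_series([value(c) for c in as_sent]))  # [78.0, 69.0, 44.0]
```

Removing the VARCHAR conversion in the SQL corresponds to the cells already being numbers, so no `value` step is needed.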
What is the difference between a value with the warning "The number in this cell is formatted as text or preceded by an apostrophe" and a value without this warning? Excel source data
The problem is that when the data is imported in SSIS, rows up to 2207 are imported correctly, but from row 2208 onward the values are just NULL.
Simple example SSIS diagram and the data from the viewer while debugging
I use this connection string:
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=Filename.xlsm;Extended Properties="EXCEL 12.0 MACRO;HDR=NO;IMEX=2";
It doesn't matter whether I use "IMEX=1" or "IMEX=2".
Columns A..P are (DT_WSTR,255) type.
How can I ensure that the imported values are the same as in the source Excel file (if possible, without changing the source file)?
Try adding a derived column node and convert the columns to text (versus the wstr). Make sure you map the derived column in your output. If text doesn't work try int or numeric.
I need to read a Microsoft Excel 2003 file (.xls) from a query in SQL Server 2005, and then insert some of that data into some tables. Reading the file and then using its data is not a problem in itself, but I found that, for a column, sometimes I get a NULL value instead of the value that's shown in the Excel file. To be more specific: This column is always just one character long, and it can contain any one digit from 0-9, or the letter 'K'. It's when the column contains 'K' that the query gives me a NULL value. My assumption is that, since the first few rows contain numbers as the values of this column, the query assumes they will always be numbers, and when it finds a letter it just turns it into NULL.
I tried changing the format of the cells in the Excel file to text, and using CAST and CONVERT (not at the same time) on the value to try to make it a varchar, but it does nothing.
That looks like an older OLE DB driver for Excel. Not that it doesn't work--you can still "query" the spreadsheet with it. Maybe try something newer:
SELECT * FROM
OPENROWSET('Microsoft.ACE.OLEDB.12.0',
'Excel 12.0 Xml;HDR=YES;Database=C:\File.xls',
'SELECT * FROM [Sheet1$]')
You'll need an updated ODBC driver on the SQL Server (make sure to get the appropriate 32 vs 64 bit version).
I have Excel 2003 files which are imported through SSIS into SQL 2008 R2. With one of the columns I hit a big problem. The column is defined as TEXT in the Excel sheet. Out of 36 rows, 32 have values like XTZ23, and they are imported correctly. The last 4 rows, however, are just numbers like 2646672. They are imported as NULL. If I change the connection string to IMEX=1 and modify the registry to TypeGuessRows=0, these numbers end up like 2.64667e+006.
What did I miss here?
I know this is an old post, but for future searchers: just add IMEX=1 to the connection string of your Excel connection manager in SSIS.
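As a sketch of where IMEX sits in the string, the helper below assembles a connection string of the same shape as the ACE one quoted earlier on this page. The function name and its parameters are hypothetical, and the provider and version values are simply copied from that earlier string; an Excel 2003 (.xls) file may need different provider and version values.

```python
def excel_connection_string(path, header=False, imex=1,
                            excel_version="EXCEL 12.0 MACRO"):
    """Assemble an OLE DB connection string for an Excel file.

    IMEX goes inside the quoted Extended Properties section, next to
    the Excel version and the HDR flag.
    """
    hdr = "YES" if header else "NO"
    extended = f"{excel_version};HDR={hdr};IMEX={imex}"
    return (
        "Provider=Microsoft.ACE.OLEDB.12.0;"
        f"Data Source={path};"
        f'Extended Properties="{extended}";'
    )


print(excel_connection_string("Filename.xlsm"))
```

Building the string in one place makes it harder to put IMEX outside the quoted Extended Properties section by mistake.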
The first solution would be to change the Excel column format, if possible.
Second, I had this problem 2 years ago, and the Excel file couldn't be changed since I was getting it from another service. I can't remember exactly, but I either called custom code (a function) or used some sort of transformation inside SSIS that converted the rows of a specific column from one data type to another.
When an excel data source is used in SSIS, the data types of each individual column are derived from the data in the columns. Is it possible to override this behaviour?
Ideally we would like every column delivered from the excel source to be string data type, so that data validation can be performed on the data received from the source in a later step in the data flow.
Currently, the Error Output tab can be used to ignore conversion failures - the data in question is then null, and the package will continue to execute. However, we want to know what the original data was so that an appropriate error message can be generated for that row.
According to this blog post, the problem is that the SSIS Excel driver determines the data type for each column based on reading values of the first 8 rows:
If the top 8 records contain equal number of numeric and character types – then the priority is numeric
If the majority of top 8 records are numeric then it assigns the data type as numeric and all character values are read as NULLs
If the majority of top 8 records are of character type then it assigns the data type as string and all numeric values are read as NULLs
The post outlines two things you can do to fix this:
First, add IMEX=1 to the end of your Excel driver connection string. This will allow Excel to read the values as Unicode. However, this is not sufficient if the data in the first 8 rows are numeric.
In the registry, change the value for HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Excel\TypeGuessRows to 0. This will ensure that the driver looks at all the rows to determine the data type for the column.
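The rules quoted above can be modeled in a few lines of Python. This is only a sketch of the behavior as the blog post describes it, not the driver's real implementation; the function names are hypothetical, cells are modeled as Python numbers or strings, and `type_guess_rows=0` is taken to mean "look at all rows", as stated above.

```python
def guess_column_type(values, type_guess_rows=8):
    """Guess a column type from a sample of rows (majority wins, ties -> numeric)."""
    sample = values if type_guess_rows == 0 else values[:type_guess_rows]
    sample = [v for v in sample if v is not None]  # ignore blank cells
    numeric = sum(isinstance(v, (int, float)) for v in sample)
    text = len(sample) - numeric
    return "numeric" if numeric >= text else "text"


def read_column(values, type_guess_rows=8):
    """Return the column as the model would deliver it: mismatched cells become None."""
    if guess_column_type(values, type_guess_rows) == "numeric":
        return [v if isinstance(v, (int, float)) else None for v in values]
    return [v if isinstance(v, str) else None for v in values]


# Eight digits first, then a 'K': the column is typed numeric and 'K' is lost.
column = [1, 2, 3, 4, 5, 6, 7, 8, "K", 9]
print(guess_column_type(column))  # numeric
print(read_column(column))        # [1, 2, 3, 4, 5, 6, 7, 8, None, 9]
```

In this model, scanning more rows only changes which type wins the vote; cells of the losing type still come back as None, which is why the guess alone is not a complete fix.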
Yes, you can. Just go into the output column list on the Excel source and set the type for each of the columns.
To get to the column list, right-click the Excel source, select 'Show Advanced Editor', and click the tab labeled 'Input and Output Properties'.
A potentially better solution is to use the derived column component, where you can actually build "new" columns for each column in Excel. This has the following benefits:
You have more control over what you convert to.
You can put in rules that control the change (i.e. if null give me an empty string, but if there is data then give me the data as a string)
Your data source is not tied directly to the rest of the process (i.e. you can change the source and the only place you will need to do work is in the derived column)
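The kind of rule mentioned above (null becomes an empty string, anything else becomes a string) can be sketched in Python as follows. This is a model of the logic only, not an SSIS expression; the function name is hypothetical, and it assumes numeric cells arrive as floats, so whole numbers are rendered without a decimal part or exponent.

```python
def to_text(value):
    """Convert one cell to text: None -> '', whole floats -> digits only."""
    if value is None:
        return ""
    if isinstance(value, float) and value.is_integer():
        return str(int(value))  # 43567192.0 -> '43567192', not '4.35671E+007'
    return str(value)


row = [None, 43567192.0, "K", 12.5]
print([to_text(v) for v in row])  # ['', '43567192', 'K', '12.5']
```

Keeping the conversion in one function mirrors the third benefit: if the source changes, only this step needs to be revisited.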
If your Excel file contains a number in the column in question in the first row of data, it seems that the SSIS engine will reset the type to a numeric type. It kept resetting mine. I went into my Excel file and changed the numbers to "Numbers stored as text" by placing a single quote in front of them. They are now read as text.
I also noticed that SSIS uses the first row to IGNORE what the programmer has indicated is the actual type of the data (I even told Excel to format the entire column as TEXT, but SSIS still used the data, which was a bunch of digits), and reset it. Once I fixed that by putting a single-quote in my Excel file in front of the number in the first row of data, I thought it would get it right, but no, there is additional work.
In fact, even though the SSIS External DataSource Column now has the type DT_WSTR, it will still read 43567192 as 4.35671E+007. So you have to go back into your Excel file and put single quotes in front of all the numbers.
Pretty LAME, Microsoft! But there's your solution. I have no idea what to do if the Excel file is not under your control.
I was looking for a solution to a similar issue, but didn't find anything on the internet. Although most of the solutions I found work at design time, they don't work when you want to automate your SSIS package.
I resolved the issue and made it work by changing the properties of "Excel Source". By default the AccessMode property is set to OpenRowSet. If you change it to SQL Command, you can write your own SQL to convert any column as you wish.
For me SSIS was treating the NDCCode column as float, but I needed it as a string and so I used following SQL:
Select [Site], Cstr([NDCCode]) as NDCCode From [Sheet1$]
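If several columns need the same treatment, the SQL command text can be generated rather than typed by hand. The Python helper below is a hypothetical sketch that only builds the string; it follows the shape of the query above, including reusing the column name as the alias, and makes no claim about how the Excel driver handles every case (for example empty cells passed to CStr).

```python
def build_excel_select(sheet, columns, as_text=()):
    """Build a SELECT for an Excel sheet, wrapping chosen columns in CStr()."""
    parts = []
    for col in columns:
        if col in as_text:
            parts.append(f"CStr([{col}]) AS [{col}]")  # force string output
        else:
            parts.append(f"[{col}]")
    return f"SELECT {', '.join(parts)} FROM [{sheet}$]"


print(build_excel_select("Sheet1", ["Site", "NDCCode"], as_text={"NDCCode"}))
# SELECT [Site], CStr([NDCCode]) AS [NDCCode] FROM [Sheet1$]
```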
The Excel source in SSIS behaves strangely. SSIS determines the type of data in a particular column by reading the first 10 rows, hence the issue. If you have a text column with null values in the first 10 rows, SSIS takes the data type as Int. After a bit of struggle, here is a workaround:
Insert a dummy row (preferably as the first row) in the worksheet. I prefer doing this through a Script Task; you may consider using some service to preprocess the file before SSIS connects to it.
With the dummy row, you can be sure that the data types will be set as you need.
Read the data using Excel source and filter out the dummy row before you take it for further processing.
I know it is a bit shabby, but it works :)
I was able to fix this issue. While creating the SSIS package, I manually changed the specific column to text (open the Excel file, select the column, right-click it, select Format Cells, choose Text on the Number tab, and save the file).
Now create the SSIS package and test it. It works. Now try to use the excel file where this column was not set as text.
It worked for me and I could execute the package successfully.
This should be resolved simply: just untick the box "First row as column names" and all data will be read as a text data type. The only downside of this choice is that you have to manage the column names from the auto-generated names (column 1, 2, etc.) and handle the first row, which contains the column names.
I had trouble implementing the solution here - I could follow the instructions, but it only gave new errors.
I solved my conversion issues by using a Data Conversion entity. This can be found on the SSIS Toolbox under Data Flow Transformations. I placed the Data Conversion between my Excel Source and OLE DB Destination, linked Excel to Data C, Data C to OLE DB, double clicked Data C to bring up a list of the data columns. Gave the problem column a new Alias, and changed the Data Type column.
Lastly, in the Mappings of the OLE DB Destination, use the Alias column name, rather than the original Excel column name. Job done.
You can use a Data Conversion component to convert to the desired data types.