What .xlsx file format is this? - excel

Using an existing SSIS package, I was trying to import .xlsx files we received from a client. I received the error message:
External table is not in the expected format
These files open in Excel.
When I use Excel (currently Excel 2010) to Save As... the file without making any changes:
The new file imports just fine.
The new file is 330% the size of the original file.
When changing .xlsx to .zip and investigating the contents with WinZip:
The original file has only 4 .xml files and a _rels folder (with 2 .rels files).
The new file has the expected .xlsx contents.
Does anyone know what kind of file this could be?
It would be nice to develop my SSIS package to work with these original files, without having to open and re-save each one. There are only 12 files, so if there are no other options, opening and saving each file is not that big a deal, and I could automate it with VBA going forward.
Thanks for any help anyone can provide,
CTB

There are many Excel file formats.
The file you are trying to import may be in a different Excel format with its extension changed to .xlsx (someone else may have edited it), or it may have been created with a different Excel version.
There is a third-party application called TrIDNet File Identifier, a utility designed to identify file types from their binary signatures. You can use it to determine the real format of the file.
Also, a quick search on "External table is not in the expected format" shows that this error is thrown when the Excel version specified in the connection string differs from that of the selected file. Check the connection string used in the Excel connection manager; it might help identify the version of the file.
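If you would rather not install a separate tool, a rough signature check can be sketched in a few lines of Python. This is only a minimal sketch, not a replacement for a full file identifier: it distinguishes the ZIP container used by .xlsx/.xlsm/.xlsb from the legacy compound-file container used by .xls, and lists the package members for ZIP files so that two workbooks can be compared (as was done by hand with WinZip in the question). The function name sniff_workbook is made up for this illustration.

```python
import zipfile

# Leading bytes of the two containers Excel workbooks commonly use.
ZIP_MAGIC = b"PK\x03\x04"                         # ZIP package (.xlsx, .xlsm, .xlsb)
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # compound file (legacy .xls)


def sniff_workbook(path):
    """Return (container_type, member_names) for the file at `path`.

    container_type is "zip", "ole2", or "unknown"; member_names is only
    filled in for ZIP packages. Hypothetical helper for illustration.
    """
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(OLE2_MAGIC):
        return "ole2", []
    if head.startswith(ZIP_MAGIC) and zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            return "zip", zf.namelist()
    return "unknown", []
```

Running it on both the original file and the re-saved copy shows which parts (for example xl/workbook.xml or [Content_Types].xml) are present in one package but missing from the other.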

Related

Question mark on .xlsx file in IntelliJ?

I am trying to pass the path of an Excel file in IntelliJ so I can manage data from there, but it doesn't work. I tried everything... please help me!
I tried saving the Excel file on my Desktop, copying the path, and passing it to the src folder and the resources directory, and it didn't work. Then I tried saving the Excel file on my OneDrive Personal Desktop, and somehow I managed to pass the file, but it shows a question mark in IntelliJ. What should I do?

Not able to read .xlsb file or .xlsx (large files - 150 MB) from shared drive using python

I am facing a problem where reading the file directly from a shared drive throws an invalid path error. The situation is as follows:
The data files, in .xlsx and .xlsb format, are copied to SharePoint, which serves as the source.
I used the 'open in explorer' function in SharePoint and got the drive address.
I mapped that path as a network drive and assigned it the letter P.
Now I am using this path to read the file directly with pandas read_excel.
It throws an invalid path OS22 error.
Issues:
When I read a smaller .xlsx file (15 MB), it works well.
When I try to read another Excel file, 150 MB in size, I get the invalid path error.
The same happens when reading .xlsb binary files.
I already tried forward slashes and backslashes; same error.
I used open to read the file and got the same invalid path error.
However, if I download the same file to my local machine, it works without any issue; the same code reads the files easily.
Any suggestions?
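Since the same code works on a local copy, one way to separate a drive problem from a pandas problem is to copy the file to a local temporary directory from Python and parse the copy. The following is a minimal sketch under that assumption; read_via_local_copy is a hypothetical helper name, and because plain open already failed on the mapped drive, the copy step may fail with the same error, which would point at the mapped drive rather than at read_excel.

```python
import shutil
import tempfile
from pathlib import Path


def read_via_local_copy(source, reader):
    """Copy `source` into a local temp directory, then call `reader` on the copy.

    `reader` is any callable taking a path, e.g. pandas.read_excel. It must
    load everything it needs and close the file before returning, because
    the temporary copy is deleted when this function exits.
    """
    source = Path(source)
    with tempfile.TemporaryDirectory() as tmp:
        local = Path(tmp) / source.name
        shutil.copyfile(source, local)  # raises here if the remote path is the problem
        return reader(local)
```

With pandas installed, pandas.read_excel can be passed as reader; for .xlsb files it needs engine="pyxlsb" and the pyxlsb package.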

Talend 7.1 tFileOutputExcel corrupt file

I'm trying to output an Excel file from Talend 7.1. I've tried a few different setups and both xls and xlsx formats, but they all produce a corrupt output file that cannot be opened.
What am I doing wrong? I am loading an xlsx file into a database and this part works fine but outputting to excel I just can't figure it out! I was writing from the tMap directly to the tFileOutputExcel and it wasn't working (corrupt) so I changed it to write to a csv file first and then write that csv to the tFileOutputExcel but it is still corrupt.
This is my job detail:
And these are the settings in the tFileOutputExcel
I got this working by changing the transfer mode in the FTP component from 'ascii' to 'binary'. Such a simple thing, but maybe this helps anyone else with this issue who is a newb like me :)
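For anyone wondering why the transfer mode matters: a text-mode transfer may rewrite line-ending bytes, which is harmless for a csv but changes the bytes of a binary workbook. The sketch below only simulates that effect and is not Talend's or any FTP server's actual behavior; it assumes the translation turns each LF byte into CRLF, and simulate_ascii_transfer is a made-up name.

```python
def simulate_ascii_transfer(payload: bytes) -> bytes:
    """Mimic a text-mode transfer that rewrites each LF byte as CRLF (assumed behavior)."""
    return payload.replace(b"\n", b"\r\n")


# Stand-in for binary workbook data: a ZIP signature followed by every byte
# value once, so exactly one 0x0A (LF) byte is present.
original = b"PK\x03\x04" + bytes(range(256))
received = simulate_ascii_transfer(original)

# One extra byte was inserted, so every offset after it has shifted.
print(len(original), len(received))
```

Formats such as xlsx and xls store lengths and offsets inside the file, so even a single inserted byte can be enough for Excel to report the file as corrupt.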

Excel file getting corrupted while downloading using PowerShell/Curl

I am trying to download an Excel file from Artifactory to my local machine using a PowerShell/Curl script. The Excel file is created, but when I open it I get the message below, and the sheet contains no data.
The file format and extension of 'DataSheet_DEP.xls' don't match. The file could be corrupted or unsafe. Unless you trust its source, don't open it. Do you want to open it anyway?
curl.exe -k -u artifactory_TEMP:Password16 https://../artifactory/../DataSheet_DEP.xls --output "D:\ABC\Team Members\Datasheet\DataSheet_DEP.xls"
Just to add, the Excel file is a 97-2003 workbook (.xls). I am able to download the file manually without any issue.
This is resolved now. I was using the wrong Artifactory path. Thanks to everyone who tried to help.

Dynamic Source: Excel Connection SSIS

Here's what I'm doing:
I'm using a Foreach Loop container to grab any .xlsx files in a specified folder and assigning the fully qualified name to a variable called FileName.
Then I have a data flow with an Excel source importing to an OLE DB Destination.
How do I make the Excel source use the FileName variable?
When I create the same process for flat files, I have no problems creating an expression and setting DelayValidation to true, but when I try it with Excel files it doesn't work the same way. I've been able to work around the problem by using a File System Task to move the .xlsx files to a new folder under a static name and importing from that file, but I'm tired of doing that. Any help will be greatly appreciated!
