I have recently been experimenting with Perl and some modules to read Excel files, and in particular the formatting of their cells.
For example, I wrote a piece of Perl code that uses the module Spreadsheet::ParseExcel to read a cell's background colour. However, while testing I noticed that for certain files the colour returned by my Perl program did not match the colour reported by Excel. Eventually I found that the reason was that the file I was reading was a .xls file saved in compatibility mode. Basically, the creator of the file had used the functionality of Excel .xlsx files (2007+) to colour some of the cells and then saved the file in the old .xls format, which does not support the colours chosen.
So my question: is there any way to tell whether a given .xls file (or any other old Excel file format) has been saved in compatibility mode without using Excel to find out? The reason I ask is that I am working in a Linux environment and can't use any Windows tools to analyse the files.
Furthermore, if one could identify that a given Excel file has indeed been saved in compatibility mode, is there any way of knowing how the original colours were mapped to the ones that my program is reporting?
Many thanks for any help on this.
I do not think that you can do this using Spreadsheet::ParseExcel. I tried colouring a cell in an .xlsx file and saving it as .xls with 2003 compatibility, then comparing the result with an empty .xls from Excel 2003, and I do not see any difference between the parsed files.
You can try the following code on your own files to look for a difference that you could use:
use strict;
use warnings;
use Spreadsheet::ParseExcel;
use Data::Dumper;
use JSON;
use Test::More tests => 1;
my $file_1 = 'test_xls.xls';
my $file_2 = 'compat_xls.xls';
my @files = (
    $file_1,
    $file_2,
);
my @workbooks;
foreach my $file (@files) {
    print("\n\nReading $file\n");
    my $parser   = Spreadsheet::ParseExcel->new();
    my $workbook = $parser->parse($file)
        or die "Cannot parse $file: " . $parser->error() . "\n";
    # print Dumper($workbook->{PkgStr});
    # Drop fields that always differ between two files (paths, sizes,
    # cell contents) so that only structural differences remain.
    delete $workbook->{PkgStr};
    delete $workbook->{File};
    delete $workbook->{Worksheet}->[0]->{MinRow};
    delete $workbook->{Worksheet}->[0]->{RowHeight};
    delete $workbook->{Worksheet}->[0]->{_Pos};
    delete $workbook->{Worksheet}->[0]->{MinCol};
    delete $workbook->{Worksheet}->[0]->{MaxCol};
    delete $workbook->{Worksheet}->[0]->{MaxRow};
    delete $workbook->{Worksheet}->[0]->{Cells};
    delete $workbook->{Format}->[62];
    push @workbooks, $workbook;
}
# is_deeply returns only a pass/fail flag; on failure it reports the
# first differing path itself.
my $ok = is_deeply($workbooks[0], $workbooks[1], 'workbooks match');
print(Dumper(\@workbooks)) unless $ok;
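One more place that might be worth looking, which as far as I know Spreadsheet::ParseExcel does not expose: the [MS-XLS] specification describes extra "future" records that Excel 2007 and later can write into a .xls workbook stream, among them XFExt (record id 0x087D), which carries extended formatting such as full RGB colours. The Python sketch below walks the raw BIFF records of a Workbook stream and reports whether any XFExt record is present. It assumes the stream has already been extracted from the OLE container by some other tool, the helper names are made up for illustration, and treating XFExt as a sign of a compatibility-mode save is an assumption rather than a confirmed rule.

```python
import struct

# Record id of XFExt according to [MS-XLS]; using it as a marker for
# "saved by Excel 2007+ with extended colours" is an assumption.
XFEXT = 0x087D


def iter_biff_records(stream: bytes):
    """Yield (record_id, payload) for each BIFF record in a workbook stream.

    Every record starts with a 2-byte id and a 2-byte payload length,
    both little-endian, followed by the payload itself.
    """
    pos = 0
    while pos + 4 <= len(stream):
        rec_id, size = struct.unpack_from("<HH", stream, pos)
        pos += 4
        yield rec_id, stream[pos:pos + size]
        pos += size


def has_extended_formats(stream: bytes) -> bool:
    """Return True if the stream contains at least one XFExt record."""
    return any(rec_id == XFEXT for rec_id, _ in iter_biff_records(stream))
```

If this returns True for the files whose colours look wrong and False for the others, the payload of those records is probably where the original colours live.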
MATLAB (R2015b) on my new ThinkPad notebook: the functions xlsread and xlswrite do not work.
xlsread fails to load the data from every existing Excel file I try,
and xlswrite does not work for any path either.
Error using xlsread (line 251)
catch exception
if isempty(exception.identifier)
exception = MException('MATLAB:xlsreadold:FormatError','%s', exception.message);
end
throw(exception);
The import data method does not work for Excel files either.
I found this answer at
https://cn.mathworks.com/matlabcentral/answers/282688-why-my-excel-file-can-not-be-read-by-matlab and hope it helps:
Anyone who has problems reading Excel files can follow these steps.
1- Open Excel, go to File > Options > Add-ins, choose COM Add-ins under Manage, and uncheck everything.
2- Restart the PC and open MATLAB.
3- Run the xlsread command.
NOTE 1: People who use Foxit PDF reader are likely to face this problem, so follow the steps above.
NOTE 2: Sometimes using MATLAB changes the Excel configuration in some unknown way, so that ordinary Excel files can no longer be opened in Windows by double-clicking.
In that case, open Excel from the desktop icon, go to File > Options > Advanced > General and uncheck "Ignore other applications that use Dynamic Data Exchange (DDE)". (The same information for NOTE 2: https://support.microsoft.com/en-us/kb/3001579.) These are some of the errors that come up when using Excel with MATLAB and the related commands.
I'm processing a data set and running into a problem - although I xlswrite all the relevant output variables to a big Excel file that is timestamped, I don't save the code that actually generated that result. So if I try to recreate a certain set of results, I can't do it without relying on memory (which is obviously not a good plan). I'd like to know if there's a command(s) that will help me save the m-files used to generate the output Excel file, as well as the Excel file itself, in a folder I can name and timestamp so I don't have to do this manually.
In my perfect world I would run the master code file that calls 4 or 5 other function m-files, and then all those m-files would be saved along with the Excel output to a folder named results_YYYYMMDDTIME. Does this functionality exist? I can't seem to find it.
There's no such functionality built in.
You could build a dependency tree of your main function by using depfun with mfilename.
depfun(mfilename()) will return a list of all functions/m-files that are called by the currently executing m-file.
This will include all files that ship with MATLAB; you might want to remove those (and only record the MATLAB version in your Excel sheet).
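As a sketch (your_folder is whatever results folder you choose):
% get all files the currently running m-file depends on:
dependencies = depfun(mfilename());
your_folder = ['results_' datestr(now, 'yyyymmddTHHMMSS')];
mkdir(your_folder);
for k = 1:numel(dependencies)
    % skip files that ship with MATLAB (they live under matlabroot)
    if isempty(strfind(dependencies{k}, matlabroot))
        copyfile(dependencies{k}, your_folder);
    end
end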
As a "long term" solution you might want to check if using a version control system like subversion, mercurial (or one of many others) would be applicable in your case.
In larger projects this is the preferred way to record the version of the source code used to produce a certain result.
I have generated an Excel file from XML, but I cannot open it with Excel. Excel gives the following error when opening it:
Problems came up in the following areas during load:
Table
Then it shows a message that the log file corresponding the error can be found at : C:/Documents and Setting/myUserName/Local Settings/Temporary Internet Files/Content.MSO/xxxxx.log
But I cannot find the Content.MSO folder on my Windows machine. I checked the folder settings and made all folders visible, but I still cannot access this folder, so I cannot analyse the log file.
How can I find the generated log file?
I found the problem without analysing the log file. I still cannot access the log file in Temporary Internet Files, but I realised that I had put string (non-numeric) characters in a number-styled cell in the Excel XML. So if you are having similar issues with an Excel file generated from XML, check whether your cell values are appropriate for your cell data types.
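A check like this can be automated before the file ever reaches Excel. The following Python sketch is only an illustration: it assumes the file uses the usual SpreadsheetML 2003 namespace, the function name is made up, and float() is just an approximation of what Excel accepts as a number. It lists the contents of every Data element declared as ss:Type="Number" that does not parse as a number.

```python
import xml.etree.ElementTree as ET

# Namespace used by Excel 2003 XML (SpreadsheetML) documents.
SS = "urn:schemas-microsoft-com:office:spreadsheet"


def bad_number_cells(xml_text: str):
    """Return the text of every <Data ss:Type="Number"> that is not numeric."""
    root = ET.fromstring(xml_text)
    bad = []
    for data in root.iter(f"{{{SS}}}Data"):
        if data.get(f"{{{SS}}}Type") == "Number":
            text = (data.text or "").strip()
            try:
                float(text)  # rough stand-in for Excel's own number parsing
            except ValueError:
                bad.append(text)
    return bad
```

An empty list means no obviously mistyped cells were found; anything else is a value to fix or to re-declare as ss:Type="String".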
If you type or paste the path of the log file into Explorer or your text editor of choice, you may find that the folder does exist, despite being invisible.
In my case it was a <Row> with an incorrect ss:Index
I was using a template and the last row had a fixed Index=100. If the number of rows I added exceeded 100, this last row had a wrong index and excel threw the error without any other message or log (MacOSX, Excel 15.25.1). I wish they printed more informative error messages, what a waste of our time.
Excel 2016. My error message was "Worksheet Settings". Path was pointing to non-existing file.
My cause of the problem was ExpandedRowCount not big enough for number of rows in Worksheet. If you add rows in XML directly (i.e. on a machine where Excel is not installed), make sure to increment number of rows in ExpandedRowCount.
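If you generate the XML yourself, this is easy to verify with a script. Below is a minimal Python sketch, assuming the standard SpreadsheetML 2003 namespace; the function name is hypothetical, and it ignores ss:Span on rows. For each Table it works out the highest row position actually used (honouring ss:Index where present) and reports tables whose ss:ExpandedRowCount is smaller than that.

```python
import xml.etree.ElementTree as ET

# Namespace used by Excel 2003 XML (SpreadsheetML) documents.
SS = "urn:schemas-microsoft-com:office:spreadsheet"


def row_count_problems(xml_text: str):
    """Return (declared, needed) for each Table whose ExpandedRowCount is too small."""
    root = ET.fromstring(xml_text)
    problems = []
    for table in root.iter(f"{{{SS}}}Table"):
        declared = table.get(f"{{{SS}}}ExpandedRowCount")
        if declared is None:
            continue  # nothing declared, nothing to compare against
        last = 0
        for row in table.findall(f"{{{SS}}}Row"):
            index = row.get(f"{{{SS}}}Index")
            # An explicit ss:Index sets the row position; otherwise rows
            # simply follow one another.
            last = int(index) if index is not None else last + 1
        if last > int(declared):
            problems.append((int(declared), last))
    return problems
```

Each tuple gives the declared count and the count the rows actually need, so the fix is to raise ExpandedRowCount to at least the second number.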
Yes, I faced the same problem too, and the problem was with the data type of cells in the Excel file generated using XSLT.
In addition to checking the data being used vs "Type" assigned, make sure that the list of characters that need to be encoded for XML are indeed encoded.
I had a system that appeared to be working, but then some user data including & and < was throwing this error.
If you're not sure what's going on with your file, try http://www.xmlvalidation.com/ - that helped me spot the issue in a large file immediately.
I used this function to fix it, modified from this post:
function xmlsafe($s) {
    return str_replace(array('&','>','<','"'), array('&amp;','&gt;','&lt;','&quot;'), $s);
}
and then run echo xmlsafe($myvalue) where you were just echoing $myvalue in your script.
This seems to be more appropriate for XML than htmlentities() or other options built into PHP.
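For anyone generating the file from Python rather than PHP, the same idea can be sketched with the standard library. The function name xmlsafe is simply borrowed from the PHP version above; escape() handles &, < and > by default, and the extra mapping adds double quotes.

```python
from xml.sax.saxutils import escape


def xmlsafe(value: str) -> str:
    """Escape &, <, > and double quotes for use in XML text or attribute values."""
    # escape() replaces & first, so the entities it produces are not
    # escaped a second time.
    return escape(value, {'"': "&quot;"})
```

Use it wherever a raw value is written into the document, e.g. `'<Data ss:Type="String">' + xmlsafe(value) + '</Data>'`.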
I had the same issue, and the answer was that the type of the Cell was Number and some values could not be converted to this type on my backend.
I had the SAME problem,
and it's because the file is TOO BIG.
I tried a smaller extract from SAP (smaller than the one that caused the error) and saved it as an XML file, and it WORKED, no more error.
So maybe if you can save 2 Excel XML files instead of 1, it will be fine ;)
ALicia
I'm trying to write a simple Perl script that reads some fields from a password-protected XLSX file.
I've looked at Spreadsheet::XLSX and SimpleXlsx, but neither seems to support password-protected files.
Any idea how this can be done?
Using Win32::OLE
This is done like so:
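use Win32::OLE;
my $Excel = Win32::OLE->new( 'Excel.Application', 'Quit' );    # needs Excel installed on Windows
my $Book =
$Excel->Workbooks->Open( { FileName => $file, Password => $password } );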
None of the current Perl xlsx reading modules support reading encrypted files.
It isn't straightforward to decrypt these files since the encrypted XML files are stored in an OLE container document as opposed to the usual ZIP container.
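Because of that container difference, you can at least detect the situation up front by looking at the first bytes of the file. Here is a small Python sketch (the function name is made up): an ordinary .xlsx starts with the ZIP signature, while an OLE compound file starts with D0 CF 11 E0 A1 B1 1A E1. Note that an OLE signature on a file named .xlsx only suggests encryption; it could also be an old .xls that was renamed.

```python
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file signature
ZIP_MAGIC = b"PK\x03\x04"                      # local file header of a ZIP archive


def container_type(path: str) -> str:
    """Return 'ole', 'zip' or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(OLE_MAGIC):
        return "ole"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"
```

A script can then skip or flag the 'ole' files instead of failing inside the xlsx parser.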
This "should" be doable with OpenOffice/LibreOffice. There seem to be quite a few bugs around xlsx and encrypted file support, not to mention the combination, so I'd try opening the files in LibreOffice GUI first and if that works for your specific files, call it via library or command line.
OpenOffice::OODOC is the Perl connector, if that doesn't work you can use the command line to convert to a non-password protected file and then open it in your tool of choice.
I have a requirement of importing information from an excel file to a database.
I have a webpage that runs an SSIS package, which picks up an Excel file and loads the data into a database. The problem lies in the different types of Excel files to be processed, either .xls or .xlsx. The SSIS Excel connection manager lets you specify which type of Excel file you will be connecting to, either .xls or .xlsx, but you cannot use one connection manager for both types. At the moment this forces the user to always convert an .xlsx file to .xls before processing it. Is there a way to dynamically change the connection manager based on the type of Excel file,
or should I just have two different SSIS packages, called depending on which type is processed?
In SSIS 2008, you can set a Connection to a 2007 Excel file (.xlsx) and then use an Expression on the Connection Manager to set the ExcelFilePath to be the value of a variable. The value of this variable can be either type, 97-2003 (.xls) or 2007 (.xlsx) and the Excel Source will work, as long as the Sheet names are the same.
I'm not sure if this is the same behaviour in SSIS 2005.
If you are already running the SSIS package from code, I would imagine this should be relatively easy to do. I have been fiddling around with editing packages from code over the last week or so, and it is pretty easy to modify variables etc. I know you can also access the connections and specify a dtsConfig file:
using (var p = app.LoadFromSqlServer(config.PackageName, config.SqlServerName, config.UserName, config.Password, null))
{
    // changing variables in code
    Variables vars = p.Variables;
    vars["FromDate"].Value = criteria.From;
    vars["ToDate"].Value = criteria.To;
    // using a config file in code
    p.ImportConfigurationFile(config.ConfigurationFile);
    DTSExecResult result = p.Execute();
    if (result != DTSExecResult.Success)
    {
        throw new ApplicationException("SSIS Package did not complete successfully.");
    }
}
You could potentially have two different config files, one for xlsx and one for xls connections, and use the appropriate config file based on the uploaded Excel file's extension.
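The selection step itself is tiny. As a language-neutral illustration, here is a Python sketch of it; the config file names are hypothetical placeholders, so substitute whatever your packages actually use. It maps the uploaded file's extension to a config file and rejects anything else.

```python
from pathlib import Path

# Hypothetical config file names, for illustration only.
CONFIG_BY_EXTENSION = {
    ".xls": "excel_xls.dtsConfig",
    ".xlsx": "excel_xlsx.dtsConfig",
}


def config_for(upload_name: str) -> str:
    """Pick the config file that matches the uploaded file's extension."""
    ext = Path(upload_name).suffix.lower()  # compare case-insensitively
    try:
        return CONFIG_BY_EXTENSION[ext]
    except KeyError:
        raise ValueError(f"unsupported Excel extension: {ext!r}")
```

The returned name would then be what gets passed to ImportConfigurationFile in the snippet above.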