I use VBA in Excel to pull data from different sources (mostly .csv and .xls/.xlsx files) and paste them into my data tables (in the same Excel File I have a data table for each specific data source).
Each of those files comes with different settings. I have created a specific VBA macro for each of my data sources to process, remove and copy the relevant information of each individual file, and then I call all of the macros from another macro. The problem I'm having is that for one of the data sources, when using the Workbooks.Open method, I have had to set the Format parameter to "Nothing" (Format:=5). This then affects the subsequent macros, so the following files are not processed correctly.
I know I have two options: either I call this macro last, after I've processed all the other files, or I set the Format parameter explicitly in each of my macros to match the configuration of its file. However, there must be a way to simply reset the delimiter to the default one used in my Regional Settings. Does anyone know a solution?
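The second option (stating the delimiter explicitly for every source, so that no setting carries over from one file to the next) can be illustrated outside of VBA. The following is a minimal Python sketch of that idea only, not a fix for Workbooks.Open; the source names and delimiters in SOURCE_DELIMITERS and the helper read_source are hypothetical.

```python
import csv
import io

# Hypothetical per-source settings: each data source names its own delimiter,
# so parsing one file never depends on how the previous file was opened.
SOURCE_DELIMITERS = {
    "sales": ",",
    "inventory": ";",
    "legacy_export": "\t",
}


def read_source(name, text):
    """Parse delimited text for one source using that source's own delimiter."""
    delimiter = SOURCE_DELIMITERS[name]
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [row for row in reader]


# The order of the calls does not matter, because nothing is shared between them.
rows_a = read_source("inventory", "id;qty\n1;20\n")
rows_b = read_source("sales", "id,amount\n1,9.5\n")
```

The same principle applies to the VBA macros: if every call passes its own delimiter settings, the order in which the macros run should no longer matter.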
Sorry if there's already a thread with this issue but I've tried looking for it and didn't find any.
Thank you in advance.
TL;DR: Excel Workbook generated by Docx4J always says corrupted but I can't determine what Excel doesn't like about the underlying XML, let alone how to fix it.
My use case is as follows:
I am trying to produce an excel workbook with charts and graphs automatically on a regular basis. Only the raw data will change but everything else will dynamically update as the raw data is changed.
So I built an excel workbook which has a number of charts and graphs being generated by a sheet of raw data. I am using it as a template. All values of the raw data are numeric. The intent was to use Docx4J to read this 'template' and to populate the raw data sheet, then save it as a new file; opening that file would then trigger the recalculation and update the charts and graphs. Since I am new to Docx4j, I decided to take baby steps by first seeing if I could open and read the contents of the cells, which I could. So far so good. I could also change the values of the cells, but I could only verify this programmatically by writing the location and value to the console before a change, then the location and value after the change (e.g. A1=45 followed by A1=55).
My problem starts when I try to open the resulting file. It is generated and looks to be about the right size, but Excel claims it is corrupted. Excel tries to recover what it can, but ultimately fails and the workbook won't even open. For troubleshooting, I unzipped the generated xlsx and confirmed that all the various XML files that make up an xlsx file were present and readable, so I conclude that either something is missing or some part of the XML coming out the other side is not what Excel wants. Further troubleshooting involved creating an empty workbook (no data, 1 sheet) as my 'template', opening it with Docx4J, saving it back to the file system under a different name and simply trying to open it in Excel, but no dice. This rules out anything to do with my attempts to write or add data to the sheet.
Relevant Environment Information:
'template' workbook is being generated on a Windows 10 64bit machine
My docx4j code is executing on a Debian 10 Linux machine running OpenJDK 11.0.4
My version of Excel both to create the 'template' and open the copy is Excel for Office365
I am running Docx4J v11.1.3 but I also tried with v8.1.5 (in both cases I had to use the Reference Implementation of JAXB to get around a marshalling error when trying to save)
I did see another post on Stack Overflow about an issue related to fonts in Linux environments, so I made sure to install the MS TT Corefonts, but it didn't help my problem.
I ran the entire unzipped directory through BeyondCompare and there are some differences, but I don't know which are just artifacts of the two different operating systems or even which differences matter. Mostly they are:
small differences in file size
boolean values showing as "1", "yes", or "true" but not the same way for both files
namespaces and attributes in one file but not the other
Sheet1 from my blank workbook, before and after
All ideas are welcome.
Please try the just-released docx4j 8.1.6, which fixes handling of xlsx files created by recent releases of Excel. This was https://github.com/plutext/docx4j/issues/389
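The package comparison described in the question can also be scripted, since an xlsx file is a zip archive of XML parts. The following is a minimal Python sketch under that assumption; the helper name differing_parts is hypothetical, and it compares raw bytes only, so harmless differences (attribute order, "1" versus "true") will still be reported as changes.

```python
import zipfile


def differing_parts(path_a, path_b):
    """Return (parts present in only one package, shared parts whose bytes differ)."""
    with zipfile.ZipFile(path_a) as za, zipfile.ZipFile(path_b) as zb:
        names_a = set(za.namelist())
        names_b = set(zb.namelist())
        # Parts that exist in exactly one of the two packages.
        only_one = names_a ^ names_b
        # Parts that exist in both but are not byte-identical.
        changed = {n for n in names_a & names_b if za.read(n) != zb.read(n)}
    return sorted(only_one), sorted(changed)
```

Running it on the template and on the regenerated copy narrows the search to the parts that were actually altered or dropped, which are the ones worth inspecting by hand.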
I have about 10000 excel files, each of which has a picture in one specific cell. I need a script that reads all the files and saves each picture, under the same name as its excel file, in a folder.
Could you please help with that?
Thanks.
This method is based on a number of assumptions:
All the files (10000) are located in a known folder,
All files are named according to a pattern that can be reproduced programmatically (if not, you can get the list of files within the folder, store the list in an array, and loop through the array),
Pictures are always within the same worksheet or, if in more than one, the names of the worksheets can be reproduced programmatically,
The filenames used to save the pictures can match (at least as a seed) the name of the Excel file each picture is extracted from,
You can write some basic VBA.
Note that for the VBA you have at least two options:
Write it within an Excel workbook that will only serve as the extraction engine, or
Write it as a stand-alone file and run it from the command line.
The VBA logic:
Create the outer loop that processes a single file,
Within the outer loop, generate the name of the file to be opened,
Open the file using the Workbooks.Open method,
Select the worksheet and the cell containing the picture,
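Export the picture to an image file (you will need to specify the type of file to be used, e.g. .bmp). Note that Workbook.SaveAs saves the whole workbook rather than a single picture; a common workaround is to copy the picture, paste it into a temporary chart and call Chart.Export.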
A simple and efficient way to get at least part of the code generated by Excel is to record a macro for each action and then stop recording. You can then view the generated code in the VBA editor and copy-paste it into your own code (you might need to make some simple adaptations though).
That's it.
Hope you manage. Good luck!!
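As an alternative to driving Excel through VBA, the pictures can be read straight out of the files, because an .xlsx workbook is a zip archive in which Excel typically stores embedded images under xl/media/. The following is a minimal Python sketch under several assumptions: the files are .xlsx (not the older binary .xls), each workbook's embedded images are all wanted, and the position of the picture in a specific cell is not checked. The function name extract_pictures and the naming scheme for workbooks with several images are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_pictures(workbook_dir, output_dir):
    """Save the embedded images of every .xlsx in workbook_dir under the workbook's name."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for workbook in sorted(Path(workbook_dir).glob("*.xlsx")):
        with zipfile.ZipFile(workbook) as package:
            # Assumption: embedded images live under xl/media/ inside the package.
            media = sorted(n for n in package.namelist() if n.startswith("xl/media/"))
            for index, name in enumerate(media, start=1):
                suffix = Path(name).suffix
                # A single picture keeps the plain workbook name; extras get a counter.
                stem = workbook.stem if len(media) == 1 else f"{workbook.stem}_{index}"
                target = out / f"{stem}{suffix}"
                target.write_bytes(package.read(name))
                saved.append(target.name)
    return saved
```

Calling extract_pictures("workbooks", "pictures") would, under those assumptions, leave one image per workbook in the pictures folder, keeping the image's original format (for example .png or .jpeg) rather than converting it.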
I am trying to develop a way in VBA to track changes in a document without having to hide the contents in an extra sheet within the workbook.
I understand that if you change the extension of an Excel file to ".zip", you can access the Excel document as components sorted into directories. Is there a way to save and write to a text file within one of these directories so that I can access it every time the document is opened, without having to have the user drag a log file along with the Excel document?
Some facts:
While Excel has the file open, the file is locked by Excel. There is no way to write to that file from within VBA.
You can store additional data in that file externally or after the Excel workbook has been closed.
You would need code outside the workbook to write to that file after it has been closed. You may want to use VSTO or an old-school Excel add-in.
You have to ensure that Excel will not destroy your changes when restructuring or repairing the file.
At first glance, your idea of not using sheets sounds very natural from a programmer's point of view. You only have full control over Excel files when
you use external libraries (e.g. Spreadsheet Gear) or
you remote control Excel via automation.
you use openxml SDK for Excel
you use VBA
You could insert additional information and take care that this information is not discarded by Excel.
If you want to do the tracking this way, I would suggest using an Excel add-in. This kind of add-in does not actually need to be installed. Attach to the workbook open and workbook close events and ensure that all changes are written to the Excel workbook after it has been closed. You would of course have to attach to all kinds of other events to track all changes to the workbook. Keep in mind that more than one workbook can be open at a time.
Actually there are alternatives.
write your logging code in VBA or whatever fits
abstract away how you persist the logging data (e.g. use a data provider)
think about these two alternatives to store logging data:
You can save logging data in cells of excel. When using a "newer" version of excel, you have a limit of about 1 million rows (1,048,576). You may want to implement a rolling mechanism that ensures that you never go over that limit (you probably don't want to track a million changes anyway).
You can use the document properties to store your information as xml.
Last but not least, the most obvious: Why not use Excel's own functionality for tracking changes? Understand track changes in Excel 2013
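The rolling mechanism mentioned above can be sketched independently of where the log ends up being stored. The following minimal Python example assumes that only the most recent entries matter and that one worksheet row is reserved for a header; the names MAX_ROWS and make_log and the sample entries are hypothetical.

```python
from collections import deque

# Worksheet row limit in the .xlsx format (Excel 2007 and later).
MAX_ROWS = 1_048_576


def make_log(max_entries=MAX_ROWS - 1):
    """Create a log that silently drops its oldest entry once it is full.

    The default leaves one row free for a header line.
    """
    return deque(maxlen=max_entries)


# A tiny capacity is used here only to make the rolling behaviour visible.
log = make_log(max_entries=3)
for change in ["A1: 45 -> 55", "B2: 1 -> 2", "C3: x -> y", "D4: 0 -> 9"]:
    log.append(change)
```

After the fourth append, the first entry has been discarded and the log holds the three most recent changes.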
I'm currently writing a conversion function that takes data and creates an .xls file where part of the data becomes the sheet names.
My problem is, xlswrite automatically creates 3 default sheets with default names when it creates a new Excel file. Of course, these usually don't match the names in my data, so after my conversion is done, my Excel file looks almost fine, it simply has 3 leading sheets which are not supposed to be there.
Is there a way, without using ActiveX, to either stop xlswrite from creating those sheets in the first place, or delete them afterwards?
I just found out xlswrite actually uses ActiveX internally, so the answer is
No, there is no way.
Just use ActiveX.
I copy a template Excel file containing a single named sheet from the program directory to the current directory, and then write to that copy.
Use
fileparts(mfilename('fullpath'))
to get the path to the program file.
Thanks for looking at this problem. I hope I can get some help, as I am not very experienced with VBA syntax in excel.
Background:
I will be receiving a large CSV file (thousands of lines) that will contain data entries of various lengths. Each line will begin with a code (e.g. 01, 02,..., 50), followed by a series of data entries that depends on that code.
So, for example
01,data,data,data
01,data,data,data
02,data,data,data,data
etc...
I need to import all of this data into an existing excel workbook that already has separate tabs and headers created to correspond with the data type.
What I believe needs to be done is to import the csv into a new, blank sheet, then run a VBA program that checks the data code and moves each line to the corresponding tab. I would also like to preserve the formatting on the destination sheet.
Ultimately, what I think I need is a VBA program that reads the code cell, moves the line to an existing tab based on that code, and loops through the whole column.
Most of the existing solutions I have found involve the creation of new tabs, but I wish to parse the raw data into existing tabs with headers and formatting. I am aware this may require me to manually type the codes and destination tab names into the program's logic. That will not be an issue as long as I have a base to start with!
Thanks again for your help, and let me know if I can provide any more information.
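The routing logic described above (read the code in the first field, look up the destination tab, append the rest of the line there) can be sketched as follows. This is a minimal Python illustration of the logic only, not the VBA program itself; the tab names in CODE_TO_TAB, the "Unmatched" fallback and the helper route_rows are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical mapping from the leading code to the name of an existing tab.
CODE_TO_TAB = {"01": "Customers", "02": "Orders"}


def route_rows(csv_text, code_to_tab):
    """Group each line's data fields under the tab that its leading code maps to."""
    routed = defaultdict(list)
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        # Codes without a mapping are collected separately instead of being lost.
        tab = code_to_tab.get(row[0], "Unmatched")
        routed[tab].append(row[1:])
    return dict(routed)


sample = "01,data,data,data\n01,data,data,data\n02,data,data,data,data\n"
result = route_rows(sample, CODE_TO_TAB)
```

In VBA the same structure would be a loop over the imported rows with a lookup (for example a Select Case on the code) that picks the destination sheet and writes the values below its last used row, which leaves the existing headers and formatting in place.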