I am working on creating orders/receipts for a client who supplies the data in Excel. However, the workbook is not a regular table; it is a form laid out for printing. The data mapper cannot parse it correctly.
Is there a way to extract the data so that I can transform it into simple XML?
I'm wondering whether I can read the values by specifying the row and column of each cell.
I suggest writing a custom transformer that uses Apache POI to read the XLS into a simple Java POJO; you can then produce your XML with the object-to-xml-transformer.
Apache POI: https://poi.apache.org/spreadsheet/quick-guide.html#Iterator
object-to-xml-transformer:
https://docs.mulesoft.com/mule-user-guide/v/3.6/xmlobject-transformers
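The "declare which row/column" idea can be sketched independently of the reader library. The following is only an illustration in Python, assuming the sheet has already been loaded into a list of rows (with Apache POI that would be the row/cell iteration from the quick guide). The names `cell_at`, `form_to_xml`, the field-to-cell mapping, and the sample form layout are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def cell_at(rows, address):
    """Return the value at an A1-style address, or None if it lies outside the grid."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", address)
    if match is None:
        raise ValueError(f"not an A1-style address: {address!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():  # column letters are base 26: A=1 ... Z=26, AA=27
        col = col * 26 + (ord(ch) - ord("A") + 1)
    row_idx, col_idx = int(digits) - 1, col - 1
    if row_idx >= len(rows) or col_idx >= len(rows[row_idx]):
        return None
    return rows[row_idx][col_idx]


def form_to_xml(rows, field_cells, root_tag="order"):
    """Build a flat XML document from a mapping of field name -> cell address."""
    root = ET.Element(root_tag)
    for field, address in field_cells.items():
        value = cell_at(rows, address)
        ET.SubElement(root, field).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical printed-form layout: labels in column A, values in column B.
rows = [
    ["ORDER FORM", None, None],
    [None, None, None],
    ["Customer:", "ACME Ltd", None],
    ["Order no:", 1042, None],
]
field_cells = {"customer": "B3", "orderNumber": "B4"}
print(form_to_xml(rows, field_cells))
```

Keeping the cell addresses in one mapping means a change in the client's form layout only touches that mapping, not the extraction code.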
Related
My requirement is to parse SEC tabular data. A sample of the tabular data is shown in the image below.
I'm using Python. I found that the tabular data is stored in XBRL format. At first I tried to parse the XBRL data the way we parse XML, using the lxml module. Later I realized that XBRL is a complex model to parse and that there are many libraries for parsing XBRL documents. I went through several libraries, such as python-xbrl and xbrl, and installed a server (raptorXMLXBRL server) for parsing XBRL documents, but none worked as expected. As mentioned, my goal is to get the tabular data from the SEC. Sample documents can be found in this link. Can you please suggest a process or module for parsing the tabular data? Thanks in advance.
Like you, I tried parsing XBRL documents with whatever tools are available in Python, without much success. One way to work around the problem is to go to the HTML filing underlying the XBRL filing.
So, to use your example link, the URL of the first 10-K there is
https://www.sec.gov/ix?doc=/Archives/edgar/data/1551152/000155115220000007/abbv-20191231x10k.htm
Simply strip the /ix?doc= string from the URL, and you are left with
https://www.sec.gov/Archives/edgar/data/1551152/000155115220000007/abbv-20191231x10k.htm
which is the same 10-K filing, but in plain HTML format. From there you can use your usual HTML tools to extract whatever data you are interested in.
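The URL rewrite described above is a one-line string operation. A minimal sketch (the function name `strip_viewer_prefix` is made up for illustration):

```python
def strip_viewer_prefix(url):
    """Remove the first '/ix?doc=' from an EDGAR inline-viewer URL.

    URLs that do not contain the prefix are returned unchanged.
    """
    return url.replace("/ix?doc=", "", 1)


viewer_url = (
    "https://www.sec.gov/ix?doc=/Archives/edgar/data/1551152/"
    "000155115220000007/abbv-20191231x10k.htm"
)
print(strip_viewer_prefix(viewer_url))
```

The returned URL can then be fetched and handed to whichever HTML parser you already use.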
How can I read a tree-based XLS file using a Java API such as Apache POI? I want to read an XLS file that is in a tree-based format, i.e. the data is grouped.
I use an excel-based automation framework where objects' names are parameterized into the excel sheet that drives test execution.
I need to export the QTP Object Repository to Excel or another spreadsheet in a simple, readable format so that I can write a macro that fetches only the objects' logical names into the Excel sheet.
Is this possible? If so please explain.
(I understand that we have the option to export to XML format, but that is not helping much.)
You cannot export the Object Repository to Excel directly. You can only export it to XML. If you do not find the XML format useful, you will have to determine what is useful for you. You could take the XML file and convert it to different formats using XML Stylesheets (XSL). You could write a script that would parse the XML nodes for the test objects and just output the names. There are many options available once you have the data in a standard format like XML.
If you need more assistance, I suggest you post a sample of an Object Repository structure you want to export to Excel, and then post a sample of how you want that data presented in Excel.
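As an illustration of the "script that parses the XML nodes and just outputs the names" option, here is a minimal Python sketch. It assumes, without having checked an actual export, that each test object appears as an element whose tag is `Object` (in any namespace) with its logical name in a `Name` attribute. The sample XML below is invented, so adjust the tag and attribute names to match your real file.

```python
import csv
import io
import xml.etree.ElementTree as ET


def object_names(xml_text):
    """Return the Name attribute of every element whose local tag is 'Object'."""
    root = ET.fromstring(xml_text)
    names = []
    for elem in root.iter():  # document order, includes nested objects
        local_tag = elem.tag.rsplit("}", 1)[-1]  # drop any '{namespace}' prefix
        if local_tag == "Object" and "Name" in elem.attrib:
            names.append(elem.attrib["Name"])
    return names


def names_to_csv(names):
    """Render the names as a one-column CSV that Excel can open directly."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["LogicalName"])
    writer.writerows([name] for name in names)
    return buffer.getvalue()


# Hypothetical sample; a real export will have a different, richer structure.
sample = """<Repository>
  <Object Class="Browser" Name="Login">
    <Object Class="Page" Name="LoginPage">
      <Object Class="WebEdit" Name="username"/>
    </Object>
  </Object>
</Repository>"""

print(names_to_csv(object_names(sample)))
```

Writing CSV rather than a native workbook keeps the script dependency-free; the existing macro can read the single column from there.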
I need to create a LineChart and want to supply the data points via a file, i.e. the numerical values are in a text file. Is this native functionality, or do I have to do it the old-fashioned way?
There is no such native functionality (I think because it is hard to guess the desired format: CSV, XML, JSON, etc.). So you just need to implement a simple file reader that adds a Chart.Data for each line it reads.
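The parsing half of such a reader does not depend on the chart toolkit. As a rough sketch in Python, assuming one `x,y` pair per line (the file format, the `#` comment convention, and the name `read_points` are assumptions, not something the question specifies):

```python
def read_points(lines):
    """Parse an iterable of 'x,y' text lines into a list of (x, y) float pairs.

    Blank lines and lines starting with '#' are skipped.
    """
    points = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(",")
        if len(fields) != 2:
            raise ValueError(f"line {line_no}: expected 'x,y', got {line!r}")
        points.append((float(fields[0]), float(fields[1])))
    return points


# An open text file can be passed directly, since it iterates line by line.
example_lines = ["# time,value", "0,1.5", "1, 2.25", "", "2,4"]
print(read_points(example_lines))
```

Each resulting pair corresponds to one data item to add to the chart's series.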
I am using the OpenXML SDK to manipulate my Excel files.
I'd like to store some custom XML data in the xlsx file (not in the sheet cells) in a way that survives a roundtrip through Excel.
Is it possible to do this with the CustomXmlPart class? Or some other class? If so, how?
CustomXmlParts are one way to do it. For another set of approaches, see "How can I embed any file type into Microsoft Word using OpenXml 2.0" (the same concept applies to Word and Excel).