I need to show character data in HTML files. It works fine when the data is simple, but a problem arises when the data looks like tags.
Let me describe my problem.
I am showing data from database tables in HTML files (I create a table to show the data).
Now if the content in my table is something like <img src ="445521.jpg">, I get an error while parsing,
since the parser tries to find the image on my system.
In XML we have <![CDATA["content"]]> to the rescue, but I don't know what to do in HTML for this.
Moreover, I am converting this HTML to PDF, and the conversion to PDF fails with an error as well.
Can anybody tell me how to write the HTML so that the parser understands that the content is character data?
Thanks in anticipation.
Try HttpUtility.HtmlEncode (you'll have to import System.Web). This will convert the special characters to HTML entities (e.g. < → &lt;).
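For readers not on .NET, the same idea can be sketched in Python with the standard library's html.escape. This is only an illustration of entity encoding, not the code from the question; the helper name to_table_cell is made up for the example.

```python
import html


def to_table_cell(value: str) -> str:
    """Wrap a database value in a <td>, escaping it so it is shown as text."""
    # html.escape replaces &, <, > and (by default) quotes with entities,
    # so the parser treats the value as character data, not markup.
    return "<td>" + html.escape(value) + "</td>"


cell = to_table_cell('<img src ="445521.jpg">')
print(cell)  # <td>&lt;img src =&quot;445521.jpg&quot;&gt;</td>
```

Because the value no longer contains a real tag, neither the browser nor an HTML-to-PDF converter should try to load an image for it.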
My requirement is to parse SEC tabular data. Please find the sample tabular data in the image below.
I'm using Python for it. I found that the tabular data is stored in XBRL format. In the beginning, I tried to parse the XBRL data the way we parse XML, using the lxml module. Later I realized that it's a complex model to parse and that there are many libraries for parsing XBRL documents. I've gone through different libraries like python-xbrl and xbrl, and installed servers (raptorXMLXBRL server) for parsing XBRL documents, but none worked as expected. As I mentioned earlier, my goal is to get the tabular data from the SEC. We can find sample documents in this link. Can you please suggest a process or module for parsing the tabular data? Thanks in advance.
Like you, I tried parsing XBRL documents using whatever tools are available in Python, without much success. So one way to work around the problem is to get to the HTML filing underlying the XBRL filing.
So, to use your example link, the URL of the first 10-K there is
https://www.sec.gov/ix?doc=/Archives/edgar/data/1551152/000155115220000007/abbv-20191231x10k.htm
Simply strip the /ix?doc= string from the url, and you are left with
https://www.sec.gov/Archives/edgar/data/1551152/000155115220000007/abbv-20191231x10k.htm
which is the same 10-K filing, but in HTML format. From there you can just use your normal HTML tools to extract whatever data you are interested in.
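The URL rewrite described above can be sketched in a couple of lines of Python. The function name strip_inline_viewer is just a label for this example; the only thing it does is drop the first /ix?doc= from the address.

```python
def strip_inline_viewer(url: str) -> str:
    """Turn an inline-XBRL viewer URL into the URL of the underlying HTML filing."""
    # Remove only the first occurrence of the viewer prefix.
    return url.replace("/ix?doc=", "", 1)


viewer_url = (
    "https://www.sec.gov/ix?doc=/Archives/edgar/data/1551152/"
    "000155115220000007/abbv-20191231x10k.htm"
)
html_url = strip_inline_viewer(viewer_url)
print(html_url)
```

The resulting address can then be fetched and handed to whatever HTML parser you normally use.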
I'm trying to parse an XML file in order to retrieve the data in a list.
I need to extract TITRE_N, AUTEURS_N and RESUME_N. I know how to do it, but my problem is that for some references I don't have any data for AUTEURS_N. The tag is missing, and as a result all the data after it is shifted. Do you know how I can parse this document and handle the fact that a tag I usually use is sometimes missing?
thx a lot!
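One possible approach, sketched in Python with the standard library: read each record element on its own and look up the three tags inside it, so a missing AUTEURS_N becomes None instead of shifting the later values. The wrapper names REFERENCES and REFERENCE and the sample data are assumptions made for this example, since the actual file layout is not shown in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the second record has no AUTEURS_N tag.
SAMPLE = """<REFERENCES>
  <REFERENCE>
    <TITRE_N>First title</TITRE_N>
    <AUTEURS_N>A. Author</AUTEURS_N>
    <RESUME_N>First abstract</RESUME_N>
  </REFERENCE>
  <REFERENCE>
    <TITRE_N>Second title</TITRE_N>
    <RESUME_N>Second abstract</RESUME_N>
  </REFERENCE>
</REFERENCES>"""

FIELDS = ("TITRE_N", "AUTEURS_N", "RESUME_N")


def extract_records(xml_text: str) -> list:
    """Return one dict per record; fields without a tag are set to None."""
    root = ET.fromstring(xml_text)
    records = []
    for ref in root.iter("REFERENCE"):  # assumed name of the record element
        # findtext returns None when the child tag does not exist.
        records.append({tag: ref.findtext(tag) for tag in FIELDS})
    return records


records = extract_records(SAMPLE)
print(records)
```

Collecting the values per record, rather than collecting all TITRE_N, all AUTEURS_N and all RESUME_N in three separate lists, is what keeps the fields aligned.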
Does anybody know how to create a structured report using the DICOMscope toolkit via the console (Ubuntu 16.04) with a link to a related image?
The thing is that I have an image of some kind of trauma and I have to connect it with a report which is in a text file. The final file should be in .dcm format and contain the annotation and a link to the image. I have to use the DICOMscope program.
Maybe others refrain from answering because your question needs a very long answer. I cannot provide step-by-step instructions, but here are a few hints.
The way I would go (assuming that your image is available in DICOM format):
1. Obtain a sample structured report. I think that the "simple" Basic Text SR is what you want to go for. You can find some samples here.
2. Convert the SR to an XML file using dsr2xml.
3. Edit the contents in XML. Do not forget to include your image reference in (0040,a730) Content Sequence -> (0008,1199) Referenced SOP Sequence.
4. Convert the XML back to DICOM SR using xml2dsr.
By the way: From your question, I did not really understand why you want to use a structured report, as you wrote that your report is plain text. Instead of digging into the complex structure of SR, you may want to consider exporting the report to an Encapsulated PDF document which can reference images as well.
I have a submit only XPage based form that has an inputRichText field for storing screenshots and a multi file upload (using the XPages Multiple File Uploader from OpenNTF) for uploading one or more attachments. When submitted I need both the screenshots and the attachments to appear in a single rich text field which will be accessed via the Notes Client only (non XPages).
Currently the form stores the attachments and screenshots in separate fields. I have tried appending one field to the other on save (using SSJS in the submit button); however, because the screenshots are stored as MIME and the attachments as NotesRichText, it is not letting me do it.
Is there some way (preferably in SSJS) that I can convert either the MIME to RichText or vice versa so that I can append one field to the other? I have tried searching for various solutions to no avail, as well as trying different file upload controls from OpenNTF.
Ideally I need something like this to work:
var rtItemAttachments:NotesRichTextItem = docTo_Backend.getFirstItem("attachments"); //This is the field I want everything in
var rtItemFiles:NotesRichTextItem = docTo_Backend.getFirstItem("uploadedFiles");
rtItemAttachments.appendRTItem(rtItemFiles); //Fails on this line
docTo_Backend.removeItem("uploadedFiles");
Repeat after me: there is no RichText on the web, all there is is MIME.
You can set the RT field to store its content in MIME (a property). This makes things much easier.
To stitch things together you need to stick with MIME. These are roughly the steps:
1. Get the text and images as MIME.
2. Get your attachments as a stream (the embeddedObjects has a method for that).
3. Convert the stream to BASE64 and create a new MIME part with it. (Looking at the source of an eMail with an attachment that someone sent through the internet should give you a pretty good idea of what it looks like.)
You end up with:
MimeHeader
MimePart for Text (HTML)
MimePart for Screenshots (if they are not inline images in html)
MimeParts for attachments
The special effect: if you add links to the attachments to the HTML, it looks nicer.
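To get a feel for what the resulting MIME structure looks like, here is a small sketch using Python's standard email package. It is not Domino or SSJS code and does not use the Notes MIME classes; it only illustrates the layout described above (one HTML part plus one BASE64-encoded part per attachment). The function name and the sample file are made up for the example.

```python
from email.message import EmailMessage


def build_message(html_body: str, attachments: dict) -> EmailMessage:
    """Build a multipart message: one HTML part, then one part per attachment."""
    msg = EmailMessage()
    msg.set_content(html_body, subtype="html")  # the text (HTML) part
    for filename, data in attachments.items():
        # Binary payloads are BASE64-encoded by default; adding the first
        # attachment turns the message into multipart/mixed.
        msg.add_attachment(
            data,
            maintype="application",
            subtype="octet-stream",
            filename=filename,
        )
    return msg


msg = build_message("<p>See the attached files.</p>", {"notes.txt": b"hello"})
print(msg.as_string())  # shows the headers, boundaries and BASE64 body
```

Printing the message gives roughly the same picture as viewing the source of an eMail with an attachment.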
Of course the BIG question: WHY?
You could simply design a Notes form that has two fields, no need to fold it into one. Hope that helps.
A good piece of code to look at to understand the MIME stuff is the OpenNTF eMail bean
Requirement:
1. I need to export a table as an Excel file.
2. I first render it in an HTML page. The page has a button for the export.
What I tried:
1. I get the HTML from the page:
document.getElementById('content').value = document.getElementById('containerId').innerHTML;
form1.submit();
2. On the server I receive it and set response.ContentType = "application/vnd.ms-excel;" // this requires Microsoft Excel to be installed on the client.
3. I get the expected Excel file "XXXX.xls".
4. But when I open it, Excel shows a warning along the lines of
"The file is not in the right Excel format, are you sure you want to open it?"
I'm sorry to see that.
So I want to import the HTML section into a real Excel file and then send that Excel file back to the user agent.
I use the Aspose.Cells library in my project, but I don't know how to use it for this task. Or is there another solution?
If you need to parse HTML tags or an HTML fragment into an Excel spreadsheet using Aspose.Cells for .NET, you can use the Cell.HtmlString property to set the desired HTML segment in a cell; it will be parsed accordingly in the generated Excel file. Mind you, not all HTML tags are supported at the moment.
Aspose.Cells for .NET also supports converting an Excel file to an HTML file directly. See the documentation on which file formats are supported for conversion; it may help as a reference:
http://www.aspose.com/documentation/.net-components/aspose.cells-for-.net/opening-files.html
http://www.aspose.com/documentation/.net-components/aspose.cells-for-.net/saving-files.html
If you still have any issues or confusion, kindly give us details along with your sample code using the Aspose.Cells API, and we can help you.