Better way to store html content in mongodb - node.js

Is there a way to store HTML content in the DB without the literal HTML tags? In other words, I don't want the stored string to look like
'<b>This</b> is a sample'; instead it could be in some kind of encoded format.
So far I have found that I can encode the HTML content before storing it using a third-party library:
https://github.com/mathiasbynens/he
However, is there a better approach that does not rely on a third-party library?
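For illustration only, here is a minimal sketch of what "encoding before storing" can look like with nothing but a standard library. It is written in Python rather than Node.js (an assumption made for this example, not the asker's stack), and the helper names `encode_for_storage` and `decode_from_storage` are hypothetical.

```python
import html

def encode_for_storage(fragment: str) -> str:
    """Escape &, <, > and quotes so the stored string holds no literal tags."""
    return html.escape(fragment, quote=True)

def decode_from_storage(stored: str) -> str:
    """Reverse the escaping when the content is read back."""
    return html.unescape(stored)

original = "<b>This</b> is a sample"
stored = encode_for_storage(original)
print(stored)                                   # the entity-encoded form
print(decode_from_storage(stored) == original)  # round-trips back to the markup
```

The same idea applies in any language: escape on write, unescape on read, and store only the escaped form.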

Related

Azure search adding documents to index approaches

I am not sure I will be able to describe this correctly, but I'll give it a go.
We are working on implementing Azure Search. At the core, we have searchable PDF documents whose text we want added to the index so that all of them are searchable.
The initial thought was to submit each document to the index via the add document REST API, on the assumption that this would be the simplest and quickest path
to getting the text of the document into the index. We also considered using an indexer: keep all the searchable PDFs in a blob store and have the indexer
crawl them every 10-15 minutes.
We also looked into (based on a recommendation) submitting a standalone JSON file containing the text from the PDF, either via the same add document API or
by placing that file in a blob store. The JSON document would need to include document identifiers that give the index the location of the PDF, so that when the text is found
via search, we can make the result clickable and open the PDF.
It seems to me that pushing the JSON file in with the add document API would work: it gets indexed, and when it shows up in a search we can use the doc ID to link back to the PDF and open it.
For those of you who have used Azure Search: how did you implement this?
If you're sure that only PDFs will live in this particular index, then the first approach is faster to implement, since the native indexer can extract the content of each PDF document and push it to the index.
Both approaches will work, but for the second one you would need to extract the PDF text yourself using an external tool.
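As a rough sketch of the second approach, the snippet below builds the JSON body for a document upload batch. The field names `id`, `content`, and `pdf_url` and the example URL are hypothetical and would have to match your own index schema; the text is assumed to have been extracted from the PDF already, and no HTTP call is made here.

```python
import json

def build_upload_batch(docs):
    """Wrap documents in a {"value": [...]} batch, marking each one as an upload."""
    return {"value": [{"@search.action": "upload", **doc} for doc in docs]}

batch = build_upload_batch([
    {
        "id": "doc-001",                                    # hypothetical document key
        "content": "Text extracted from the PDF",           # searchable text
        "pdf_url": "https://example.com/docs/doc-001.pdf",  # link back to the PDF
    },
])
body = json.dumps(batch)
print(body)
```

The resulting body would be POSTed to the index's documents endpoint with an API key; a search hit then carries `pdf_url`, which the UI can turn into a clickable link.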

how to get and display photo from ldap

I'm using ldap3.
I can connect and read all attributes without any issue, but I don't know how to display the photo of the attribute thumbnailPhoto.
If I print(conn.entries[0].thumbnailPhoto) I get a bunch of binary values like b'\xff\xd8\xff\xe0\x00\x10JFIF.....'.
I have to display it on a Bottle web page, so I need to get this value into a JPEG or PNG file.
How can I do that?
The easiest way is to save the raw byte value to a file and open it with a picture editor. The photo is probably a JPEG, but it can be in any format.
Have a look at my answer at Display thumbnailPhoto from Active Directory in PHP. It's written for PHP, but the concept is the same for Python.
Basically, you either embed the base64-encoded raw data as a data stream, or you write a temporary file that is served (or used to determine the MIME type).
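Both options can be sketched in a few lines of Python. The helper names below are hypothetical, the MIME sniffing only covers JPEG and PNG, and the sample bytes are just the header shown in the question; with ldap3 the real bytes would presumably come from something like `conn.entries[0].thumbnailPhoto.value`.

```python
import base64

def guess_mime(raw: bytes) -> str:
    """Guess the image type from its magic bytes (JPEG and PNG only)."""
    if raw.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if raw.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    return "application/octet-stream"

def to_data_uri(raw: bytes) -> str:
    """Option 1: embed the photo directly in the page as a base64 data URI."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{guess_mime(raw)};base64,{encoded}"

def save_photo(raw: bytes, path: str) -> None:
    """Option 2: write the raw bytes to a file that the web server can serve."""
    with open(path, "wb") as fh:
        fh.write(raw)

sample = b"\xff\xd8\xff\xe0\x00\x10JFIF"  # truncated header, not a full image
img_tag = f'<img src="{to_data_uri(sample)}" alt="thumbnail">'
print(img_tag)
```

In a Bottle template, the `img_tag` string (or just the data URI) can be placed in the page, which avoids writing a temporary file at all.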

Is it possible to add an image to a PDF without rendering the PDF?

I'm looking at adding an image to an existing PDF in Node.js. None of the PDF libraries I found appear to have the ability to modify an existing PDF though, so I'm planning on implementing it myself. I'm trying to figure out if it's too much work, as I can always do it server side using iTextPDF instead, but I'd prefer to do it in my app (Electron which uses Node.js).
If I just want to modify an existing PDF and add an image, will I have to write a complete rendering library or is PDF structured in such a way that I can write a very small parser that just gets the page I want and inserts an image using the correct format?
Specifically, I'm asking because I've previously looked into writing a text extraction library, but in order to get the position of text you have to render pretty much the entire PDF because of how positioning is handled. That's too much work just to get around server-side processing in this case.
To be clear, just asking if it's possible to do, not how to do it (don't want to be too broad, I'm sure I can figure that part out).
To perform even a small manipulation of a PDF, you'll need to implement generalized reading, decompression, encryption and traversal of PDF data structures. Some of the things you would need to handle include:
basic parsing of PDF syntax
indexing via the cross reference index, and/or cross reference index and object streams
objects (num, byte-string, hex string, dictionary, arrays, booleans...)
filters and variants (LZW, Flate, RunLength, Predictors)
encryption (RC4, AES, Custom security handlers)
page tree traversal
basic handling of page content streams
image handling
serialization, either rewriting of the entire PDF, or incremental updates to an existing PDF
Anything's possible, but realistically, you will need a PDF library or toolkit, client or server-side, to accomplish this.
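To give a sense of scale, here is a minimal sketch of only the very first step: locating the cross-reference data by reading the `startxref` offset near the end of the file. The function name and the synthetic byte string are illustrative assumptions, and the sample is not a valid PDF.

```python
import re

def find_startxref(pdf_bytes: bytes) -> int:
    """Return the byte offset announced by the last startxref keyword."""
    tail = pdf_bytes[-1024:]  # the keyword sits near the end of the file
    matches = re.findall(rb"startxref\s+(\d+)", tail)
    if not matches:
        raise ValueError("no startxref keyword found")
    return int(matches[-1])

# Synthetic tail of a file, for illustration only.
fake_tail = (
    b"xref\n0 1\n0000000000 65535 f \n"
    b"trailer\n<< /Size 1 >>\n"
    b"startxref\n25\n%%EOF\n"
)
print(find_startxref(fake_tail))
```

Everything after this point (parsing the table or stream found at that offset, resolving objects, decoding filters, handling encryption, walking the page tree, and writing an update) is where the real work lies.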

Exporting data from JSON

We are creating a map application that has a list view of the resultant set using JSON and jQuery and presenting the result to the user. How can we give the user the ability to download the result as an Excel file/CSV file?
Assuming a web application...
Convert JSON to CSV
Present it as an HTTP response.
You mentioned JSON and jQuery. I assume this is being used on the client side? There are a number of utilities for making the JSON-to-CSV conversion depending on the language you would want to use. Here are some examples.
Javascript/jQuery
PHP
Python
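As a minimal sketch of the conversion step in Python, assuming the result set is a JSON array of flat objects (the function name and the sample records are made up for illustration):

```python
import csv
import io
import json

def json_to_csv(json_text: str) -> str:
    """Convert a JSON array of flat objects into CSV text with a header row."""
    rows = json.loads(json_text)
    # Collect column names in first-seen order so every key gets a column.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\r\n")
    writer.writeheader()
    writer.writerows(rows)  # missing keys are written as empty cells
    return buffer.getvalue()

sample = '[{"name": "Cafe A", "lat": 51.5, "lng": -0.12}, {"name": "Shop B", "lat": 48.85, "lng": 2.35}]'
csv_text = json_to_csv(sample)
print(csv_text)
```

To present it as a download, the server would return `csv_text` with a `Content-Type: text/csv` header and a `Content-Disposition: attachment` header naming the file; Excel opens CSV files directly.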

Question of UIWebView and Core Data

I want to develop a news app such as Engadget. The news has been loaded from the server, and now I want to save it, including body text and pictures, into a database (Core Data). Can UIWebView read the data from Core Data directly and show it?
Thanks.
Yes and no. You can store the HTML content in the database (Core Data); to show an article, use loadHTMLString:baseURL: of UIWebView to display the textual content. The images, however, are probably best stored outside the database as files, because the image references in your HTML have to point to actual files.
You could store the images as BLOBs, but then you need to pull those BLOBs out and write them to files later for UIWebView to be able to pick them up.
I think the easiest way is to store the images as files but keep references to them in Core Data. That way you can also delete them accordingly later on.
If by directly you mean without any glue code, then no, not on iOS at present.
You could, however, use Core Data to store text and image objects as desired. A UIWebView might not be the best choice, but to answer your question, it's definitely possible, and in fact quite easy.
