How to extract attachments from a PDF in Node.js - node.js

I have limited knowledge of handling PDFs and I need to extract an attached file from a PDF.

Related

Node js - converting pdf to valid version

I have various PDF files that fail a certain processing step because they are invalid.
I use https://www.pdf-online.com/osa/validate.aspx to check them, and when I validate a PDF, I get a message saying that it does not conform to the PDF 1.3 or 1.4 standard.
I'm familiar with converting the PDF to text/JSON/buffer and then rebuilding it and saving it as a new PDF file, but I was wondering whether there is an alternative. Each PDF is different and is basically user input, so rebuilding it with jspdf, for example, would be different for every file.
Is it possible to convert such a PDF document so that it conforms to the PDF 1.3/1.4 standards?

Extract embedded pdf from word document (docx) file

I am able to extract the embedded images using XWPF (https://poi.apache.org/apidocs/dev/org/apache/poi/xwpf/usermodel/XWPFDocument.html), but I am unable to extract an embedded PDF from a docx file.
Can anyone please suggest something on this?

Excel file (.xlsx) to PDF Conversion using microsoft graph api is ignoring page setup instructions

I already have an Excel file in .xlsx format. I am trying to convert it to PDF using the Microsoft Graph API (by uploading the file to OneDrive and then downloading it as PDF). I am using the following API call:
https://graph.microsoft.com/v1.0/me/drive/items/[item-id]/content?format=pdf
I see that the PDF conversion performed by the above API doesn't honor all the page setup parameters that are set in the underlying .xlsx file. More specifically, the converted PDF is always rendered in landscape mode and seems to ignore the fit to width/height/page settings. If I open the same Excel file locally in Excel and save the document as PDF, it renders the document correctly, interpreting all the page setup parameters properly.
Any help would be greatly appreciated as to how I can get the PDF conversion API to render the PDF according to the orientation (portrait/landscape) and page width/height settings in the .xlsx file.
I have tried multiple smaller files with different page setup parameters, but the PDF conversion (using the REST API) always returns the document in landscape mode and seems to ignore the fit to page/width/height settings.
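For reference, the request described above can be sketched roughly as follows. This is only an illustration of the call, not a fix for the page setup behaviour. The helper names (buildPdfConversionUrl, downloadAsPdf) and the accessToken parameter are hypothetical, and the sketch assumes Node 18+ where fetch is available globally.

```javascript
// Base URL taken from the API call quoted in the question.
const GRAPH_BASE = "https://graph.microsoft.com/v1.0";

// Build the download-as-PDF URL for a drive item (hypothetical helper).
function buildPdfConversionUrl(itemId) {
  return `${GRAPH_BASE}/me/drive/items/${encodeURIComponent(itemId)}/content?format=pdf`;
}

// Download the converted PDF as a Buffer.
// accessToken is assumed to be a valid OAuth bearer token obtained elsewhere.
async function downloadAsPdf(itemId, accessToken) {
  const response = await fetch(buildPdfConversionUrl(itemId), {
    headers: { Authorization: `Bearer ${accessToken}` },
  });
  if (!response.ok) {
    throw new Error(`PDF conversion request failed with status ${response.status}`);
  }
  return Buffer.from(await response.arrayBuffer());
}

// Example with a made-up item id; only the URL is built here, no request is sent.
const exampleUrl = buildPdfConversionUrl("01ABCDEF");
console.log(exampleUrl);
```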

PDF form to HTML conversion in angular 2?

In my application I upload a PDF file. After uploading, I need to display the information in the PDF file in an HTML form. We are using Angular 2 for the frontend and Node.js for the backend. Can anyone help me with this?
Please remember: PDF to HTML.
One option is to convert your PDF to JSON using pdf2json.
pdf2json is a node.js module that parses and converts PDF from binary to json format, it's built with pdf.js and extends it with interactive form elements and text content parsing outside browser. The goal is to enable server side PDF parsing with interactive form elements when wrapped in web service, and also enable parsing local PDF to json file when using as a command line utility.
Install it with: npm install pdf2json
Create an empty JSON object whose keys are the main headings from the PDF, such as customer, age, etc. Its values are obtained from the uploaded PDF.
Use these JSON values to fill your form. When the form is saved, store it in your DB using Node.js. Is this what you want?
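The steps above could look something like the following minimal sketch. It does not call pdf2json itself; it works on an object shaped like the parser's output, which is assumed here to be { Pages: [ { Texts: [ { R: [ { T } ] } ] } ] } with URI-encoded text in T (the exact shape may differ between pdf2json versions). The function names and the "Label: value" layout of the PDF text are hypothetical.

```javascript
// Collect all decoded text runs from a pdf2json-style result.
// Assumed shape: { Pages: [ { Texts: [ { R: [ { T: "<URI-encoded text>" } ] } ] } ] }
function extractTextRuns(pdfData) {
  const runs = [];
  for (const page of pdfData.Pages || []) {
    for (const text of page.Texts || []) {
      for (const run of text.R || []) {
        runs.push(decodeURIComponent(run.T));
      }
    }
  }
  return runs;
}

// Fill an "empty JSON" model whose keys are the headings expected in the PDF.
// Hypothetical assumption: each text run looks like "Label: value".
function fillFormModel(emptyModel, textRuns) {
  const model = { ...emptyModel };
  for (const line of textRuns) {
    const separator = line.indexOf(":");
    if (separator === -1) continue; // not a "Label: value" line
    const key = line.slice(0, separator).trim().toLowerCase();
    const value = line.slice(separator + 1).trim();
    // Only fill keys that were declared in the empty model.
    if (Object.prototype.hasOwnProperty.call(model, key)) {
      model[key] = value;
    }
  }
  return model;
}

// Hand-written sample standing in for real parser output.
const samplePdfData = {
  Pages: [
    {
      Texts: [
        { R: [{ T: "Customer%3A%20Jane%20Doe" }] },
        { R: [{ T: "Age%3A%2034" }] },
        { R: [{ T: "Thank%20you" }] },
      ],
    },
  ],
};

const formModel = fillFormModel(
  { customer: "", age: "" },
  extractTextRuns(samplePdfData)
);
console.log(formModel);
```

The resulting object can be returned from the Node.js backend as JSON and bound to the form fields on the Angular side.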
Put simply, what you need is to render a PDF in your application.
You could use the ng2-pdf-viewer library.
Almost all the basic functionality is available as properties of this component. You can adapt it to your requirements.

How to send data into an OpenOffice word template from NodeJS

How can I pass data from NodeJS into the placeholders of an OpenOffice template file? Are there any npm packages available to parse an ODT template file so that I can print data into it?
I have a 13-page word file (a template for printing reports) and I want to populate it with certain details from the DB on the different pages of this file. I would like to pass the data in JSON format.
I know how to write to a plain text/Excel file from Node, but I want to write into the placeholders of a word template without losing other parts of the template. I did the same with VBScript (with a Microsoft Word template) in the past and now want to achieve the same using NodeJS. Please share your ideas with me. Thanks.

Resources