In a Node.js app I want to generate PDF documents and send them back to the user. I would like to use Prawn PDF, as I have used it before and am comfortable with it.
I suppose I should use Node's child_process.spawn to call a Ruby script (that returns a PDF) to achieve this, but I do not know how to actually implement it.
I am doing this:
spawn = require('child_process').spawn;
pdf = spawn('my_ruby_script');
Now how do I get hold of the returned pdf doc?
Thanks,
mano
I ended up with this eventually:
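var spawn = require('child_process').spawn;
var child = spawn('ruby', ['print_pdf.rb', doc_id]);
var chunks = [];
// The output arrives on the child's stdout stream, not on the child object itself.
child.stdout.on('data', function(data){
  // Keep the raw Buffers; concatenating them as strings would corrupt binary data.
  chunks.push(data);
});
// 'close' fires once the process has ended and its stdio streams are finished.
child.on('close', function(code){
  if(code === 0){
    res.setHeader('Content-Type', 'application/pdf');
    res.send(Buffer.concat(chunks));
  }
});
The Ruby Prawn script generates the PDF and at the end just 'puts' the rendered PDF, which reaches Node as 'data' events on child.stdout.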
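The same idea can be wrapped in a small helper that resolves with a single Buffer. This is only a sketch: the name collectStdout is made up for illustration, and the Node binary stands in for the Ruby script so that the snippet runs without Ruby installed.

```javascript
const { spawn } = require('child_process');

// Run a command and resolve with everything it wrote to stdout, as one Buffer.
// Chunks are kept as Buffers so binary output (such as a PDF) is not corrupted
// by string concatenation.
function collectStdout(command, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    const chunks = [];
    let errorText = '';

    child.stdout.on('data', (chunk) => chunks.push(chunk));
    child.stderr.on('data', (chunk) => { errorText += chunk; });
    child.on('error', reject); // for example, the command was not found
    child.on('close', (code) => {
      if (code === 0) {
        resolve(Buffer.concat(chunks));
      } else {
        reject(new Error(`exited with code ${code}: ${errorText}`));
      }
    });
  });
}

// Stand-in for `ruby print_pdf.rb`: a child process that writes four raw bytes.
const demo = collectStdout(process.execPath, [
  '-e',
  'process.stdout.write(Buffer.from([0x25, 0x50, 0x44, 0x46]))',
]);
demo.then((buf) => console.log(buf.length, buf.toString('latin1')));
```

In a request handler, the resolved Buffer could then be passed to res.send after setting the Content-Type header, and a rejection could be turned into an error response instead of leaving the request hanging.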
I use an NPM library to parse markdown to HTML like this:
var Markdown = require('markdown-to-html').Markdown;
var md = new Markdown();
...
md.render('./test', opts, function(err) {
md.pipe(process.stdout)
});
This outputs the result to my terminal as intended.
However, I need the result inside my running Node program. I thought about writing the output stream to a file and reading it back in later, but I can't figure out how to write the output to a file.
I tried playing around with var file = fs.createWriteStream('./test.html'); but Node.js streams give me more headaches than results.
I've also looked into the library's repo and Markdown inherits from Readable via util like this:
var util = require('util');
var Readable = require('stream').Readable;
util.inherits(Markdown, Readable);
Any resources or advice would be highly appreciated. (I would also take another library for parsing the markdown, but this gave me the best results so far)
Creating a writable file stream and piping the Markdown output into it should work just fine. Try it with:
const fs = require('fs');
const writeStream = fs.createWriteStream('./output.html');
md.render('./test', opts, function(err) {
  if (err) { return console.error(err); }
  md.pipe(writeStream);
});
// Write errors (for example a missing directory) are reported on the stream
writeStream.on('error', function (err) {
  console.log(err);
});
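If the goal is to have the HTML available inside the program rather than in a file, the readable stream can also be collected into a string. The following is a minimal sketch, not part of the markdown-to-html API: streamToString is a hypothetical helper, and Readable.from is used here as a stand-in for the md object after md.render has called back.

```javascript
const { Readable } = require('stream');

// Read a stream to the end and resolve with its contents as one string.
function streamToString(readable) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    readable.on('data', (chunk) => chunks.push(Buffer.from(chunk)));
    readable.on('error', reject);
    readable.on('end', () => resolve(Buffer.concat(chunks).toString('utf8')));
  });
}

// Stand-in for the Markdown stream: any Readable is handled the same way.
const fakeMarkdownOutput = Readable.from(['<h1>', 'Title', '</h1>\n']);
const htmlPromise = streamToString(fakeMarkdownOutput);
htmlPromise.then((html) => console.log(html));
```

With the real library, the call would presumably be streamToString(md) inside the render callback, in place of md.pipe(...).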
I am attempting to run a script that is archived inside an ASAR file like so:
var spawn = require('child_process').spawn;
var t = spawn('node', ['./bundle.asar/main.js'], {});
t.on('data', function(data){
console.log(data.toString());
});
t.stdout.pipe(process.stdout);
t.stderr.pipe(process.stderr);
FYI the above script is situated outside the ASAR archive.
However, all I get is the following error:
Cannot find module 'C:\Users\MyUser\tests\asar-test\bundle.asar\main.js'
The official docs on this particular issue are nonexistent.
Is there some way to either read the ASAR file or require a script inside it?
Thank you.
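For posterity, the answer to this is to require your .js file like so (this relies on a runtime such as Electron, which patches require to read ASAR archives; a plain node binary cannot open them):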
var spawn = require('child_process').spawn;
var myScript = require('./bundle.asar/main.js');
var t = spawn('node', [myScript], {});
t.on('data', function(data){
console.log(data.toString());
});
t.stdout.pipe(process.stdout);
t.stderr.pipe(process.stderr);
Hope this helps.
I found some repos that do not look like they are still maintained:
https://github.com/gfloyd/node-unoconv
https://github.com/skmp/node-msoffice-pdf
...
I tried the approach with LibreOffice, but the PDF output is so bad that it is not usable (text ends up on different pages, etc.).
If possible, I would like to avoid starting any background processes and/or saving the file on the server. The best solution would be one where I can use buffers. For privacy reasons, I cannot use any external service.
doc buffer -> pdf buffer
Question:
How can I convert docs to PDF in Node.js?
For those who might stumble on this question nowadays:
There is a cool tool called Gotenberg, a Docker-powered stateless API for converting HTML, Markdown and Office documents to PDF. It supports converting DOCs via unoconv.
I also happen to be the author of a JS/TS client for Gotenberg: gotenberg-js-client
You are welcome to use it :)
UPD:
Gotenberg has a new website now: https://gotenberg.dev
While building an application, I needed to convert a doc or docx file uploaded by a user into a PDF file for further analysis. I used the npm package libreoffice-convert for this purpose. libreoffice-convert requires LibreOffice to be installed on your Linux machine. Here is the sample code that I used.
This code is written in JavaScript for a Node.js-based application.
const libre = require('libreoffice-convert');
const path = require('path');
const fs = require('fs').promises;
const { promisify } = require('util');
// libre.convert takes a callback; wrap it so it can be awaited
const lib_convert = promisify(libre.convert);
async function convert(name = "myresume.docx") {
  try {
    // File name without its extension, e.g. "myresume"
    const baseName = path.parse(name).name;
    const enterPath = path.join(__dirname, `/public/Resume/${name}`);
    const outputPath = path.join(__dirname, `/public/Resume/${baseName}.pdf`);
    // Read the source document into a Buffer
    const data = await fs.readFile(enterPath);
    // Convert the Buffer to a PDF Buffer (the third argument is an optional filter)
    const done = await lib_convert(data, '.pdf', undefined);
    await fs.writeFile(outputPath, done);
    return { success: true, fileName: baseName };
  } catch (err) {
    console.log(err);
    return { success: false };
  }
}
The resulting PDF is of very good quality.
To convert a document into PDF, we can use the Universal Office Converter (unoconv) command-line utility.
It can be installed with your OS package manager, e.g. on Ubuntu using apt-get:
sudo apt-get install unoconv
As per the unoconv documentation:
If you installed unoconv by hand, make sure you have the required LibreOffice or OpenOffice packages installed
The following example demonstrates how to invoke the unoconv utility:
unoconv -f pdf sample_document.py
It generates a PDF document that contains the content of sample_document.py.
If you want to use it from a Node.js program, you can invoke the command through a child process.
The code below demonstrates how to use a child process to run unoconv and create a PDF:
const util = require('util');
const exec = util.promisify(require('child_process').exec);
async function createPDFExample() {
const { stdout, stderr } = await exec('unoconv -f pdf sample.js');
console.log('stdout:', stdout);
console.log('stderr:', stderr);
}
createPDFExample();
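When the file name comes from a user upload, it may be safer to pass it as a separate argument instead of building a shell command string. The sketch below uses execFile for that. The helper names are made up for illustration, and the output location (next to the input file, with the new extension) is an assumption about unoconv's default behaviour that should be checked against your installation.

```javascript
const { execFile } = require('child_process');
const { promisify } = require('util');
const path = require('path');

const execFileAsync = promisify(execFile);

// Build the argument list for `unoconv -f <format> <input>`.
// Each element is passed to the program as-is; no shell parses it.
function unoconvArgs(inputPath, format = 'pdf') {
  return ['-f', format, inputPath];
}

// Assumption: unoconv writes its result next to the input file,
// keeping the base name and swapping the extension.
function expectedOutputPath(inputPath, format = 'pdf') {
  const { dir, name } = path.parse(inputPath);
  return path.join(dir, `${name}.${format}`);
}

// Requires unoconv to be installed, so it is defined but not called here.
async function convertToPdf(inputPath) {
  await execFileAsync('unoconv', unoconvArgs(inputPath));
  return expectedOutputPath(inputPath);
}
```

A call such as convertToPdf('uploads/report.docx') would then resolve with the path where the PDF is expected to be, or reject if unoconv exits with an error.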
Posting a slightly modified version for Excel, based on the answer provided by @shubham singh. I tried it and it worked perfectly.
const fs = require('fs').promises;
const path = require('path');
const { promisify } = require('bluebird');
const libre = require('libreoffice-convert');
const libreConvert = promisify(libre.convert);
// await is only valid inside an async function in a CommonJS module
async function excelToPdf() {
  // directory of the script that was started with node
  const workDir = path.dirname(require.main.filename);
  // read excel file
  const data = await fs.readFile(path.join(workDir, 'my_excel.xlsx'));
  // create pdf file from excel
  const pdfFile = await libreConvert(data, '.pdf', undefined);
  // write new pdf file to directory
  await fs.writeFile(path.join(workDir, 'my_pdf.pdf'), pdfFile);
}
excelToPdf().catch(console.error);
Docx to pdf
A library that converts docx file to pdf.
Installation:
npm install docx-pdf --save
Usage
var docxConverter = require('docx-pdf');
docxConverter('./input.docx','./output.pdf',function(err,result){
  if(err){
    // stop here so a failed conversion is not reported as a result
    console.log(err);
    return;
  }
  console.log('result'+result);
});
In general, the call is docxConverter(inputPath, outputPath, callback), where the callback receives (err, result).
The output should be output.pdf, which will be produced at the output path you provided.
I'm attempting to download a file using the http module in Node. While the file seems to download successfully, the resulting file cannot be opened using gzip. I've tried downloading the file through other methods, and that works, and I've tried multiple ways of opening the resulting gzip'd file, but all of those produce the same error.
I did attempt to use the request module, but there seemed to be no way of accessing the returned HTTP headers before the file was finished downloading, which I need because I'd like to offer some sort of visual indicator as to how long this file is going to take to download.
This is (roughly) the code that I've got so far.
var http = require('http');
var fs = require('fs');
var progress = 0;
var downloadFile = function() {
  http.get(FILE_URL, function(response) {
    var maxBytes = parseInt(response.headers['content-length'], 10);
    var dumpFile = fs.createWriteStream(FILENAME + '.dl');
    response.pipe(dumpFile);
    response
      .on('data', function(chunk) {
        progress += chunk.length;
        // progressbar-type code here
      })
      .on('end', function() {
        // pass
      });
    dumpFile.on('finish', function() {
      // rename only after the file has actually been closed
      dumpFile.close(function() {
        fs.rename(FILENAME + '.dl', FILENAME, function(err) {
          if (err) { console.error(err); }
        });
      });
    });
  });
};
So my question: How would you advise I download a file, bearing in mind it's a large file and I need some sort of visual indicator for download progress? Should I give up on http? Or am I doing something monumentally stupid?
Thanks!
I'm trying to create a node server that spawns phantomjs processes to create screenshots. The grab.js script works fine when executed and I've confirmed that it writes to stdout. Problem is the node code that spawns the process simply hangs. I've confirmed that phantomjs is in the path. Anyone know what might be happening here or how I might troubleshoot this?
Here's the phantomjs code (grab.js) that renders the page and writes the data to stdout:
var page = require('webpage').create(),
system = require('system'),
fs = require('fs');
var url = system.args[1] || 'google.com';
page.viewportSize = {
width: 1024,
height: 1200
};
page.open(url, function() {
var b64 = page.renderBase64('png');
fs.write('/dev/stdout', b64, 'w');
phantom.exit();
});
And here's the Node code that spawns the phantom process and prints the result (hangs):
var http = require('http'),
exec = require('child_process').exec,
fs = require('fs');
exec('phantomjs grab.js google.com', function(error, stdout, stderr) {
console.log(error, stdout, stderr);
});
I have had similar issues with exec; I then switched to using spawn instead and it worked.
According to this article, use spawn when you want the child process to return huge binary data to Node, and use exec when you want the child process to return simple status messages.
hth
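One concrete difference behind that advice can be sketched as follows: exec collects all output in memory and fails once its maxBuffer option is exceeded, while spawn hands the output over chunk by chunk. This is only an illustration, and it is not claimed to be the cause of the hang in the question. The Node binary stands in for phantomjs, and maxBuffer is set artificially low to trigger the limit.

```javascript
const { exec, spawn } = require('child_process');

// A child that prints 100000 characters, standing in for a program with large output.
const script = "process.stdout.write('x'.repeat(100000))";
const command = `"${process.execPath}" -e "${script}"`;

// exec buffers all output and reports an error once maxBuffer is exceeded.
function runWithExec() {
  return new Promise((resolve) => {
    exec(command, { maxBuffer: 1024 }, (error) => {
      resolve({ errorCode: error ? error.code : null });
    });
  });
}

// spawn delivers output as a stream of chunks, so no buffer limit applies.
function runWithSpawn() {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', script]);
    let bytes = 0;
    child.stdout.on('data', (chunk) => { bytes += chunk.length; });
    child.on('error', reject);
    child.on('close', (code) => resolve({ code, bytes }));
  });
}

const results = Promise.all([runWithExec(), runWithSpawn()]);
results.then(([viaExec, viaSpawn]) => console.log(viaExec, viaSpawn));
```

For a base64-encoded screenshot, which can easily be large, the spawn variant avoids having to guess a suitable maxBuffer value in advance.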
I had the same problem; in my case the cause was not in Node.js but in PhantomJS (v2.1).
It's a known problem that PhantomJS's open method hangs.
I also found a second link (written by the same person, I guess) in which the author points out that requestAnimationFrame does not work well with tweenJs, which causes freezing. PhantomJS returns a Unix timestamp, but tweenjs expects a DOMHighResTimeStamp, and so on...
The trick is to inject request-animation-frame.js (which is also provided in that article).