Nodejs: Convert Doc to PDF

Nodejs: Convert Doc to PDF - node.js

I found some repos, which do not look as they are still maintained:
https://github.com/gfloyd/node-unoconv
https://github.com/skmp/node-msoffice-pdf
...
I tried the approach with libreoffice, but the pdf output is so bad, that it is not useable (text on diff. pages etc.).
If possible I would like to avoid starting any background processes and/or saving the file on the server. Best would be solution where I can use buffers. For privacy reasons, I cannot use any external service.
doc buffer -> pdf buffer
Question:
How to convert docs to pdf in nodejs?

For those who might stumble on this question nowadays:
There is cool tool called Gotenberg — Docker-powered stateless API for converting HTML, Markdown and Office documents to PDF. It supports converting DOCs via unoconv.
And I am happen to be an author of JS/TS client for Gotenberg — gotenberg-js-client
I welcome you to use it :)
UPD:
Gotenberg has new website now — https://gotenberg.dev

While I was creating an application I need to convert the doc or docx file uploaded by a user into a pdf file for further analysis. I used npm package libreoffice-convert for this purpose. libreoffice-convert requires libreoffice to be installed on your Linux machine. Here is a sample code that I have used.
This code is written in javascript for nodejs based application.
const libre = require('libreoffice-convert');
const path = require('path');
const fs = require('fs').promises;
let lib_convert = promisify(libre.convert)
async function convert(name="myresume.docx") {
try {
let arr = name.split('.')
const enterPath = path.join(__dirname, `/public/Resume/${name}`);
const outputPath = path.join(__dirname, `/public/Resume/${arr[0]}.pdf`);
// Read file
let data = await fs.readFile(enterPath)
let done = await lib_convert(data, '.pdf', undefined)
await fs.writeFile(outputPath, done)
return { success: true, fileName: arr[0] };
} catch (err) {
console.log(err)
return { success: false }
}
}
You will get a very good quality of pdf.

To convert a document into PDF we can use Universal Office Converter (unoconv) command line utility.
It can be installed on your OS by any package manager e.g. To install it on ubuntu using apt-get
sudo apt-get install unoconv
As per documentation of unoconv
If you installed unoconv by hand, make sure you have the required LibreOffice or OpenOffice packages installed
Following example demonstrate how to invoke unoconv utility
unoconv -f pdf sample_document.py
It generates PDF document that contains content of sample_document.py
If you want to use a nodeJS program then you can invoke the command through child process
Find code below that demonstrates how to use child process for using the unoconv for creating PDF
const util = require('util');
const exec = util.promisify(require('child_process').exec);
async function createPDFExample() {
const { stdout, stderr } = await exec('unoconv -f pdf sample.js');
console.log('stdout:', stdout);
console.log('stderr:', stderr);
}
createPDFExample();

Posting a slightly modified version for excel, based upon the answer provided by #shubham singh. I tried it and it worked perfectly.
const fs = require('fs').promises;
const path = require('path');
const { promisify } = require('bluebird');
const libre = require('libreoffice-convert');
const libreConvert = promisify(libre.convert);
// get current working directory
let workDir = path.dirname(process.mainModule.filename)
// read excel file
let data = await fs.readFile(
`${workDir}/my_excel.xlsx`
);
// create pdf file from excel
let pdfFile = await libreConvert(data, '.pdf', undefined);
// write new pdf file to directory
await fs.writeFile(
`${workDir}/my_pdf.pdf`,
pdfFile
);

Docx to pdf
A library that converts docx file to pdf.
Installation:
npm install docx-pdf --save
Usage
var docxConverter = require('docx-pdf');
docxConverter('./input.docx','./output.pdf',function(err,result){
if(err){
console.log(err);
}
console.log('result'+result);
});
its basically docxConverter(inputPath,outPath,function(err,result){
if(err){
console.log(err);
}
console.log('result'+result);
});
Output should be output.pdf which will be produced on the output path your provided

Related

File download freezes and does not finish on linux

I'm using "Node.js", "express" and "SheetJS" so that an endpoint that saves the data (from an array of objects) in an XLSX file and returns the url of the file to be downloaded by another endpoint as a static file.
import crypto from 'crypto';
import * as XLSX from 'xlsx';
import path from 'path';
import * as fs from 'fs';
...
const exportToExcelFile = async (data) => {
...
const worksheet = XLSX.utils.json_to_sheet(data);
const workbook = XLSX.utils.book_new();
XLSX.utils.book_append_sheet(workbook, worksheet, 'Data');
const buf = XLSX.write(workbook, { bookType: 'xlsx', type: 'buffer' });
fs.writeFileSync(resolvedFilename, buf);
return `${process.env.APP_URL}/public/downloads/${date}/${filename}`;
}
In Windows, the file generation and download work perfectly, however, when the application is running on the linux server the file is generated, however, the download freezes and does not finish.
[Download congelado][1]
If I change the 'buffer' type to 'binary', the download works on windows and linux, however, in both when trying to open the file, Excel shows a corrupted file message.
const buf = XLSX.write(workbook, { bookType: 'xlsx', type: 'binary' });
Any ideas or suggestions of what it could be?

Does it help if you close the file after writing?
const fs = require("fs/promises");
(async function() {
var file = await fs.open(resolvedFilename, "w");
await file.write(buf);
await file.close();
})();

It works just fine, you can check your code live here: https://glitch.com/edit/#!/pentagonal-sepia-nutmeg
All I do is just copy/paste your code into glitch to see if it works. And it does.
So you should check your browser network tab, see if it reports any error. Also, take advantage of some tools such as curl with -v option to download the file, it will print all information about the download request you make

use .sfz soundfonts to render audio with WebMScore

I'm using WebMScore to render audio of music scores (it's a fork of MuseScore that runs in the browser or node).
I can successfully load my own, local .sf2 or .sf3 files, however
Trying to load an .sfz soundfont throws error 15424120. (And error.message is simply 'undefined'.)
Unlike .sf2 and .sf3, which contain the sounds and instructions in a single file, the .sfz format is just a text instruction file that refers to a separate folder of samples.
The reason I need the .sfz is that I need to be able to edit the .sfz file textually and programatically without an intervening Soundfont generator.
Is there a way to use .sfz's? Do I need to specify Zerberus (the Musescore .sfz player)? Do I need a different file structure? Please see below.
My environment is node js, with the following test case and file structure:
File Structure
Project Folder
app.js
testScore.mscz
mySFZ.sfz
samples
one.wav
two.wav
etc.wav
Test Case (Works with .sf3 , errors with .sfz)
const WebMscore = require('webmscore');
const fs = require('fs');
// free example scores available at https://musescore.com/openscore/scores
const name = 'testScore.mscz';
const exportedPrefix = 'exported';
const filedata = fs.readFileSync(`./${name}`);
WebMscore.ready.then(async () => {
const score = await WebMscore.load('mscz', filedata, [], false);
await score.setSoundFont(fs.readFileSync('./mySFZ.sfz'));
try { fs.writeFileSync(`./${exportedPrefix}.mp3`, await score.saveAudio('mp3')); }
catch (err) { console.log(err) }
score.destroy();
});

Playwright: Upload files from non-input element that cannot be used page.setInputFiles?

I'm working on uploading files through non-input HTML tag on Playwright.
For example, you can use setInputFiles like this, and this works:
await page.setInputFiles('input[type="file"]', './headphone.png')
But apparently setInputFiles only works for input element, something like this will be error:
await page.setInputFiles('label.ImageUpload__label ', './headphone.png');
The HTML I'm working on is like this:
<div id="ImageUpload" class="ImageUpload u-marginB10">
<label class="ImageUpload__label js-dragdrop-area" for="selectFileMultiple">
<span class="ImageUpload__hide">drag and drop or select files</span>
<span class="ImageUpload__text"><span class="js-dragdrop-num">10</span>up to</span>
</label>
</div>
So, is it possible to upload files to such HTML elements with Playwright?

NodeJs: https://playwright.dev/python/docs/api/class-filechooser
page.on("filechooser", (fileChooser: FileChooser) => {
fileChooser.setFiles(["/path/to/a/file"]);
})
Python: https://playwright.dev/python/docs/api/class-filechooser/
with page.expect_file_chooser() as fc_info:
page.click("upload")
file_chooser = fc_info.value
file_chooser.set_files("/path/to/a/file")
Java: https://playwright.dev/java/docs/api/class-filechooser
FileChooser fileChooser = page.waitForFileChooser(() ->
page.click("upload"));
fileChooser.setFiles(Paths.get("myfile.pdf"));

To upload a file using Playwright use setInputFiles(selector, files[, options]) function. This method takes the selector of the input element and the path to the file you want to upload.
The files parameter value can be a relative path (relative to the current working directory) or an absolute path. I strongly suggest that you use an absolute path to ensure predictable behavior.
test("upload a file", async ({ page }) => {
console.log(resolve(__dirname, "bar.png"));
await page.goto("http://127.0.0.1:8080/upload-file/");
await page.locator('input[name="foo"]').click();
await page
.locator('input[name="foo"]')
.setInputFiles(resolve(__dirname, "bar.png"));
await page.click("input[type=submit]");
});
Alternatively, you can read the file into a Buffer and dispatch drop event onto the target element with DataTransfer payload. This is useful when you are testing a drag-and-drop area:
const dataTransfer = await page.evaluateHandle(
async ({ fileHex, localFileName, localFileType }) => {
const dataTransfer = new DataTransfer();
dataTransfer.items.add(
new File([fileHex], localFileName, { type: localFileType })
);
return dataTransfer;
},
{
fileHex: (await readFile(resolve(__dirname, "bar.png"))).toString("hex"),
localFileName: fileName,
localFileType: fileType,
}
);
await page.dispatchEvent("#drop_zone", "drop", { dataTransfer });
await expect(page.locator("text=bar.png")).toBeVisible();
You can further simplify the above code using createDataTransfer utility from playwright-utilities:
const dataTransfer = await createDataTransfer({
page,
filePath: resolve(__dirname, "bar.png"),
fileName: "bar.png",
fileType: "image/png",
});
await page.dispatchEvent("#drop_zone", "drop", { dataTransfer });
await expect(page.locator("text=bar.png")).toBeVisible();
Try this example locally by cloning the Playwright Playground repository:
git clone --branch test/upload-file https://punkpeye#github.com/punkpeye/playwright-playground.git
cd playwright-playground
npm install
npx playwright test tests/upload-file

Found another alternative to upload that worked in my case. We create a buffer from memory and drag and drop the file to the upload button.
// Read your file into a buffer.
const buffer = readFileSync('file.pdf');
// Create the DataTransfer and File
const dataTransfer = await scope.page.evaluateHandle((data) => {
const dt = new DataTransfer();
// Convert the buffer to a hex array
const file = new File([data.toString('hex')], 'file.pdf', { type: 'application/pdf' });
dt.items.add(file);
return dt;
}, buffer);
// Now dispatch
await page.dispatchEvent('YOUR_TARGET_SELECTOR', 'drop', { dataTransfer });
if using typescript, add this to the top of the file:
import {readFileSync} from 'fs';
Github issue: https://github.com/microsoft/playwright/issues/10667#issuecomment-998397241

I had the same issue so I decided to use AutoIt to upload files with Playwright.
AutoIt v3 is a freeware BASIC-like scripting language designed for automating Windows GUI and general scripting.
I used AutoIT to handle the Windows File Upload dialog, which cannot be handled using Playwright.
Creating Script
Download AutoIt: https://www.autoitscript.com/site/autoit/downloads/
Open SciTE Script Editor and type the followng:
WinWaitActive("Choose files")
Send("C:\ChromeDriver\text.txt")
Send("{ENTER}")
If it does not work, change Choose files to whatever title is on the top left of the upload dialog.
Click save and name it something like upload.au3 and save it in the root directory of your test.
Example of Save Location
Right click your newly created file and click Compile Script
Executing the script in your test
Create execFile function of child process modules in node.js. Reference: https://nodejs.org/api/child_process.html#child_process_child_process_execfile_file_args_options_callback
Add this to the top of your .spec.ts test file:
var exec = require('child_process').execFile;
var upload_script = function(){
exec('upload.exe', function(err, data) {
console.log(err)
});
}
Open the upload dialog, then call the function in your test
// Click Browse
await page.locator('#browse').click();
// Execute Upload Script
upload_script();
You have to run your test headed or it will not work:
npx playwright test --headed

Redirect Readable object stdout process to file in node

I use an NPM library to parse markdown to HTML like this:
var Markdown = require('markdown-to-html').Markdown;
var md = new Markdown();
...
md.render('./test', opts, function(err) {
md.pipe(process.stdout)
});
This outputs the result to my terminal as intended.
However, I need the result inside the execution of my node program. I thought about writing the output stream to file and then reading it in at a later time but I can't figure out a way to write the output to a file instead.
I tried to play around var file = fs.createWriteStream('./test.html'); but the node.js streams rather give me headaches than results.
I've also looked into the library's repo and Markdown inherits from Readable via util like this:
var util = require('util');
var Readable = require('stream').Readable;
util.inherits(Markdown, Readable);
Any resources or advice would be highly appreciated. (I would also take another library for parsing the markdown, but this gave me the best results so far)

Actually creating a writable file-stream and piping the markdown to this stream should work just fine. Try it with:
const writeStream = fs.createWriteStream('./output.html');
md.render('./test', opts, function(err) {
md.pipe(writeStream)
});
// in case of errors you should handle them
writeStream.on('error', function (err) {
console.log(err);
});

Generating a json for a icon cheatsheet

I'm trying to generate a json file containing the filenames of all the files in a certain directory. I need this to create a cheatsheet for icons.
Currently I'm trying to run a script locally via terminal, to generate the json. That json will be the input for a react component that will display icons. That component works, the create json script doesn't.
Code for generating the json
const fs = require('fs');
const path = require('path');
/**
* Create JSON file
*/
const CreateJson = () => {
const files = [];
const dir = '../icons';
fs.readdirSync(dir).forEach(filename => {
const name = path.parse(filename);
const filepath = path.resolve(dir, filename);
const stat = fs.statSync(filepath);
const isFile = stat.isFile();
if (isFile) files.push({ name });
});
const data = JSON.stringify(files, null, 2);
fs.writeFileSync('../Icons.json', data);
};
module.exports = CreateJson;
I run it in terminal using
"create:json": "NODE_ENV=build node ./scripts/CreateJson.js"
I expect a json file to be created/overridden. But terminal returns:
$ NODE_ENV=build node ./scripts/CreateJson.js
✨ Done in 0.16s.
Any pointers?

You are creating a function CreateJson and exporting it, but you are actually never calling it.
You can get rid of the module.exports and replace it with CreateJson().
When you'll execute the file with node, it will see the function declaration, and a call to it, whereas with your current code there is no call.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Nodejs: Convert Doc to PDF - node.js

Related

File download freezes and does not finish on linux

use .sfz soundfonts to render audio with WebMScore

Playwright: Upload files from non-input element that cannot be used page.setInputFiles?

Redirect Readable object stdout process to file in node

Generating a json for a icon cheatsheet

Categories

Resources