Create tar file from files in a particular directory - node.js

I need to use Node.js to create a tar file whose contents aren't wrapped in a parent directory.
For example, here is the file system:
/tmp/mydir
/tmp/mydir/Dockerfile
/tmp/mydir/anotherfile
What I'm looking to do is the equivalent to this:
cd /tmp/mydir
tar -cvf archive.tar *
So, when I extract archive.tar, Dockerfile will end up in the same directory where I execute the command.
I've tried tar.gz and a few other modules, but all the examples compress an entire directory, not just the files inside it.
I'm doing this so I can utilize the Docker REST API to send builds.

With the modern node-tar module you can create a .tar file like this:
const tar = require('tar');

tar.create(
  { file: 'archive.tar' },
  ['/tmp/mydir']
).then(_ => { /* tarball has been created */ });
The tar.gz module referenced in other answers is deprecated.
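To get the exact behavior asked for in the question (extracting archive.tar should place Dockerfile and anotherfile directly in the current directory, with no mydir/ prefix), a minimal sketch using node-tar's cwd option could look like the following; building the file list with fs.readdirSync and writing the archive to /tmp/archive.tar are assumptions for illustration, not part of the original answer.

const fs = require('fs');
const tar = require('tar');

// Archive only the *contents* of /tmp/mydir, recording entries relative
// to that directory so no parent folder ends up in the tarball.
tar.create(
  {
    file: '/tmp/archive.tar',   // assumed output path
    cwd: '/tmp/mydir'           // entries are stored relative to cwd
  },
  fs.readdirSync('/tmp/mydir')  // e.g. ['Dockerfile', 'anotherfile']
).then(() => {
  // /tmp/archive.tar now extracts its entries straight into the extraction directory
});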

Use the tar.gz module. Here is some sample code:
var targz = require('tar.gz');
var compress = new targz().compress('/path/to/compress', '/path/to/store.tar.gz',
  function(err) {
    if (err)
      console.log(err);

    console.log('The compression has ended!');
  });
For more options, visit the documentation page.
This package is now deprecated. Check the answer provided by @Kelin.

The second argument to the constructor is passed on as properties to the tar module.
var TarGz = require('tar.gz');
var compressor = new TarGz({}, {fromBase: true});
This will create the archive without the top-level directory.
Edit: This was undocumented in node-tar.gz, but the pull request has now been merged: https://github.com/alanhoff/node-tar.gz#tar-options
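Combining this with the compress call from the previous answer, a minimal sketch (the paths are placeholders, not from the original answers) might look like:

var TarGz = require('tar.gz');

// Pass {fromBase: true} as the tar options so entries are stored relative
// to the source directory instead of nested under a top-level folder.
var compressor = new TarGz({}, { fromBase: true });

compressor.compress('/tmp/mydir', '/tmp/archive.tar.gz', function (err) {
  if (err) return console.log(err);
  console.log('The compression has ended!');
});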

Related

How will files in a Lambda layer be copied to the /opt/bin directory?

I am working on a PDF-to-image conversion project in AWS Lambda, but ran into some issues because the AWS Lambda environment doesn't include the relevant binaries, such as ImageMagick. Following some links and StackOverflow questions, I put the relevant binaries into a layer; for this job I had to use compiled Ghostscript binaries.
The layer zip contains files like this:
GhostScript.zip > bin > gs
I have a wrapper library called pdf2png which executes a child process that does the conversion. The command this child process uses is the above-mentioned gs utility, but my issue is that the path I specified for the utility is wrong, and it throws an error saying:
Error: spawn /opt/bin/bin/gs ENOENT
So what I want to know is: how are the Lambda layer files copied to the /opt/bin/ directory, and how should I change the path?
Corresponding code:
var gs = require('gs');   // assuming the node-gs wrapper around Ghostscript

gs()
  .batch()
  .nopause()
  .option('-r' + options.density)
  // .option('-dDownScaleFactor=2')
  .option('-dFirstPage=' + page)
  .option('-dLastPage=' + page)
  .executablePath('/opt/bin/bin/gs')
  .device('png16m')
  .output(output)
  .input(filepath)
  .exec(function (err, stdout, stderr) {
    // ...
  });
The executable path in my case had to be like this:
.executablePath('/opt/bin/gs')
Lambda extracts the files from the bin folder inside the layer directly into the /opt/bin/ folder.
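If you're unsure where a layer's files actually land, a quick check inside the handler can confirm the layout; this snippet is purely illustrative:

const fs = require('fs');

// Layer zips are extracted under /opt, so a zip whose top level contains
// bin/gs ends up at /opt/bin/gs (not /opt/bin/bin/gs).
console.log(fs.readdirSync('/opt'));      // e.g. [ 'bin' ]
console.log(fs.readdirSync('/opt/bin'));  // e.g. [ 'gs' ]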

AWS Lambda function - convert PDF to Image

I am developing an application where users can upload drawings in PDF format. Uploaded files are stored on S3. After uploading, the files have to be converted to images. For this purpose I have created a Lambda function which downloads the file from S3 to the /tmp folder in the Lambda execution environment, and then I call the 'convert' command from ImageMagick.
convert sourceFile.pdf targetFile.png
The Lambda runtime environment is nodejs 4.3. Memory is set to 128 MB and the timeout to 30 seconds.
Now the problem is that some files are converted successfully while others fail with the following error:
{ [Error: Command failed: /bin/sh -c convert /tmp/sourceFile.pdf /tmp/targetFile.png
convert: `%s' (%d) "gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT
  -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pngalpha"
  -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r72x72"
  "-sOutputFile=/tmp/magick-QRH6nVLV--0000001" "-f/tmp/magick-B610L5uo"
  "-f/tmp/magick-tIe1MjeR" @ error/utility.c/SystemCommand/1890.
convert: Postscript delegate failed `/tmp/sourceFile.pdf': No such file or directory
  @ error/pdf.c/ReadPDFImage/678.
convert: no images defined `/tmp/targetFile.png'
  @ error/convert.c/ConvertImageCommand/3046. ]
  killed: false, code: 1, signal: null,
  cmd: '/bin/sh -c convert /tmp/sourceFile.pdf /tmp/targetFile.png' }
At first I did not understand why this happens, so I tried to convert the problematic files on my local Ubuntu machine with the same command. This is the output from the terminal:
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** This file had errors that were repaired or ignored.
**** The file was produced by:
**** >>>> Mac OS X 10.10.5 Quartz PDFContext <<<<
**** Please notify the author of the software that produced this
**** file that it does not conform to Adobe's published PDF
**** specification.
So the message is very clear, but the file gets converted to PNG anyway. If I do convert source.pdf target.pdf and after that convert target.pdf image.png, the file is repaired and converted without any errors. This doesn't work on Lambda.
Since the same thing works in one environment but not the other, my best guess is that the Ghostscript version is the problem. The version installed on the AMI is 8.70; on my local machine the Ghostscript version is 9.18.
My questions are:
1. Is the Ghostscript version the problem? Is this a bug in the older version of Ghostscript? If not, how can I tell Ghostscript (with or without ImageMagick) to repair or ignore errors like it does in my local environment?
2. If the old version is the problem, is it possible to build Ghostscript from source, create a Node.js module, and then use that version of Ghostscript instead of the one that is installed?
3. Is there an easier way to convert PDF to image without using ImageMagick and Ghostscript?
UPDATE
Relevant part of lambda code:
var exec = require('child_process').exec;
var AWS = require('aws-sdk');
var fs = require('fs');
...
var localSourceFile = '/tmp/sourceFile.pdf';
var localTargetFile = '/tmp/targetFile.png';

var writeStream = fs.createWriteStream(localSourceFile);
writeStream.write(body);
writeStream.end();

writeStream.on('error', function (err) {
  console.log("Error writing data from s3 to tmp folder.");
  context.fail(err);
});

writeStream.on('finish', function () {
  var cmd = 'convert ' + localSourceFile + ' ' + localTargetFile;
  exec(cmd, function (err, stdout, stderr) {
    if (err) {
      console.log("Error executing convert command.");
      context.fail(err);
    }
    if (stderr) {
      console.log("Command executed successfully but returned error.");
      context.fail(stderr);
    } else {
      //file converted successfully - do something...
    }
  });
});
You can find a compiled version of Ghostscript for Lambda in the following repository. Add its files to the zip file that you upload as the source code to AWS Lambda:
https://github.com/sina-masnadi/lambda-ghostscript
There is also an npm package for calling Ghostscript functions:
https://github.com/sina-masnadi/node-gs
After copying the compiled Ghostscript files into your project and adding the npm package, you can use the executablePath('path to ghostscript') function to point the package at the compiled Ghostscript files you added earlier.
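A minimal sketch of what that wiring might look like inside the handler, reusing the chained node-gs calls shown in the previous question; the bin/gs path, the png16m device, and the file names are assumptions for illustration:

var gs = require('gs');  // the node-gs package linked above

exports.handler = function (event, context) {
  gs()
    .batch()
    .nopause()
    .device('png16m')            // 24-bit colour PNG output
    .executablePath('bin/gs')    // compiled Ghostscript bundled with the function (assumed path)
    .input('/tmp/sourceFile.pdf')
    .output('/tmp/targetFile.png')
    .exec(function (err, stdout, stderr) {
      if (err) return context.fail(err);
      context.succeed('converted');
    });
};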
It's almost certainly a bug, or perhaps a limitation, of the older version of Ghostscript.
Many PDF producers create PDF files which do not conform to the specification, yet open without complaint in Adobe Acrobat. Ghostscript endeavours to do the same, but obviously we can't know what Acrobat is going to allow, so we are continually chasing this nebulous target. (FWIW, that warning indicates a genuinely out-of-spec PDF file.)
There's nothing you can do with the old version other than replace it.
Yes, you can build Ghostscript from source. I have no idea about a Node.js module, and I'm not sure why that's relevant.
There are numerous other applications which will render a PDF file; MuPDF is another one I know of. And of course you can use Ghostscript directly without using ImageMagick. Of course, if you can load another application, then you should simply be able to replace your Ghostscript installation too.
The version of Ghostscript on AWS is an old version with known bugs. We can get around this by uploading an x64 Ghostscript binary compiled specifically for Linux, and then attaching it using the new AWS Lambda layers. I have written a node function that does just this here:
https://github.com/rcastoro/PDFImagine
Make sure you have that Ghostscript layer for your lambda, however!
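For reference, a hedged sketch of invoking such a layer-provided binary directly with child_process; the /opt/bin/gs path assumes the layer zip has gs under a top-level bin/ folder, and the flags are standard Ghostscript options:

const { execFile } = require('child_process');

// Render page 1 of a PDF to PNG using the Ghostscript binary from the layer.
execFile('/opt/bin/gs', [
  '-q', '-dBATCH', '-dNOPAUSE', '-dSAFER',
  '-sDEVICE=png16m',
  '-r150',
  '-dFirstPage=1', '-dLastPage=1',
  '-sOutputFile=/tmp/targetFile.png',
  '/tmp/sourceFile.pdf'
], function (err, stdout, stderr) {
  if (err) return console.error(err);
  console.log('converted');
});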

Adm zip write zip buffer to ExpressJS response

Hi, I'm trying to send a zip buffer made by the adm-zip npm module in my response for client download.
I manage to download the zip file but am unable to expand it. OS X says "Error 2 - No such file or directory".
The downloaded zip file has the right size, I believe, and is sent over this way:
var zip = new AdmZip();
// added files with zip.addFile(...)
var zipFile = zip.toBuffer();
res.contentType('zip');
res.write(zipFile);
res.end();
Any idea what could be wrong?
Thanks
Apparently the problem comes from the adm-zip code base and the fix hasn't been merged yet:
https://github.com/cthackers/adm-zip/compare/master...mygoare:unzipErr
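For completeness, a hedged version of the download route with explicit headers; the route, file name, and example entry are placeholders, and this does not fix the adm-zip issue linked above, it just makes the response itself unambiguous:

var AdmZip = require('adm-zip');

app.get('/download', function (req, res) {
  var zip = new AdmZip();
  zip.addFile('hello.txt', Buffer.from('hello world'));  // example entry

  var zipBuffer = zip.toBuffer();
  res.set({
    'Content-Type': 'application/zip',
    'Content-Disposition': 'attachment; filename="archive.zip"',
    'Content-Length': zipBuffer.length
  });
  res.end(zipBuffer);
});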

NodeJs: pipe tar file from http response to tar.Extract?

Following task:
We have a nodejs client (daemon) downloading audio data which is contained in a tar file (NOT tar.gz - it really is uncompressed!) from Amazon S3.
At the moment, we glue the chunks together in the 'data' handler of the response, save the whole buffer to disk as a file, and then call tar.Extract(inPath, outPath) on the newly created file.
I'd like to skip writing the data to disk and instead pass the data from the response directly to tar.Extract().
This is my handler code:
var readResponseData = function (response) {
  response.setEncoding('binary');
  response.pipe(tar.Extract({ path: '/tmp/testyeah' }));
  ....
  ....
I always get "Error: Invalid tar file"
I also tried, without success, the suggestions from this page (https://groups.google.com/forum/#!topic/nodejs/A7jz6b9daZc), although those should apply to compressed tar files and not the uncompressed ones that we use.
Any suggestions?
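A minimal sketch of the intended streaming approach, using the current node-tar API (tar.x) rather than the older tar.Extract, and assuming the response really is an uncompressed tar stream; note that calling setEncoding on the response converts the chunks to strings, which is a common way to end up with a corrupted ("invalid") tar. The URL is a placeholder:

const https = require('https');
const tar = require('tar');   // modern node-tar

https.get('https://example.com/audio.tar', (response) => {
  // Keep the raw Buffer chunks; do NOT call response.setEncoding('binary').
  const extract = tar.x({ cwd: '/tmp/testyeah' });
  extract.on('error', (err) => console.error('extract failed:', err));
  response.pipe(extract);
});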

Customize CFEngine3 temporary downloaded files location

I am facing an issue while configuring something with CFEngine 3.5. I have created a policy to install some packages from source: it downloads tarballs from a URL, untars them, and then builds them with make and make install. Everything works fine, except that when it downloads the tarballs it keeps them in "/etc"; I want CFEngine to put these files in /tmp.
Is there any way to customize this default behavior of CFEngine so that all temporarily downloaded files are kept in "/tmp" instead of "/etc"?
Here is the policy snippet:
bundle agent install
{
  vars:
    "packages" slist => {
      "Algorithm-Diff-1.1902",
      "Apache-DB-0.13",
      "Apache-DBI-1.06",
      "Apache-Session-1.83",
      "Apache-SessionX-2.01",
      "AppConfig-1.65",
      "Archive-Tar-1.32",
    };

  commands:
    "/usr/bin/wget http://10.X.X.X/downloads/perl-modules/$(packages).tar.gz;
     /usr/bin/gunzip $(packages).tar.gz;
     tar -xf $(packages).tar;
     cd $(packages);
     /usr/bin/perl Makefile.PL;
     /usr/bin/make;
     /usr/bin/make install;"
      contain => standard,
      classes => satisfied(canonify("$(packages)-installed"));
}

body contain standard
{
  useshell => "true";
  exec_owner => "root";
}
Thanks in advance.
You can add the directory in which the commands should be executed to the contain body, like this:
body contain standard
{
  useshell => "true";
  exec_owner => "root";
  chdir => "/tmp";
}
Please note there are already a few contain bodies in the standard library (lib/3.5/commands.cf); maybe one of those can be used so you don't have to write your own. Note that CFEngine already executes as root, so exec_owner => "root" is not strictly necessary.
