How to embed an MP4 inside a PDF? - linux

I am a happy user of img2pdf. This tool does the minimal amount of work to put a series of JPEG 2000/JPEG/PNG images into a PDF "envelope". However, I am now faced with a new challenge: embedding an MP4 file into a PDF "envelope".
I see that commercial tools can do it, as seen at:
Add audio, video, and interactive objects to PDFs
Here is one such sample PDF file (no Flash required on windows in this sample):
https://gitlab.com/agrahn/media9/-/issues/9#note_345903962
https://gitlab.com/agrahn/media9/uploads/90fddd777e0ec514c39c924cd8d3b688/video_test.pdf
It seems to have been introduced in ISO 32000-1 (PDF 1.7 Extension Level 5)
I am looking for a solution which will use the Rich Media annotation inside the PDF stream.
There are dozens of duplicate questions on superuser/stackoverflow, which all pretty much refer to the imagemagick/convert command-line tool. But in my case, convert expands the images into a multi-page PDF (which is not my desired behavior):
$ convert input.mp4 output.pdf
$ pdfinfo output.pdf
Title: out
Producer: https://imagemagick.org
CreationDate: Wed Aug 19 15:38:01 2020 CEST
ModDate: Wed Aug 19 15:38:01 2020 CEST
Tagged: no
UserProperties: no
Suspects: no
Form: none
JavaScript: no
Pages: 1601
Encrypted: no
Page size: 352 x 288 pts
Page rot: 0
File size: 534407296 bytes
Optimized: no
PDF version: 1.3
with:
$ convert --version
Version: ImageMagick 6.9.10-23 Q16 x86_64 20190101 https://imagemagick.org
Copyright: © 1999-2019 ImageMagick Studio LLC
License: https://imagemagick.org/script/license.php
Features: Cipher DPC Modules OpenMP
Delegates (built-in): bzlib djvu fftw fontconfig freetype jbig jng jpeg lcms lqr ltdl lzma openexr pangocairo png tiff webp wmf x xml zlib
and
$ file input.mp4
input.mp4: ISO Media, MP4 Base Media v1 [IS0 14496-12:2003]
$ ffprobe -v quiet -print_format json -show_streams input.mp4 | grep codec_long_name
"codec_long_name": "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
How would you embed an MP4 inside a PDF now that Flash support is being removed from Acrobat (Dec 2020)? The solution should work on the command line (Linux-based system).

It was common, and is still possible, to use a Rich Media Annotation to include 3D animations or media files within a PDF. Generally you need top-end editors such as Acrobat Pro, but there are a few LaTeX packages that sometimes work, so the PDF can be compiled with pdfLaTeX from the Linux command line. For an outdated example app see http://www.acrotex.net/blog/?cat=22, for an Overleaf example see https://www.overleaf.com/project/5ff76fa5686edd3e034cfedb, and for a prior Adobe reply (which did not work for a while) see
> Embedded media, as well as referenced media outside a PDF file, may be played with a variety of player software. (In some situations, the player software may be the conforming reader itself.)
[Later comments] Adobe shot themselves in the foot with the poor closure of their buggy, insecure SWF/Flash, and only improved some rich media handling in more recent Windows Reader versions (Acrobat DC 21.001.20135 and later!), having turned their back on maintaining Portable Document Format readers for Linux/Mac. What is needed is a push to use HTMLZ as the ideal rich media format, but that would need Google Chrome to run with the bouton (pun).
It's NOT recommended except for 3D PDF, as most methods require manually overriding security measures meant to STOP runtime applications within a PDF.
SWF/Flash is no longer acceptable for that reason. Microsoft Edge (pre-Chromium) made an attempt at embedding PDF links to YouTube videos, but AFAIK that was abandoned. Thus RMA PDFs cannot run in the more common browsers; they need specialist viewers. The best viewer for Linux is possibly Okular, but I cannot run that file in my version.
When opening a 3D file you need to jump through multiple hoops to allow it, and floating video may not even run. Your links lead to this media example, which can run inline (seen above) or, better, pop out with system application controls.
However, in some viewers it needs to be manually exported from the PDF archive as an attachment and run in a system media player, and for browser presentations it can be a local hyperlink like this.
Using Okular on Windows, it does not ask before running content, but that could be because it found no suitable player; however, it allows me to link to a local file and run that in the system viewer.
For everyday presentations it's easier to include the media file in a zip with the PDF or PowerPoint presentation, for running locally from the PDF.
[Updated PoC for building using alternative raw mp4's]
It is possible to write a complex PDF as text on the console (here is a 203-line example in Windows CMD). I would not normally suggest that as an answer for a structure as complex as Rich Media, but a simpler all-platforms approach is possible with a small nip and tuck of header and trailer, plus a variable MP4 body.
Source modified example spliced as 3 parts https://transfer.sh/8kQIbB/all.pdf
Different body with a small amount of command line math https://transfer.sh/Uqmv6t/all2.pdf
Method
Store the text header as a preset template, append the MP4 unchanged, and finally add the PDF trailer with values adjusted from the file lengths. The script is too long to describe here: it is now 216 lines (with comments and notes) and works well for PDFs in xChange, either as a drag and drop of any 1-9 MB .mp4 (I need to raise that math limit to 1999 MB), via "Send to", or as a CLI command for single files. The programming can be done simply, as I did, using CMD or any OS script; on Windows the result is generated by scripting around roughly this:
copy /b head2.bin + pixels.mp4 + tail2.bin all2.pdf /b
The secondary part is how to use text to overlay a cover image in as few lines as possible.
That is now scripted to handle any video.mp4 of variable length, so I simply run it or drag and drop onto it. On the console, a dialog can show progress and feedback and collect inputs such as names or dimensions, via mp4toPDF video.mp4 [output.pdf]. The next step is to let the user add the caption (and perhaps other scalars) as variable argument(s).
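The header + payload + trailer idea can be sketched in Python. This is only an illustration under stated assumptions, not the author's 216-line script: the answer's head2.bin and tail2.bin templates are not shown, so the object layout below is hypothetical, and it stores the MP4 as a plain embedded file (attachment) rather than as a Rich Media annotation. What it does show is the "command line math": the stream /Length and every cross-reference offset are computed from the byte lengths of the pieces.

```python
def build_pdf(payload: bytes, name: str = "video.mp4") -> bytes:
    """Return a one-page PDF that carries `payload` as an embedded file.

    Assumption: `name` is ASCII. The object layout is illustrative only.
    """
    # Escape the characters that are special inside a PDF literal string.
    safe = (name.encode("ascii")
            .replace(b"\\", b"\\\\").replace(b"(", b"\\(").replace(b")", b"\\)"))
    bodies = [
        # 1: catalog, with a name tree that lists the attachment
        b"<< /Type /Catalog /Pages 2 0 R /Names << /EmbeddedFiles "
        b"<< /Names [(" + safe + b") 4 0 R] >> >> >>",
        # 2: page tree
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        # 3: a single blank page (352 x 288, the size reported in the question)
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 352 288] >>",
        # 4: file specification pointing at the stream
        b"<< /Type /Filespec /F (" + safe + b") /UF (" + safe + b") "
        b"/EF << /F 5 0 R >> >>",
        # 5: the MP4 bytes, appended unchanged; /Length comes from len(payload)
        b"<< /Type /EmbeddedFile /Subtype /video#2Fmp4 /Length "
        + str(len(payload)).encode("ascii")
        + b" >>\nstream\n" + payload + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.7\n%\xe2\xe3\xcf\xd3\n")  # header + binary marker
    offsets = []
    for number, body in enumerate(bodies, start=1):
        offsets.append(len(out))  # byte offset where "N 0 obj" starts
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Trailer part: xref entries are exactly 20 bytes each.
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(bodies) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += (b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n"
            % (len(bodies) + 1, xref_pos))
    return bytes(out)
```

A usage sketch would be `open("all.pdf", "wb").write(build_pdf(open("pixels.mp4", "rb").read(), "pixels.mp4"))`. A real Rich Media annotation needs considerably more dictionary structure than this, which is why the answer keeps it in a prepared header template.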
The number of PDF viewers supporting Rich Media is dwindling. I can't use Acrobat or Edge either, so it seems I need to use Tracker (below), which is much more versatile and has many other advantages but is Windows only,
or the cross-platform Foxit. However, on Windows Foxit is far inferior, with no resize, search bar, or other floating controls.
So currently I can add, via script, an mp4, wmv, or other video under 2 GB and run it in either editor/viewer. However, the field (locked aspect in Foxit) has no cover (plain white). If I move an image over the top it seems to block the action, but under the white it is unseen, so that transparency issue still needs resolving. For now I have settled on stamping a bigger white area to keep the run button visible, but auto stamping has some issues: it affects the run button even when the two do not overlap.
Breaking news, OMG, no idea which way this will help anyone, other than the "Revenue Men":
Microsoft Edge’s new PDF viewer is powered by Adobe, and it won’t let you forget that. In an announcement on its website, Microsoft says it’s replacing Edge’s existing PDF viewer with one from Adobe Acrobat, which includes some “advanced” features that are available if you’re willing to pay for them.

Video controls are disabled in Adobe Acrobat and are not supported by web browsers. If you want to add video with video controls, you can use Adobe Acrobat Pro DC, and you can automate it using the Action Wizard.
Check this out https://helpx.adobe.com/acrobat/using/action-wizard-acrobat-pro.html

You can make a python script to embed your video.
Thanks to the PyPDF2 API, you can use the addAttachment method to embed your video.
https://github.com/mstamy2/PyPDF2
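A minimal sketch of that approach follows. It assumes the current successor of PyPDF2, the pypdf package, where the method is spelled `add_attachment` (older PyPDF2 releases used `PdfFileWriter.addAttachment`); the helper names `attach_file` and `main` are made up for illustration. Note that this produces a file attachment, which viewers list in their attachments panel, not an inline-playing Rich Media annotation.

```python
import os
import sys


def attach_file(writer, path):
    """Read `path` and add it to `writer` as a document-level attachment.

    `writer` is expected to offer add_attachment(filename, data), as
    pypdf's PdfWriter does. Returns the number of bytes attached.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    writer.add_attachment(os.path.basename(path), data)
    return len(data)


def main(pdf_in, video, pdf_out):
    # Third-party dependency, imported lazily: pip install pypdf
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(pdf_in)
    writer = PdfWriter()
    for page in reader.pages:  # copy the existing pages unchanged
        writer.add_page(page)
    attach_file(writer, video)
    with open(pdf_out, "wb") as handle:
        writer.write(handle)


if __name__ == "__main__" and len(sys.argv) == 4:
    # usage: python attach.py input.pdf input.mp4 output.pdf
    main(sys.argv[1], sys.argv[2], sys.argv[3])
```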

Related

GhostScript - ImageMagick converts pdf to image to odd letters when converting Microsoft Print to PDF files

NOTICE: Watch updates at bottom.
I am building an API which is supposed to convert PDF to base64 images (it doesn't matter which type: jpg, jpeg, png...).
The API is built with NodeJS on CentOS 7.5 x64.
I have searched all over the web for npm packages which convert PDF to images; most of them use ImageMagick and Ghostscript (the others don't seem to work). These packages work well in code, but the problem starts when Ghostscript does its job.
For example, a simple pdf page with text will look like this after conversion:
This is the output in shell:
**** Warning: can't process font stream, loading font by the name.
**** This file had errors that were repaired or ignored.
**** The file was produced by:
**** >>>> Microsoft: Print To PDF <<<<
**** Please notify the author of the software that produced this
**** file that it does not conform to Adobe's published PDF
**** specification.
I have tried converting the images with shell commands and ended up with the same output.
Thanks in advance.
UPDATE:
Converting a sample PDF file which was probably not printed to PDF by Microsoft worked fine; maybe this is the problem?
UPDATE 2:
After converting a few more PDFs, it turns out that only Microsoft Print to PDF files cause this problem.
This was reported as a bug to the Ghostscript Bugzilla here
As can be seen from the thread, this is due to using an old version of Ghostscript, and has been fixed at some point in the past. So the problem is due to using old (in this case more than 5 years old) software.

Remove images (with transparency/alpha channel) from PDF

How to remove images with alpha channel (transparency) in a PDF file?
I need to remove all images with transparency from a PDF file because it needs to be optimized with pdf2ps and ps2pdf (to reduce file size). PostScript doesn't work properly when the PDF contains images with transparency, and the PDF will be converted to one big image.
I have not managed to reproduce your problem.
However, I applied the same treatment to compress my PDF, except that I used pdftops instead of pdf2ps.
I hope it helps.
Sorry for my English (translate.google)
Clark,
It sounds like www.pstill.com will do everything you need and more in one tool. There is a Linux command line version available for a very reasonable price. I have used the tool on a few different PDFs for different reasons and it has always worked as advertised.
From their website.
Putting the 'Portable' back in PDF - PDF to PDF Transcoding
Your PDF cannot be printed on some printers or processed with some applications? PStill can sanitize, simplify, reprocess, flatten transparency and recompress PDF-Files, this process also known as 'transcoding' create a new PDF that has better compatibility, is often smaller in file size, can be optional encrypted/secured and contain only a uniform set of font types. Fonts can be normalized to plain PostScript Type 1 formats, can be subsetted, missing fonts included and bad fonts repaired/replaced. PStill can detect and remove duplicate elements in the PDF. Text can be converted to outlines which makes it perfect for creating 'fontless' PDF. Transcoding can be used to repair bad PDF or simplify the PDF structure so more limited output devices can process it.
Andrew.

Render swf to png or other image format

How can I, on Linux, render a swf to an image file?
I need to be able to load other swfs into that swf and run actionscript code.
Is it even possible on Linux? I need to do it from PHP, it's fine if I have to use command-line tools.
swfrender from swftools works for basic SWF files.
swfdec-thumbnailer from swfdec-gnome works though it only gets the first frame of the swf.
To get any frame from swf using swfdec see the C code snippet in the following mailing list post.
gnash from gnash also works; it captures the last image of the swf:
gnash -s<scale-image-factor> --screenshot last --screenshot-file output.png -1 -r1 input.swf
ffmpeg from ffmpeg also works for some swf formats:
ffmpeg -i movie.swf -f image2 -vcodec png movie%d.png
Also see the following guide for a commandline pipeline.
In order to call external programs from php you use the exec command documented here.
Note that for security reasons it is important to escape arguments passed to exec with a function such as escapeshellcmd or escapeshellarg.
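The question is about PHP, so the following is only an analogy in Python for the same precaution: pass the arguments as a list so that no shell ever parses the file name, and quote only when a printable command line is needed. The function names are hypothetical; the ffmpeg options are the ones quoted in this answer.

```python
import shlex
import subprocess


def ffmpeg_frames_cmd(swf_path, out_pattern="movie%d.png"):
    """Build the ffmpeg argument list; swf_path is never parsed by a shell."""
    return ["ffmpeg", "-i", swf_path, "-f", "image2", "-vcodec", "png", out_pattern]


def render_frames(swf_path):
    """Run ffmpeg and return its exit code (ffmpeg must be installed)."""
    cmd = ffmpeg_frames_cmd(swf_path)
    # shlex.join quotes each argument, which is safe for logging or copy-paste.
    print("running:", shlex.join(cmd))
    return subprocess.run(cmd, check=False).returncode
```

Because the list form bypasses the shell, a hostile name such as `a.swf; rm -rf ~` is handed to ffmpeg as a single (nonexistent) file name rather than being executed.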
Once you have converted to an image format, whether for a single frame or all frames, you can't run ActionScript. Other non-GNU/Linux tools support the export of the ActionScript from the SWF.
If the SWF that you are exporting to PNG is too complicated for the other tools, then you can use the Flash Plugin or Gnash and Xvfb along with screen capture software to capture either image frames of the SWF or a video format like avi. Then you can extract the images from the video format.
This virtual framebuffer method will support complicated SWF files, though it requires a lot of work, as you need to use either Gnash, Xvfb, and screen capture, or a browser, Xvfb, and Selenium, if you want to capture a certain set of mouse/keyboard interactions with the SWF.
Gnash with and without the virtual framebuffer should load the ActionScript before exporting, but may have issues with complicated ActionScript. The Flash Plugin with a virtual framebuffer will load the ActionScript before exporting.
Also see the following Stack Overflow questions, which your question is a duplicate of:
Convert SWF to PNG
Render Flash (SWF) frame as image (PDF,PNG,JPG)
SWF to image (jpg, png, …) with PHP
This is the solution I ended up using.
You can use a tool like Xvfb (X11 server) and run the standalone flash player projector inside it (you may need to install a bunch of 32-bit libraries), then use a screen capture utility like import to capture the screen and crop it to size.
I found this page on rendering swf screenshots in linux helpful. It also says that you can use gnash to do this, however gnash won't work for flash player 9+.
Try this AIR application: http://swfrenderer.kurst.co.uk
It renders swf frame by frame.

iPhone App Dev - Edited mp3 files are not working in App

In my application there are mp3 files located in the bundle (nothing from the web). Some of the mp3 files are original files and some I had edited using simple sound editing software (the ones where you insert a file, cut a slice of it and save it as a new and shorter mp3 file).
I'm using the AVAudioPlayer [initWithData] method.
All the original files (the ones that I hadn't edited and inserted to the bundle as is) are working perfectly and all the ones that were edited are not working at all.
I used 2 different editing software and the outcome is the same.
Has anyone ever encountered this, or have any idea what I may have done wrong?
Thanks,
Ohad
Converting the mp3 to caf worked for me, as specified here.
See the following:
How do I convert an audio file to the preferred format for iPhone OS?
The preferred full-quality audio format for iPhone OS is 16-bit, little-endian, linear PCM packaged as a CAF file. To convert an audio file to this format, use the afconvert tool at the command line in Mac OS X, as shown in Listing 5.
Listing 5 Converting an audio file to the preferred format for iPhone OS
/usr/bin/afconvert -f caff -d LEI16 {INPUT} {OUTPUT}
To see all the options available for the afconvert tool, enter afconvert -h at the command line.

Bash-script printing a pdf to a pdf in Linux

The question probably sounds a little odd, but the actual task is relatively simple, I swear!
I'm automatically generating some PDFs from a webform, using PDFCreator to merge a generated FDF into a preexisting PDF. I created the preexisting PDF in NitroPDF. This setup works great - almost. The problem is that when you view the generated PDFs in Adobe Reader 9 (the most common reader) a subset of the fields are just blank. The information is still there; using previous versions of Adobe Reader or a different reader like Foxit Reader shows the entire PDF. No clue what's going on, and Adobe tech support was useless since I didn't create the PDF with Adobe software. (If you'd like to help fix this problem instead of the following, feel free to email me.)
However, if I take the resultant PDF and print it to a fresh PDF using a PDF printer driver, it works great everywhere. This is time-consuming and annoying for our sales department to do themselves, so I want to perform this step automagically upon creating the first PDF.
I'm on Ubuntu, and have command-line root access to the server. The program is written in PHP, and can easily make system calls. I'm just having trouble figuring out how to tie things together properly so that I can automatically print a known file using a specific printer driver to another known file.
You could try putting your PDF files through Ghostscript. I have found that this is enough to fix many problematic PDFs.
gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=output.pdf input.pdf
(The same command can also be used to merge several PDF files into one, just specify multiple input files.)
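Since the step is meant to run automatically after each PDF is generated, the call could be wrapped in a small script. The sketch below uses Python rather than the asker's PHP, purely as an illustration; the function names are hypothetical, and the Ghostscript options are exactly the ones in the command above.

```python
import subprocess


def gs_rewrite_cmd(output, inputs):
    """Build the Ghostscript argument list for rewriting (or merging) PDFs."""
    if not inputs:
        raise ValueError("at least one input PDF is required")
    return ["gs", "-q", "-dNOPAUSE", "-dBATCH", "-sDEVICE=pdfwrite",
            f"-sOutputFile={output}", *inputs]


def gs_rewrite(output, inputs):
    """Run Ghostscript; raises CalledProcessError if gs exits with an error."""
    subprocess.run(gs_rewrite_cmd(output, inputs), check=True)
```

Calling `gs_rewrite("output.pdf", ["input.pdf"])` corresponds to the single-file command, and passing several inputs corresponds to the merge case.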

Resources