How to cut last page from PDF on linux server - linux

I have more than 500 PDF files stored on a Linux server, with 5 pages each. I want only the first 4 pages of each file. Is there any way to cut the last page from all 500 PDF files on Linux?

You can use the tool pdftk (it needs to be installed).
To keep only pages 1-4 of in.pdf in your out.pdf file, type:
pdftk in.pdf cat 1-4 output out.pdf
pdftk is a very powerful tool that can do a number of PDF manipulations. Have a look at the man page; the end of it has examples of the most common tasks.
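To run the same command over all 500 files, a loop along these lines might do. Treat it as a sketch: the function name trim_all and the two directory arguments are made up for illustration, and it assumes every file has at least 4 pages.

```shell
# Keep pages 1-4 of every PDF in a source directory and write the results
# to a separate output directory, so the originals stay untouched.
# trim_all and both directory arguments are illustrative names.
trim_all() (
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || exit 1
    for f in "$src_dir"/*.pdf; do
        [ -e "$f" ] || continue          # no match: the pattern stays literal
        pdftk "$f" cat 1-4 output "$out_dir/$(basename "$f")" \
            || echo "pdftk failed on $f" >&2
    done
)

# Example: trim_all ./pdfs ./pdfs-trimmed
```

Writing to a second directory avoids using the same file as both input and output in one pdftk call.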

Related

how to merge pdf as a table with pdftk or convert

How can one use convert or pdftk to merge several pdfs organized as a table?
For example, given 4 files: file1.pdf, file2.pdf, file3.pdf, file4.pdf, each of a single page, I would like to have a single-page pdf like
file1.pdf file2.pdf
file3.pdf file4.pdf
That is, the files are arranged like an array.
By far the easiest way to convert 4 PDF pages to 1 page on any OS is by N-Up imposition/printing with output to a virtual PDF printer such as Ghostscript. For the most basic 4-Up command line usage see https://stackoverflow.com/a/72850245/10802527
N-Up printing can place 4 pages on one sheet (other counts such as 2, 6, 9 or 16 are possible), and in a GUI print dialog the page order is very easy to set.
On Linux or MacOS you can use, along with other options, the CUPS command
lp -o number-up=4 filename
see https://www.cups.org/doc/options.html
The major advantage over combining tools such as PDFtk with convert is that N-Up printing handles the scaling and preserves most PDF structures, because the pages are NOT passed in and out of images before Ghostscript is called, so nothing is degraded to down-scaled imagery.
If you have separate single-page PDFs, you can merge them before printing, using PDFtk or poppler's pdfunite. Note that with either tool the original PDF format is preserved.
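Merging first and then printing 4-Up could look roughly like this. It is only a sketch: merge_and_print_4up is a made-up name, and it sends the job to the default CUPS printer, so a PDF file only comes out if that printer is a virtual PDF printer.

```shell
# Merge single-page PDFs in the given order, then print the merged file
# four pages per sheet on the default CUPS printer.
# merge_and_print_4up is an illustrative name.
merge_and_print_4up() (
    merged=$1
    shift
    pdftk "$@" cat output "$merged" || exit 1
    lp -o number-up=4 "$merged"
)

# Example: merge_and_print_4up merged.pdf file1.pdf file2.pdf file3.pdf file4.pdf
```

The CUPS options page linked above also documents a number-up-layout option (for example lrtb) for controlling the order in which pages are placed on the sheet.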
If you want to convert the pages to half-size images, stitch them together, and then reprint them to one PDF page, that can easily be done using ImageMagick convert and other commands that call Ghostscript directly to suit your requirements. However, the results will in many ways be degraded by the translation to image output.
Since all of the above pass through GS, it makes sense, where possible, to install GS as a PDF printer driver.
If you want to avoid installing Ghostscript printing, you can use the cross-platform Coherent cpdf (it only uses GS if the files need repairs).
Note that the commands below use Windows-style double-quoted names; adjust as required. They assume 4 sequential pages in one file, which are placed 4 at a time on each new page, so they can be used with any multiple of 4 pages in input.pdf.
cpdf -twoup "input.pdf" -o "in-2-Up-tmp.pdf"
cpdf "in-2-Up-tmp.pdf" -rotate 90 -o "out-2-Up.pdf"
cpdf -twoup "out-2-Up.pdf" -o "out-4-Up-tmp.pdf"
cpdf "out-4-Up-tmp.pdf" -rotate 90 -o "out-4-Up.pdf"

Extract all pages as single from a pdf with pdftk

How can I extract single pages from a PDF using pdftk commands? Thanks for your help!
From a quick look at the pdftk homepage:
For example, if you want to extract the 11th page, you can do it like this:
pdftk A=full-pdf.pdf cat A11 output outfile_p11.pdf
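To get every page as its own file in one go, pdftk also has a burst operation. A rough sketch, where split_pages and the page_%03d.pdf naming pattern are just illustrative choices:

```shell
# Split a PDF into one file per page using pdftk's burst operation.
# split_pages and the page_%03d.pdf naming pattern are illustrative.
split_pages() (
    input=$1
    out_dir=$2
    mkdir -p "$out_dir" || exit 1
    # %03d is filled in with the page number: page_001.pdf, page_002.pdf, ...
    pdftk "$input" burst output "$out_dir/page_%03d.pdf"
)

# Example: split_pages full-pdf.pdf ./pages
```

Depending on the pdftk version, burst may also write a doc_data.txt metadata file next to the output.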

How do you print a multipage tiff file using CUPS (lp command)?

On a Linux system (Ubuntu) I have a multipage TIFF file (file.tiff).
When I send it to a printer using "lp file.tiff" command, only the first page prints.
How do I print all the pages?
I have the following known options:
Split the file to single-page TIFFs
Convert TIFF to PDF
I'd like to keep the multi-page TIFF and avoid creating other formats. Is there a way to make CUPS print all the pages from the multipage TIFF file?
(Please do not offer "convert the file" as an answer as I know those, I'm looking for a CUPS method, lpprintmultipagetiff --please?).
Use tiff2ps (the link is below). You could also set up a quick-and-dirty loop that prints each page separately with CUPS:
for ((i=1; i<=884; i++)); do <your lpr print command>; done
Note: 884 stands for the last page number (I'm just guessing here). Use $i in your lpr print command to select the page to print.
http://linux.about.com/library/cmd/blcmdl1_tiff2ps.htm
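The tiff2ps route can be sketched as a single pipeline. This assumes tiff2ps (from the libtiff tools) is installed and that the default CUPS printer accepts PostScript; print_all_tiff_pages is a made-up name.

```shell
# Convert every page of a multipage TIFF to PostScript on the fly and pipe
# it straight to lp, so no intermediate file is left on disk.
# tiff2ps -a generates output for all pages in the input file;
# lp reads the job from standard input when no file name is given.
# print_all_tiff_pages is an illustrative name.
print_all_tiff_pages() {
    tiff2ps -a "$1" | lp
}

# Example: print_all_tiff_pages file.tiff
```

The TIFF itself stays untouched; the PostScript only exists inside the pipe.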

How to take snapshot of PDF file in linux?

How can I take a snapshot of the first page of a PDF file in Linux? I want to do this automatically on a VPS server. My distribution is Debian.
ImageMagick can convert PDF pages if you have Ghostscript installed.
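As a rough sketch of the ImageMagick approach: the function name pdf_first_page_png and the 150 DPI value are illustrative, the command is called magick rather than convert on ImageMagick 7, and some distributions ship a policy file that disables PDF reading until it is adjusted.

```shell
# Render only the first page of a PDF to an image file.
# [0] selects the first page (ImageMagick counts pages from zero);
# -density sets the rendering resolution in DPI and must come before the input.
# pdf_first_page_png and the 150 DPI value are illustrative choices.
pdf_first_page_png() {
    convert -density 150 "$1[0]" "$2"
}

# Example: pdf_first_page_png input.pdf first-page.png
```

The output format follows the extension of the second argument, so first-page.jpg would give a JPEG instead.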
You can do this with PDFTK. It's available in the Ubuntu repos, so check there first.
The syntax you'll want to grab the first page is:
$ pdftk input.pdf cat 1 output out.pdf
Press and hold the Shift + Print Screen keys, then use the mouse to select the rectangle of the PDF page you want to capture.

Unable to search pdf-files' contents in terminal

I have PDF files whose contents I have not managed to search with any terminal program.
I can only search them with Acrobat Reader and Skim.
How can you search the contents of PDF files in the terminal?
It seems that a better question is:
How is the search done in PDF viewers such as Acrobat Reader and Skim?
Perhaps I need to write such a search tool if none exists.
Try installing xpdf from MacPorts; it is supposed to come with a tool called pdftotext which should then allow you to search using grep.
pdftotext is indeed an excellent tool, but it produces very long lines; in order to grep you will want to break them up, e.g.,
pdftotext drscheme.pdf - | fmt | grep -i spidey
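To search a whole directory of PDFs the same way, something like the following might work. It is a sketch only: pdfgrep_dir is a made-up name, and it just lists the files that contain a match.

```shell
# Print the name of every PDF in a directory whose extracted text
# contains the pattern (case-insensitive).
# pdfgrep_dir is an illustrative name; the directory defaults to ".".
pdfgrep_dir() (
    pattern=$1
    dir=${2:-.}
    for f in "$dir"/*.pdf; do
        [ -e "$f" ] || continue          # no match: the pattern stays literal
        # "-" makes pdftotext write the text to standard output
        if pdftotext "$f" - 2>/dev/null | grep -qi -- "$pattern"; then
            printf '%s\n' "$f"
        fi
    done
)

# Example: pdfgrep_dir spidey ~/papers
```

Because grep only has to report whether there is a match, the long lines from pdftotext do not matter here and fmt is not needed.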
PDF files are usually compressed. PDF viewers such as Acrobat Reader and Skim search the contents by decompressing the PDF text into memory, and then searching that text. If you want to search from the command line, one possible suggestion is to use pdftk to decompress the PDF, and then use grep (or your favorite command line text searching utility) to find the desired text. For example:
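# Search for the text "text_to_search_for", and print out 3 lines of context
# above and below each match. -a makes grep treat the output as text, since
# an uncompressed PDF can still contain binary data.
pdftk mydoc.pdf output - uncompress | grep -a -C3 text_to_search_for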
