Ghostscript and high resolutions? - resolution

I am writing a script that reads some markup data, generates a tex document and converts it to a png image.
As long as I use a resolution up to 286 px/inch, everything works fine. Unfortunately Ghostscript, which I use to create the picture data, does nothing when I use higher values.
How can I fix this behaviour?

Since the information about your problem is not very detailed (What kind of fonts are used in the TeX document? Are they Chinese, Japanese, Korean, or...? Which Ghostscript command line are you using?), here is one thing to check. It is only a first guess: try adding -c "100000000 setvmthreshold" -f /path/to/pdffile.pdf to your command:
gswin32c.exe ^
-o c:/path/to/output.png ^
-sDEVICE=png16m ^
-r600x600 ^
-c "100000000 setvmthreshold" ^
-f /path/to/pdffile.pdf
This will allow for ~100 MByte of extra RAM usage by Ghostscript. If you are on X-Windows (Linux, Unix), then "-dMaxBitmap=..." could help (provided you have enough RAM):
gs \
-o /path/to/output.png \
-sDEVICE=png16m \
-r600x600 \
-dMaxBitmap=100000000 \
-c "100000000 setvmthreshold" \
-f /path/to/pdffile.pdf
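To get a feel for why higher resolutions need more memory, a rough estimate of the uncompressed raster size can help. This is only a sketch: the A4 page size (595 x 842 points) and the 3 bytes per pixel are assumptions, since the question states neither, and raster_bytes is a hypothetical helper name.

```shell
#!/bin/sh
# Rough size in bytes of one uncompressed raster page.
# Arguments: width_pt height_pt dpi bytes_per_pixel (72 points = 1 inch).
raster_bytes() {
    width_px=$(( $1 * $3 / 72 ))
    height_px=$(( $2 * $3 / 72 ))
    echo $(( width_px * height_px * $4 ))
}

# Assumed example: A4 page, 600 dpi, 24-bit RGB
raster_bytes 595 842 600 3
```

Under these assumptions the result is about 104 million bytes, which is in the same range as the 100000000 used in the commands above.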

Related

HaplotypeCaller provide variants more than expected

I used HaplotypeCaller (GATK 4.2.6.1) with the standard command line for variant calling on a WES picard.sorted.MarkedDup.bam file.
Apparently everything worked well and I received a standard .vcf file. But the number of identified variants is far too high for a WES result: close to one million variants for one sample!
Did I do something wrong?
What solution do you recommend?
Any help would be appreciated.
The command line I used was as follows:
gatk --java-options -Xmx8g HaplotypeCaller \
-R $refFile \
-I ${base}.picard.sorted.markedDup.bam \
--dont-use-soft-clipped-bases -stand-call-conf 20.0 \
--emit-ref-confidence GVCF \
-O ${base}.rrrrealigned.vcf

Ghostscript command line - pass arguments to included file

I am developing a PDF conversion app with node.js and Ghostscript. I execute the gs command line with exec(). My command definition looks like:
let gs_cmd = `
gs -sDEVICE=pdfwrite \
-dPDFX=true \
-dPDFACompatibilityPolicy=1 \
-sColorConversionStrategy=/CMYK \
-sProcessColorModel=DeviceCMYK \
-sDefaultCMYKProfile=${icc_profile_file} \
-dNoOutputFonts \
-dBATCH \
-dQUIET \
-r${DPI} \
-g${w}x${h} \
-dPDFFitPage \
-NumRenderingThreads=4 \
-o ${target_file}-conv.pdf \
PDFX_def.ps \
#trimbox.in "Trimed" \
${target_file}.pdf
`;
I have a problem with this line:
#trimbox.in "Trimed" \
which tells Ghostscript to include the file and pass the parameters to it. I can't find a proper way to include parameters that can be used in the included file. I want to pass the "Trimed" string as the $0 argument, which will be available in the trimbox.in file. I also tried -t=Trimmed and -t="Trimmed", without effect.
From Ghostscript docs (section 10.1):
#filename
Causes Ghostscript to read filename and treat its contents the same as the command line. (This was intended primarily for getting around DOS's 128-character limit on the length of a command line.) Switches or file names in the file may be separated by any amount of white space (space, tab, line break); there is no limit on the size of the file.
-- filename arg1 ...
-+ filename arg1 ...
Takes the next argument as a file name as usual, but takes all remaining arguments (even if they have the syntactic form of switches) and defines the name ARGUMENTS in userdict (not systemdict) as an array of those strings, before running the file. When Ghostscript finishes executing the file, it exits back to the shell.
How to achieve this?
Running my command causes error:
Error: /undefined in Trimed
Firstly you should review the Ghostscript licence to ensure your use is compliant with the licence (AGPL v3). Note that this includes software as a service applications.
"Trimed" isn't a Ghostscript switch and it isn't the name of an input file, so yes, you get an error. You can't 'pass parameters' to #file, because Ghostscript treats that, literally, as a file containing a bunch of switches. There is no command substitution or anything like that. So you can't have $0 in the file specified by #file.
So when you say :
#PDFX_def_trimbox.ps "Trimed" \
which tells to Ghostscript to include file and pass the parameters to in
I'm afraid you are incorrect. There is no way to 'pass parameters' to the file when using the #file syntax.
You haven't said what's in the file 'PDFX_def_trimbox.ps', and I'm suspicious (because of the .ps) that this is a PostScript program. You can't use a PostScript program with the #file syntax, because a PostScript program is not a series of Ghostscript switches.
So where you have :
-sDEVICE=pdfwrite \
-dPDFX=true\
etc, you could put all of those switches into the file specified by #file. But you can't put any PostScript in there.
There are a few other problems. You have specified NumRenderingThreads=4, which will do nothing, because the pdfwrite device doesn't (in general) do any rendering, it preserves the input as far as possible as vector data. So pdfwrite ignores this parameter altogether.
For similar reasons, the -r parameter is less than useful. In the case of pdfwrite that simply affects how accurate the conversion is. You shouldn't set that without good reason.
You've set -sColorConversionStrategy=/CMYK when it should be -sColorConversionStrategy=CMYK or -dColorConversionStrategy=/CMYK. -s takes strings, -d takes numbers or names.
-g sets the width and height of the page in pixels, which isn't a great plan, because the resulting page size depends on the resolution. You should use -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS instead, and not set the resolution.
-EDIT-
-response to comment below-
If you want the PDF file to contain a 300 dpi image, then you need to create a page which is the correct size so that, when drawn on it, the bitmap data from the image is 300 dpi.
So for example, if you have an image which is 600 pixels by 900 pixels, then in order to get that to be 300 dpi you must make the media size 2 inches by 3 inches, which is 144 by 216 points. Changing the resolution of the pdfwrite device won't affect that at all. Setting -g and -r will alter the media size, but not the resolution of the image, though if you also set -dPDFFitPage then yes, it will rescale the image to fit the media, which will alter its resolution.
I have no idea if your original image was 300 dpi, if it was, and the SVG to PDF conversion maintained that, then you don't need to mess about with media sizes and resolution at all, the pdfwrite device will maintain whatever was there.
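The arithmetic in the 600 by 900 pixel example can be sketched in a few lines of shell. This is only an illustration: px_to_points is a hypothetical helper, and the integer division assumes the pixel count times 72 divides evenly by the dpi.

```shell
#!/bin/sh
# Convert a pixel dimension to PostScript points (72 per inch) at a given dpi.
# Arguments: pixels dpi
px_to_points() {
    echo $(( $1 * 72 / $2 ))
}

# 600 x 900 pixel image that should end up at 300 dpi
width_pt=$(px_to_points 600 300)
height_pt=$(px_to_points 900 300)

# Print the media-size switches that could be added to the gs command line
echo "-dDEVICEWIDTHPOINTS=$width_pt -dDEVICEHEIGHTPOINTS=$height_pt"
```

This prints the switches for a 144 by 216 point page, matching the 2 inch by 3 inch media size worked out above.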
As regards the #file syntax, you cannot do this:
-c "[ {ThisPage} << /TrimBox [$0 $1 $2 $3] >> /PUT pdfmark"
in the file supplied via the # command because, as I said, there is no variable replacement in the processing which Ghostscript does on the contents of that file. This is not a bash script.
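Since Ghostscript does no substitution itself, one possible workaround is to let the calling script fill in the numbers before Ghostscript ever sees the string. The sketch below assumes a POSIX shell; trimbox_pdfmark and the box values 0 0 595 842 are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Build the TrimBox pdfmark string with the four box values already filled in.
# Arguments: llx lly urx ury (in points)
trimbox_pdfmark() {
    printf '[ {ThisPage} << /TrimBox [%s %s %s %s] >> /PUT pdfmark' "$1" "$2" "$3" "$4"
}

mark=$(trimbox_pdfmark 0 0 595 842)
echo "$mark"

# The finished string would then be handed to Ghostscript on the command line,
# for example:  gs ... -c "$mark" -f input.pdf
```

The same idea applies to the node.js template string in the question: the substitution happens in the calling program, and Ghostscript only receives the finished PostScript.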

Why is the following convert command resulting in Segmentation fault?

This is the command I am running (directly from the command line, logged in as root):
/usr/bin/convert '/var/storage/files/drupal/273f09ab5f8671d3c457719c7955063f.jpg' -resize 127x127! -quality '75' '/var/storage/files/drupal/imagecache/artwork_moreart/273f09ab5f8671d3c457719c7955063f.jpg'
The result of the command is just: Segmentation fault
Version of ImageMagick: ImageMagick 6.4.3 2009-02-25
Linux version: SUSE Linux Enterprise Server 11 (x86_64)
This image does exist, and I have copied it to my local computer and opened it with no issue.
Please let me know if there is additional information you need and how to get this information.
Try it with a correct command. The ! needs backslash-escaping, first of all, otherwise it is interpreted by your shell, instead of by convert:
/usr/bin/convert \
'/var/storage/files/drupal/273f09ab5f8671d3c457719c7955063f.jpg' \
-resize 127x127\! -quality '75' \
'/var/storage/files/drupal/imagecache/artwork_moreart/273f09ab5f8671d3c457719c7955063f.jpg'
If this doesn't work, try surrounding the argument with single quotes too (as you did with your other arguments). Inside single quotes the ! needs no backslash:
127x127\! => '127x127!'
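A quick way to see what the shell actually passes on is to print the argument instead of giving it to convert. This is a minimal sketch for a POSIX shell script; note that inside single quotes a backslash is kept literally, so the quoted form is written without one.

```shell
#!/bin/sh
# Two ways of writing the same geometry argument
escaped=127x127\!      # backslash-escaped, unquoted
quoted='127x127!'      # single-quoted, no backslash needed

# Print both so the resulting strings can be compared
printf '%s\n' "$escaped" "$quoted"
```

Both lines of output should read 127x127!, which is the string convert needs to receive.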
The cause of your problem could also reside outside the convert binary and lie within the specific input JPEG you want to process. You can try to rule this out by processing a set of different input files. Start with the built-in IM test files logo:, wizard: and netscape:
convert wizard: \
-resize "127x127\!" \
127wiz.jpg
convert logo: \
-resize "127x127\!" \
127log.jpg
convert netscape: \
-resize "127x127\!" \
127net.jpg
Sorry, I cannot reproduce your problem directly here. SLES 11 with IM 6.4.3 is simply too ancient for me.

Repair apparently damaged pdf and reduce file size

I have a PDF file (4.6MB) which was made by combining 6 different PDFs (containing both text and bitmap graphics) using pdftk in Ubuntu 12.04. I wish to compress this file to something close to 2MB without affecting its quality.
I have tried pdftk's "compress" option (it couldn't compress the file to 2 MB), and I also tried converting it to PS first and then back to PDF, which gives the following warning:
****Warning: considering '0000000000 XXXXX n' as a free entry.
and then hangs. qpdf also failed saying that the file is damaged.
Could someone help me out?
What result does Ghostscript give you? Try this command:
gs \
-o output.pdf \
-sDEVICE=pdfwrite \
-dPDFSETTINGS=/screen \
input.pdf
Does this PDF file contain confidential information? If not, it would be interesting to see it.
In any case, Multivalent often works where qpdf fails.
You can try its Compress tool (it also attempts to repair the PDF file).
Multivalent
https://rg.to/file/c6bd7f31bf8885bcaa69b50ffab7e355/Multivalent20060102.jar.html
(latest free version with tools included, current has no tools in itself)
java -cp path....to/Multivalent.jar tool.pdf.Compress file.pdf
This worked for me to repair the damaged PDF:
sudo apt-get install mupdf-tools
mutool clean input.pdf output.pdf

Ghostscript under linux: Times too wide

How can I make Times work for printing under Linux?
I have Debian Wheezy Linux with ghostscript, cups and mscorefonts installed.
But when I print, Times comes out too wide compared to the Windows output: the letter spacing is too wide.
Is there any way to fix that problem?
Printing is done from the same Java applet on both Windows and Linux.
The PostScript from the Linux variant uses Times fonts; the PostScript from the Windows variant uses the TimesNewRomanPSMT font.
Simply replacing the font name changes the name, but does not change anything in the output.
=================
Debian Wheezy, Debian Squeeze and Ubuntu Natty were checked as Linux systems.
Most of the checks were done on Debian Wheezy.
ghostscript:
Installed: 9.02~dfsg-2
sun-java6-jre:
Installed: 6.26-1
cups-pdf printer.
PPD is PDF.ppd:
*PCFileName: "CUPS-PDF.PPD"
*Manufacturer: "Generic"
*Product: "(CUPS v1.1)"
*ModelName: "Generic CUPS-PDF Printer"
*ShortNickName: "Generic CUPS-PDF Printer"
*NickName: "Generic CUPS-PDF Printer"
*1284DeviceID: "MFG:Generic;MDL:CUPS-PDF Printer;DES:Generic CUPS-PDF Printer;CLS:PRINTER;CMD:POSTSCRIPT;"
Print result comparison: http://piccy.info/code2/1652248/4b2c3b10f5316f9836496af5501892d1/
I DO have the Times New Roman font on the Linux system! The PDF for Windows was generated on Linux, with the Linux Ghostscript, from PostScript source generated on the Windows machine.
For example, take a look at the upper right corner, where 0401060 is written.
Windows postscript code:
%%IncludeResource: font TimesNewRomanPS-BoldMT
F /F1 0 /256 T /TimesNewRomanPS-BoldMT mF
/F1S53 F1 [83 0 0 -83 0 0 ] mFS
F1S53 Ji
4292 333 M (0401060)[42 42 42 42 42 42 0]xS
N 367 367 M 1192 367 I K
N 1667 367 M 2492 367 I K
51282 VM?
Linux PostScript code:
10.0 29 F
<303430313036> 37.44 526.0 52.0 S
10.0 29 F
<30> 6.24 541.0 62.0 S
N
As you can see, it selects font #29 at size 10.0. Font #29 is
/Times-Bold ISOF
and, worst of all, it already writes two lines, so the problem is somewhere in the Java<=>CUPS connector.
==================
The "same Java applet" is the internet banking application iBank2.
"Times" is substituted by Ghostscript with Nimbus, not with TimesNewRoman:
./Init/Fontmap.GS:/Times-Roman /NimbusRomNo9L-Regu ;
./Init/Fontmap.GS:/Times-Italic /NimbusRomNo9L-ReguItal ;
./Init/Fontmap.GS:/Times-Bold /NimbusRomNo9L-Medi ;
./Init/Fontmap.GS:/Times-BoldItalic /NimbusRomNo9L-MediItal ;
./Init/Fontmap.GS:/TimesNewRoman /TimesNewRomanPSMT ;
./Init/Fontmap.GS:/TimesNewRoman,Bold /TimesNewRomanPS-BoldMT ;
./Init/Fontmap.GS:/TimesNewRoman,Italic /TimesNewRomanPS-ItalicMT ;
./Init/Fontmap.GS:/TimesNewRoman,BoldItalic /TimesNewRomanPS-BoldItalicMT ;
(BTW, are you using Ghostscript on Windows at all, or is your printing there going through a native printer driver?)
On Windows I print through the native PostScript driver to a .ps file.
So it is NOT a Ghostscript problem per se... but it may be originating from different Java versions + configurations on your Win/Lin systems.
It looks like a problem in Java printing, but it doesn't depend on the Java version: both systems have the latest Java 6 installed.
That PostScript was most likely generated by your Java applet, and Ghostscript is only the consumer of it when it goes through the printing process.
Basically, I just want to make sure it uses the TimesNewRoman font for Times, not Nimbus.
And I have failed to do this.
The ISOF macro generated by printing is:
/ISOF {
dup findfont dup length 1 add dict begin {
1 index /FID eq {pop pop} {D} ifelse
} forall /Encoding ISOLatin1Encoding D
currentdict end definefont
} BD
Here is an excerpt of the source files and the resulting generated PDF: http://datacompboy.ru/u/smpl.tar.bz2
If this is so, then copy the Windows fontfile to Linux.
It already is a copy of the Windows file. The msttcorefonts are identical to the ones distributed with Windows.
Since 0401060 is already split into two lines in the generated PostScript file, the Java applet must have found the font too wide while printing and split the text when generating the output. So the question is: how can I substitute the Times font in the system so that Java printing finds TimesNewRoman instead of Nimbus and generates correct output?
From what I see in the screenshot, your Win <--> Lin printing differences...
...do NOT originate in Times <--> TimesNewRomanPSMT differences,
...but rather come from [SomeTimes] <--> [SomeTimesBold] differences in the 2 PostScript output(s)
that are consumed by each printer queue (which on Linux very likely involves a Ghostscript installation). (BTW, are you using Ghostscript on Windows at all, or is your printing there going through a native printer driver?)
So it is NOT a Ghostscript problem per se... but it may be originating from different Java versions + configurations on your Win/Lin systems.
The fact that your Linux PostScript code seems to make use of the /Times-Bold (ISOF????) font is outside of Ghostscript's responsibility. That PostScript was most likely generated by your Java applet, and Ghostscript is only the consumer of it when it goes through the printing process.
It looks to me that this ominous ISOF you mentioned is not part of the fontname, but a PostScript procedure that must be pre-defined elsewhere in the PostScript file and is applied to the /Times-Bold font. It is probably a procedure which re-encodes the original font to ISOLatin1Encoding...
You say you have access to both font files (TimesNewRomanPS-BoldMT on Windows and Times-Bold on Linux). If this is so, then copy the Windows fontfile to Linux. Then, to verify the visual differences between the two fonts, run these two commands on each of the fontfiles:
fntsample \
-f /path/to/Times-fontfile.suffix \
-o Times-fontfile.suffix.pdf \
-l \
> Times-fontfile.suffix.txt
and then
pdfoutline \
Times-fontfile.suffix.pdf \
Times-fontfile.suffix.txt \
Times-fontfile-sample.pdf
The resulting PDF(s), Times-fontfile-sample.pdf, will represent a tabular sample of each glyph contained in the fontfiles, and these will be mapped to the respective Unicode codepoints sections.
You can use these PDFs to reveal even minimal visual discrepancies between the two fonts (but I bet your differences will be rather glaring).
In case you don't have pdfoutline and fntsample installed on your Debian system, just run sudo apt-get install fntsample...
Update 2 (taking into account the updated problem description):
datacompboy has now provided a tarball containing these 4 files:
-rw-r--r-- datacompboy/datacompboy 37722 2011-06-22 08:54 smpl/linout.ps
-rw-r--r-- datacompboy/datacompboy 15324 2011-06-22 08:54 smpl/linout.pdf
-rw-r--r-- datacompboy/datacompboy 54422 2011-06-22 08:57 smpl/winout.pdf
-rw-r--r-- datacompboy/datacompboy 99099 2011-06-22 08:56 smpl/winout.ps
With these files, it should be very easy to pinpoint the cause of the problem. If datacompboy can run the Windows-generated PS file on a Linux Ghostscript, like this:
gs winout.ps
and if it renders OK (i.e.: the same as winout.pdf), then there is no problem with the GS font mapping, but a problem with the actual file differences in winout/linout.ps. From there, it should be quite easy to continue the analysis.
Unfortunately, right now I cannot run the test myself.
Update 3:
datacompboy's PDF files linout.pdf and winout.pdf have one huge difference: the Linux version doesn't have the font embedded, while the Windows one has... The consequence is that any downstream consumer of linout.pdf will produce fairly arbitrary results, with regard to the font, when displaying, printing, converting or processing this file.
So here is another test that I can think of. It checks how much the Linux versions of the fonts used for /Times-Bold (which Ghostscript substitutes with the real /NimbusRomNo9L-Medi) and /TimesNewRomanPS-BoldMT differ in their font metrics.
Create three different PDFs with these Ghostscript commandlines:
a.pdf:
gs \
-o a.pdf \
-sDEVICE=pdfwrite \
-dPDFSETTINGS=/prepress \
-c "100 700 moveto \
/TimesNewRoman,Bold findfont \
12 scalefont \
setfont \
(0401060 0401060 0401060 0401060) show \
showpage"
b.pdf:
gs \
-o b.pdf \
-sDEVICE=pdfwrite \
-dPDFSETTINGS=/prepress \
-c "100 700 moveto \
/TimesNewRomanPS-BoldMT findfont \
12 scalefont \
setfont \
(0401060 0401060 0401060 0401060) show \
showpage"
c.pdf:
gs \
-o c.pdf \
-sDEVICE=pdfwrite \
-dPDFSETTINGS=/prepress \
-c "100 700 moveto \
/Times-Bold findfont \
12 scalefont \
setfont \
(0401060 0401060 0401060 0401060) show \
showpage"
The -dPDFSETTINGS=/prepress parameter should enforce the font embedding into output PDFs. (This is important, otherwise the viewer could use an arbitrary replacement font for displaying the PDF.)
What follows the -c parameter is a little PostScript snippet that provides content for the PDF page.
Files 'a.pdf' and 'b.pdf' should not differ. They only test whether the font aliasing between /TimesNewRoman,Bold and /TimesNewRomanPS-BoldMT does indeed work as expected.
File 'c.pdf' could show slight differences in comparison to a.pdf and b.pdf, in the order of a few pixels here and there, but NOT in the tracking of the tested string.
If this test goes as predicted, the different fontfiles, the Fontmap.GS and Ghostscript itself all are OK. Then the problem is only with the way the Linux Java applet produces its output (PS or PDF).
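The three commands above differ only in the font name, so the PostScript snippet could be generated by a small helper. This is a sketch only: font_test_ps is a hypothetical function, the loop merely prints the gs commands (a dry run) rather than executing them, and the output files are named after the fonts instead of a.pdf, b.pdf and c.pdf.

```shell
#!/bin/sh
# Emit the one-line PostScript test page for a given font name.
font_test_ps() {
    printf '100 700 moveto /%s findfont 12 scalefont setfont (0401060 0401060 0401060 0401060) show showpage' "$1"
}

# Dry run: print one gs command line per font under test
for font in TimesNewRoman,Bold TimesNewRomanPS-BoldMT Times-Bold; do
    echo "gs -o $font.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress -c \"$(font_test_ps "$font")\""
done
```

Printing the commands first makes it easy to check the generated PostScript before running anything through Ghostscript.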

Resources