Graph is too large for cairo-renderer bitmaps - linux

I'm trying to use pyreverse to generate UML images for a project's source code. When I run the pyreverse command and ask it to generate PNG images, it runs for a while and then prints:
dot: graph is too large for cairo-renderer bitmaps. Scaling by 0.271394 to fit
dot: graph is too large for cairo-renderer bitmaps. Scaling by 0.333083 to fit
Then if I open either image, the text is unreadable because it got scaled.
Is there a way to just not scale, and let the image be large size?
Thanks

The option
-T svg
worked for me

Cairo's maximum bitmap size is 32767x32767 pixels, and dot will scale your graph to fit inside that area. As an alternative, you can tell pyreverse to generate PDF files, and use some other tool to convert to PNG, if you really need bitmaps.

As of 2019, you can simply output the diagram as SVG using:
-o svg
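If you drive pyreverse from a script, the SVG suggestion above could be wrapped roughly like this. It is only a sketch: the function names, the project name and the package path are made up for illustration, and the only flags it relies on are pyreverse's `-o` (output format) and `-p` (project name).

```python
import shutil
import subprocess
from typing import List, Optional


def build_pyreverse_command(package_path: str, project: str, fmt: str = "svg") -> List[str]:
    """Return the pyreverse argument list for the requested output format."""
    # "svg" avoids the cairo bitmap limit because no raster image is produced.
    return ["pyreverse", "-o", fmt, "-p", project, package_path]


def run_pyreverse(package_path: str, project: str, fmt: str = "svg") -> Optional[int]:
    """Run pyreverse if it is on PATH; return its exit code, or None if it is missing."""
    if shutil.which("pyreverse") is None:
        return None
    return subprocess.run(build_pyreverse_command(package_path, project, fmt)).returncode
```

Calling `run_pyreverse("src/mypkg", "myproj")` would then write the SVG diagrams into the current directory, provided pyreverse and Graphviz are installed.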

Related

How can you render an SVG to a png in a specific size (python)

I am working on a small image-comparison script where the reference images are generated as SVGs and the images to compare are PNGs.
I can transform the SVG files to PNG (using svglib and renderPM), but I can't specify the size I want them to be generated at (renderscale seems to cut off part of the picture). I need them to be the same size for the compare functions, and resizing the PNGs afterwards defeats the whole purpose of vector graphics. Any ideas?
Regards, a Python noob

SkiaSharp support for color quantization for PNG files

I'm looking for an all-in-one solution for processing web images
Resizing
Cropping
Save as WEBP / JPEG / PNG
Drawing simple rectangles
Adding text
Reducing colors (quantization) for PNG
The only thing I'm not clear about is PNG quantization. Currently I'm using pngquant which works great, but I'd prefer to do everything in one place.
I see the SkiaSharp has SKImage.Encode() which takes a quality parameter. However there's no explanation as to what it actually is. Will this give me color quantization for PNG files? If not, is there something else in the library to do this?

Pdf to svg is not perfect

I have tried nearly every library to convert PDF to SVG. These are the results:
gs or ghostscript and imagemagick: The size gets multiplied by 100
pdf2svg and inkscape: The image at the top of the PDF is not reproduced accurately. Here are the links to the PDF and the SVG:
PDF: https://drive.google.com/open?id=0BxyQR1owWa_pcnhhSk5wQWJGMVk
SVG: https://drive.google.com/open?id=0BxyQR1owWa_pVnhoLVlob1U2d1k
Please tell me if I am missing something that needs to be done.
The Ghostscript SVG output device is seriously deprecated and no longer supported (or indeed built into the standard Ghostscript binary).
In any event, you need to be aware that PDF is a very rich graphics model, and it is simply not possible to reproduce every possible nuance of a PDF using the SVG graphics model, in particular fonts are a problem, but so is almost any kind of transparency. When that occurs Ghostscript will render the PDF to an image, and insert that into an SVG file. Almost certainly that's why you are seeing the SVG file being considerably larger than the PDF file. You should be able to use the -r switch to control the resolution of the rendering, allowing you to trade off quality for size.
Even if the whole file isn't converted to a bitmap, it's possible that large portions of it are, or that the bitmap compression in SVG is less good than for PDF (or GS isn't taking advantage of all the possibilities). FWIW the PDF file uncompressed runs to > 4MB.

Extracting Text from a PDF file with embedded font

I have a PDF file containing some tabular data.
http://dl.dropbox.com/u/44235928/sample_rotate-0.pdf
I have to extract the tabular data from it. I have tried the following with no success:
Select the text and paste it to notepad/excel-sheet. (I am getting junk characters)
Used save as text from Acrobat Reader. It is also giving junk characters and not the actual text.
Tried the Apache PDFBox command-line utility to extract text from the PDF. It also gives junk characters instead of the real text.
Finally, I am trying an OCR solution. I am converting the PDF file into .tif images using ImageMagick and having those images processed by tesseract OCR.
The OCR solution is not very accurate, though (about 80% of words matched).
I tried changing the density and geometry of the image created from the PDF to get better results from tesseract OCR.
convert -rotate 90 -geometry 10000 -depth 8 -density 800 sample.pdf img_800_10000.tif;
tesseract img_800_10000.tif img_800_10000.tif nobatch letters;
I am not sure what kind of image (density, geometry, monochrome, sharpened boundaries, etc.) would be best suited for the OCR.
Please suggest the best possible parameters (density, geometry, depth, etc.) for generating images from a PDF file, so that the tesseract accuracy increases.
I am open to other (non-OCR) solutions as well.
In this case I recommend NOT using ImageMagick for the PDF -> TIFF conversion. Instead, use Ghostscript. Two reasons:
Using Ghostscript directly will give you more control over individual parameters of the conversion.
ImageMagick cannot do that particular conversion itself -- it will call Ghostscript as its 'delegate' anyway, but will not give you the same fine-grained control that your own Ghostscript command does.
Most of the text in the table of your sample PDF is extremely small (I guess, only 4 or 5 pt high). This makes it rather difficult to run a successful OCR unless you increase the resolution considerably.
Ghostscript uses -r72 by default for image format output (such as TIFF). Tesseract works best with r=300 or r=400 -- but only for a font size from 10-12 pt or higher. Therefore, to compensate for the small text size, you should make Ghostscript use a resolution of at least 1200 DPI when it renders the PDF to the image.
Also, you'll have to rotate the image so the text displays in the normal reading direction (not bottom -> top).
This is the command which I would try first:
gs \
-o sample.tif \
-sDEVICE=tiffg4 \
-r1200 \
-dAutoRotatePages=/PageByPage \
sample_rotate-0.pdf
You may need to play with variations of the -r1200 parameter (higher or lower) for best results.
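For scripting this, here is a rough Python sketch of the reasoning above. The proportional rule in `estimate_dpi` is my own assumption (scale the resolution that works for 12 pt text by how much smaller the real text is), not something taken from the Tesseract documentation, and both function names are hypothetical. The Ghostscript flags are the same ones as in the command above.

```python
from typing import List


def estimate_dpi(text_pt: float, reference_pt: float = 12.0, reference_dpi: int = 400) -> int:
    """Scale the reference DPI so small text gets about as many pixels as reference-size text."""
    # Assumption: required resolution grows in inverse proportion to the font size.
    return round(reference_dpi * reference_pt / text_pt)


def build_gs_command(pdf_path: str, tif_path: str, dpi: int) -> List[str]:
    """Return the Ghostscript argument list for a TIFF G4 rendering at the given DPI."""
    return [
        "gs",
        "-o", tif_path,                   # output file
        "-sDEVICE=tiffg4",                # monochrome TIFF, G4 compression
        f"-r{dpi}",                       # rendering resolution
        "-dAutoRotatePages=/PageByPage",  # same rotation switch as in the command above
        pdf_path,
    ]


dpi = estimate_dpi(4.0)  # 4 pt text, the low end of the guess above
command = build_gs_command("sample_rotate-0.pdf", "sample.tif", dpi)
```

With 4 pt text the estimate comes out at 1200, which matches the value used above. The list in `command` can be passed to `subprocess.run` if `gs` is installed.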
Since a comment asked "How to define the geometry of an image when using Ghostscript as we do in convert?", here is an answer:
It does not make sense to define geometry (that is image dimensions) and resolution for a raster image created by Ghostscript at the same time.
Once you convert a vector-based page of a given dimension (such as PDF) into a raster image (such as the TIFF G4 format) at a desired resolution (as done in the other answer), you have already implicitly set the image dimensions:
The original PDF dimension of your sample file sample_rotate-0.pdf is 1008x612 points.
At a resolution of 72 DPI (the default Ghostscript uses if not given directly, or -r72 in the Ghostscript command if given directly) the image dimensions will be 1008x612 pixels.
At a resolution of 720 DPI (-r720 in the Ghostscript command) the image dimensions will be 10080x6120 pixels.
At a resolution of 1440 DPI (-r1440 in the Ghostscript command of my other answer) the image dimensions will be 20160x12240 pixels.
At a resolution of 1200 DPI (-r1200 in the Ghostscript command) the image dimensions will be 16800x10200 pixels.
At a resolution of 1000 DPI (-r1000 in the Ghostscript command) the image dimensions will be 14000x8500 pixels.
At a resolution of 120 DPI (-r120 in the Ghostscript command) the image dimensions will be 1680x1020 pixels.
At a resolution of 100 DPI (-r100 in the Ghostscript command) the image dimensions will be 1400x850 pixels.
If you absolutely insist on specifying the dimension/geometry for the output image on the Ghostscript commandline (rather than the resolution), you can do so by adding -gNNNNxMMMM -dPDFFitPage to the commandline.
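The arithmetic behind the list above is just points times DPI divided by 72 (one point being 1/72 inch). A tiny Python sketch, with a hypothetical function name; Ghostscript's own rounding may differ by a pixel for page sizes that do not divide evenly:

```python
from typing import Tuple


def raster_size(width_pt: float, height_pt: float, dpi: int) -> Tuple[int, int]:
    """Convert a page size in PostScript points to pixel dimensions at the given DPI."""
    # 72 points make one inch, so pixels = points / 72 * dpi.
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)


# The sample page is 1008x612 points.
for dpi in (72, 100, 1200):
    print(dpi, raster_size(1008, 612, dpi))
```

Going the other way, dividing a desired pixel width by the page width in inches (1008 / 72 = 14 for this file) gives the -r value you would need.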
There you can find decoded content of your file: https://docs.google.com/open?id=0B1YEM-11PerqSHpnb1RQcnJ4cFk
I am absolutely sure that OCR is the best way to read this PDF file, but you can try REGEX-ing the native content. It is going to be the hard and long way.

How to place the same path multiple times at different sizes/coordinates?

I have a path I've created in Illustrator and saved as an SVG.
Now I want to programmatically place it at different sizes and coordinates on a large canvas.
Say I've got this image:
(source: omgtldr.com)
How would I reproduce that same image in different places and sizes in one SVG file, like this:
(source: omgtldr.com)
for example, one version shrunk by 20% at coordinates x,y; another enlarged by 30% at coordinates a,b and so on.
Please assume I'm going to be OK with the programming part, I'm comfortable working with XML files. It's the SVG parts I don't understand.
You need the transform attribute. You can move your paths with translate and resize them with scale.
Better to use the <use> element (transformed) than to copy your path for each instance.
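Since you are generating the file programmatically, here is a minimal sketch in Python of what the two suggestions above could look like together. The path data, the id `shape`, the canvas size and the placements are placeholders, not taken from your file. The path is defined once in `<defs>` and each instance is a `<use>` with a `translate(...) scale(...)` transform; `xlink:href` is used because SVG 1.1 viewers expect it.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("", SVG_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_svg(path_data, placements, width=1000, height=1000):
    """Return SVG text that defines one path and places it once per (x, y, scale) tuple."""
    svg = ET.Element(f"{{{SVG_NS}}}svg", {"width": str(width), "height": str(height)})
    defs = ET.SubElement(svg, f"{{{SVG_NS}}}defs")
    # The path lives in <defs>, so it is not drawn until something references it.
    ET.SubElement(defs, f"{{{SVG_NS}}}path", {"id": "shape", "d": path_data})
    for x, y, scale in placements:
        # scale() is applied to the shape first, then translate() moves it to (x, y).
        ET.SubElement(svg, f"{{{SVG_NS}}}use", {
            f"{{{XLINK_NS}}}href": "#shape",
            "transform": f"translate({x},{y}) scale({scale})",
        })
    return ET.tostring(svg, encoding="unicode")


# One copy shrunk by 20% and one enlarged by 30% (placeholder triangle path).
svg_text = build_svg("M0 0 L100 0 L50 80 Z", [(50, 40, 0.8), (300, 200, 1.3)])
```

Writing `svg_text` to a file gives one document with two instances of the same path. Swap in the `d` attribute from your Illustrator export for the placeholder triangle.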

Resources