I have some existing PDF files and what I want is to highlight some content by overlaying circles or straight lines. I've looked at some NodeJS PDF libraries but couldn't find a solution (some libraries allow creating a PDF from scratch and draw into it; other libraries can modify existing PDFs, but do not support drawing).
A (Linux / OSX) command line solution (e.g. using ImageMagick or some other library) would be perfectly fine, too.
Edit: I've since found out that with ImageMagick/GraphicsMagick I can in fact do something like gm convert -draw "rectangle 20,20 150,100" xxx.pdf[7] xxx2.pdf, but (1) this draws either on all pages or on a single page only, in which case the resulting PDF contains just that page; and (2) the output PDF contains a bitmap image, whereas I would prefer a PDF that keeps its text content.
Edit: I've just found HummusJS, which is a NodeJS library to manipulate PDF files via declarative JSON objects. Unfortunately, apart from the scarce documentation and the unwieldy API (see below), the tests fail consistently and across the board with Unable to create PDF file, make sure that output file target is available.
Completely OT: I'm not sure what makes people think such utterly obfuscated APIs are better than simple ones:
var settings = {modifiedFilePath:'./output/BasicJPGImagesTestPageModified.pdf'}
var pdfWriter = hummus.createWriterToModify('./TestMaterials/BasicJPGImagesTest.PDF',settings);
var pageModifier = new hummus.PDFPageModifier(pdfWriter,0);
pageModifier.startContext().getContext().writeText('Test Text', ...
...
var copyingContext = inPDFWriter.createPDFCopyingContextForModifiedFile();
var thirdPageID = copyingContext.getSourceDocumentParser().getPageObjectID(2);
var thirdPageObject = copyingContext.getSourceDocumentParser().parsePage(2).getDictionary().toJSObject();
var objectsContext = inPDFWriter.getObjectsContext();
objectsContext.startModifiedIndirectObject(thirdPageID);
var modifiedPageObject = inPDFWriter.getObjectsContext().startDictionary();
A couple of helpers with HummusJS, to assist with what you are trying to do:
Adding content to existing pages - https://github.com/galkahana/HummusJS/wiki/Modification#adding-content-to-existing-pages
Draw shapes - https://github.com/galkahana/HummusJS/wiki/Show-primitives
Using both, this is how to add a circle at 'centerX, centerY' with 'radius' and a border width of 1 to the first page of 'myFile.pdf'. The end result, in this case, will be placed in 'modifiedCopy.pdf':
var hummus = require('hummus');

// centerX, centerY and radius are placeholders for your own values
var pdfWriter = hummus.createWriterToModify(
    'myFile.pdf',
    {modifiedFilePath:'modifiedCopy.pdf'});
var pageModifier = new hummus.PDFPageModifier(pdfWriter,0); // 0 = first page
var cxt = pageModifier.startContext().getContext();
cxt.drawCircle(
    centerX,
    centerY,
    radius,
    {
        type:'stroke', // draw the outline only
        width:1,
        color:'black'
    });
pageModifier.endContext().writePage();
pdfWriter.end();
General documentation - https://github.com/galkahana/HummusJS/wiki
If the tests fail, check that an "output" folder exists next to the script being executed, and that there are permissions to write there.
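Since the question also asked about straight lines and about pages other than the first, here is a minimal sketch that wraps the steps above in a helper. It assumes that the drawing context also offers drawPath(x1, y1, x2, y2, ..., options), as described on the Show-primitives wiki page, and that the second argument of PDFPageModifier is a zero-based page index. The helper name highlightOnPage and the shape objects are made up for illustration, and hummus is passed in as a parameter rather than required inside the function.

```javascript
// Sketch only: `hummus` is injected; in real use pass require('hummus').
// `shapes` is a hypothetical list such as
//   [{kind: 'circle', x: 100, y: 200, radius: 30},
//    {kind: 'line', x1: 50, y1: 50, x2: 250, y2: 50}]
function highlightOnPage(hummus, inputPath, outputPath, pageIndex, shapes) {
  // Validate first so that nothing is opened when the input is wrong.
  shapes.forEach(function (shape) {
    if (shape.kind !== 'circle' && shape.kind !== 'line') {
      throw new Error('Unknown shape kind: ' + shape.kind);
    }
  });

  var pdfWriter = hummus.createWriterToModify(inputPath, { modifiedFilePath: outputPath });
  var pageModifier = new hummus.PDFPageModifier(pdfWriter, pageIndex);
  var cxt = pageModifier.startContext().getContext();
  var options = { type: 'stroke', width: 1, color: 'black' };

  shapes.forEach(function (shape) {
    if (shape.kind === 'circle') {
      cxt.drawCircle(shape.x, shape.y, shape.radius, options);
    } else {
      // Assumed API: a flat list of x,y pairs followed by the options object.
      cxt.drawPath(shape.x1, shape.y1, shape.x2, shape.y2, options);
    }
  });

  pageModifier.endContext().writePage();
  pdfWriter.end();
}

// Usage (not executed here):
// highlightOnPage(require('hummus'), 'myFile.pdf', 'modifiedCopy.pdf', 7,
//   [{kind: 'circle', x: 100, y: 200, radius: 30}]);
```

Unlike the gm convert approach from the question, this modifies one page while the other pages stay in the output file, and the drawing is added as vector content on top of the existing page. Keep in mind that PDF coordinates start at the bottom-left corner of the page by default.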
Related
The code below pulls in text from a file. I want to pull all the text from this file, and the text from this source file is formatted in bullet points.
var autoPay = DocumentApp.openById("[file ID]");
autoPay_text = autoPay.getBody().getText();
body.replaceText("{{AddedFeatures}}",autoPay_text);
How do I preserve the bullet points format when the text is placed in the destination file? Is there a method for this, or do I have to do something dreadful like creating an array?
I'm trying to read text with Tesseract from an image captured by a web camera. Here is an example of the image used:
I'm working on a Node.js server and tried a lot of techniques in Jimp, including invert/grayscale, sharpening the image, and filtering specific colors (yellow, blue, ...). In the end I built a separate Docker container using opencv4nodejs and applied a few techniques to extract text from that image.
I mostly need the big text (the small text is not necessary, and it is not sharp in this image anyway). So I applied this:
const src = cv.imread('./970f5b45-9f24-41d5-91f0-ef3f8b9d8914.jpeg');
let src2 = src.cvtColor(cv.COLOR_BGR2GRAY)
let dst = src2.adaptiveThreshold(255, cv.ADAPTIVE_THRESH_GAUSSIAN_C, cv.THRESH_BINARY, 12, 2);
let dst2 = dst.morphologyEx(cv.MORPH_OPEN)
After that I get the result below, which is almost ready for OCR; the problem is that the image contains a lot of dots. Is there any way to remove those dots while keeping the quality of the result (readable text), in OpenCV or with some other technique?
The result right now is:
Is it possible to extract just the text from that result? If I feed this result to Tesseract, it takes a really long time to extract the text, and there is a huge number of weird characters (probably because of the dots/shapes).
I want to add an SVG image to PdfSignatureAppearance. In iText7 the method setSignatureGraphic now takes an ImageData parameter. I couldn't find a way to create an ImageData from an SVG because ImageDataFactory does not support this format.
Can you please guide me on how to do that?
Note that with iText5 I was able to add an SVG by converting it to PDF, importing it into a PdfTemplate, and then creating an image from that template; setSignatureGraphic accepted com.itextpdf.text.Image as its parameter.
Your question could be split into 2 more precise and simple ones:
How to process an SVG with iText?
How to create an ImageData instance out of the result of point 1?
As for question 1: one can use SvgConverter class (part of iTextCore's svg module). Unfortunately there are only PDF-related methods there: an SVG could be converted either to Image (class of layout module), or to PdfFormXObject (again PDF-related) or to a PDF file.
// to PDF
SvgConverter.createPdf(new File(sourceFolder + "your_svg.svg"), new File(destinationFolder + "svgAsPdf.pdf"));
// to Image
Image svg = SvgConverter.convertToImage(new FileInputStream(sourceFolder + "your_svg.svg"), signer.getDocument()); // the mentioned `signer` is the instance of PdfSigner which you use to sign the document
As for question 2, there are several answers:
a) Suppose that you want to use this Image as the PdfSignatureAppearance's graphics data. For now the class doesn't provide a convenient setter. However, you could use some low-level methods, either getLayer0 or getLayer2, to get the signature's background or foreground. They are represented by PdfFormXObject, hence you can use Canvas to add your image to them:
Image svg = SvgConverter.convertToImage(new FileInputStream(sourceFolder + "your_svg.svg"), signer.getDocument());
Canvas canvas = new Canvas(appearance.getLayer0(), signer.getDocument());
canvas.add(svg);
canvas.close();
b) Suppose that your goal is to use the rendered bitmap as the PdfSignatureAppearance's graphics data. Then there is a specific iText product - pdfRender - which converts PDF files to images. The following code could be applied:
PdfToImageRenderer.renderPdf(new File(destinationFolder + "svgAsPdf.pdf"), new File(folderForTheResultantImage));
Now you can create an ImageData instance out of the resultant image file (by default a PDF is converted to a series of images with the format "pdfnamePAGE_NUMBER.jpg", but one could customize either the name or the output image format). In your case the PDF consists of just one page (which represents the converted SVG) and its name is "image1.jpg". The rest is obvious:
appearance.setSignatureGraphic(ImageDataFactory.create(destinationFolder + "image1.jpg"));
I'm a noob PyQt5 user following a tutorial and I'm confused how I might extend the sample code below.
The two handlers, canInsertFromMimeData and insertFromMimeData, are Qt5 methods that accept image MIME data dragged and dropped onto the document (that works great). Both receive a parameter, source, which is a QMimeData object.
However, if I try to paste an image copied from the Windows clipboard into the document, it just crashes, as there is no handler for this.
Searching the Qt5 documentation at https://doc.qt.io/qt-5/qmimedata.html just leads me to further confusion as I'm not a C++ programmer and I'm using Python 3.x and PyQt5 to do this.
How would I write a handler to allow an image copied to the clipboard to be pasted into the document directly?
class TextEdit(QTextEdit):
    def canInsertFromMimeData(self, source):
        if source.hasImage():
            return True
        else:
            return super(TextEdit, self).canInsertFromMimeData(source)

    def insertFromMimeData(self, source):
        cursor = self.textCursor()
        document = self.document()
        if source.hasUrls():
            for u in source.urls():
                file_ext = splitext(str(u.toLocalFile()))
                if u.isLocalFile() and file_ext in IMAGE_EXTENSIONS:
                    image = QImage(u.toLocalFile())
                    document.addResource(QTextDocument.ImageResource, u, image)
                    cursor.insertImage(u.toLocalFile())
                else:
                    # If we hit a non-image or non-local URL break the loop and fall out
                    # to the super call & let Qt handle it
                    break
            else:
                # If all were valid images, finish here.
                return
        elif source.hasImage():
            image = source.imageData()
            uuid = hexuuid()
            document.addResource(QTextDocument.ImageResource, uuid, image)
            cursor.insertImage(uuid)
            return
        super(TextEdit, self).insertFromMimeData(source)
code source: https://www.learnpyqt.com/examples/megasolid-idiom-rich-text-editor/
I was exactly in the same position as you. I am also new to Python, so there might be mistakes.
Passing the plain uuid string in document.addResource(QTextDocument.ImageResource, uuid, image) does not work. The resource name has to be a QUrl, i.e. QUrl(uuid).
Now you can insert the image. However, because the path to an image from the clipboard is changing, it would be better to use a different path, for example to the directory where you are also saving the files.
Also be aware that the user has to select the file type when saving (.html).
For my own project I am going to print the file as PDF. That way you don't have to worry about paths to images ^-^
I got around this by embedding the images inline as base64; then there are no separate resource files, as everything is in one file.
I'm writing tests in JScript in TestComplete. I need to make a screenshot of a web page element, and save it to my desktop as a PNG file.
I tried this code:
var MyPicture = WebPage.SomeLocation.Picture();
MyPicture.SaveToFile("C:\Desktop");
which doesn't seem to be working, and I can't seem to figure out why. My program doesn't crash or anything, it simply doesn't save the picture. What am I doing wrong?
SaveToFile needs a full name of the image to create, including a path. Remember that in JScript you must double the backslashes in paths.
To get the desktop folder path, you can use the SpecialFolders property.
var MyPicture = WebPage.SomeLocation.Picture();
var strImageName = "MyPicture.png";
// Get the Desktop folder path
var strDesktop = Sys.OleObject("WScript.Shell").SpecialFolders("Desktop");
// Build the full path to the image
var strPath = aqFileSystem.IncludeTrailingBackSlash(strDesktop) + strImageName;
MyPicture.SaveToFile(strPath);
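To illustrate the remark about backslashes, here is a small sketch of what happens to a path written with single versus doubled backslashes in a string literal. It is written so that it runs in Node.js, where console.log stands in for TestComplete's Log.Message; the variable names are made up for the example.

```javascript
// A single backslash starts an escape sequence inside a string literal.
// "\D" is not a known escape, so the backslash is silently dropped.
var singleBackslash = "C:\Desktop";

// "\\" is the escape sequence for one literal backslash.
var doubledBackslash = "C:\\Desktop";

// Known escapes are worse: "\t" becomes a tab character.
var withTab = "C:\temp";

console.log(singleBackslash);          // C:Desktop
console.log(doubledBackslash);         // C:\Desktop
console.log(withTab.indexOf("\t"));    // 2
```

This is only part of the problem in the question: even with the backslashes doubled, the argument still has to name a file (for example one ending in .png), not just a folder.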