How can I produce a Base64 encoded PDF that is read by iText's PdfReader

I am reading an existing PDF (INPUT) using iText:
PdfReader reader = new PdfReader(INPUT);
I am using this reader instance to manipulate the PDF, but the end result needs to be Base64 encoded (I need to insert it into a db2 database as a text BLOB). How can I make sure that iText's output is Base64 encoded?

Your question is unclear, or at least ambiguous.
Asking "How can I convert PdfReader pdfReader in Base64?" doesn't make any sense, which is why your question is unclear. Your problem is probably either that your input is encoded using Base64, or that you want your output to be encoded using Base64. That is what makes your question ambiguous.
If INPUT is a String that represents a PDF file encoded using Base64, then you can decode it like this:
import com.itextpdf.text.pdf.codec.Base64;
...
PdfReader reader = new PdfReader(Base64.decode(INPUT));
If INPUT is (the path to) a PDF file that you want to manipulate, and the result should be a PDF encoded as a Base64 String, then you can do it like this:
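import java.io.ByteArrayOutputStream;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;
import com.itextpdf.text.pdf.codec.Base64;
...
PdfReader reader = new PdfReader(INPUT);
// Collect the manipulated PDF in memory instead of writing it to a file
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PdfStamper stamper = new PdfStamper(reader, baos);
// do stuff with stamper
stamper.close();
// encodeBytes turns the finished PDF bytes into Base64 text
String base64 = Base64.encodeBytes(baos.toByteArray());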
There may be other ways to produce the Base64 output, e.g. using some kind of Base64OutputStream, but I preferred to use the Base64 class that is shipped with iText.
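The "Base64OutputStream" idea mentioned above can be sketched with the JDK's own java.util.Base64 (Java 8 and later), whose encoder can wrap any OutputStream. This is only a sketch: the PdfWriterCallback interface and the class name are hypothetical, and the placeholder bytes stand in for whatever iText would write.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64StreamSketch {

    /** Hypothetical callback standing in for "whatever writes the PDF bytes". */
    interface PdfWriterCallback {
        void writeTo(OutputStream out) throws IOException;
    }

    /** Returns the Base64 text of everything the callback writes. */
    static String toBase64(PdfWriterCallback producer) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // wrap() encodes bytes as they are written; closing the wrapper
        // flushes the final (padded) group of characters.
        try (OutputStream encoding = Base64.getEncoder().wrap(sink)) {
            producer.writeTo(encoding);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Base64 output consists of ASCII characters only.
        return new String(sink.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Placeholder bytes, not a real PDF.
        byte[] fakePdf = "%PDF-1.4 placeholder".getBytes(StandardCharsets.US_ASCII);
        String base64 = toBase64(out -> out.write(fakePdf));
        System.out.println(base64);
    }
}
```

With iText, the callback would presumably create the PdfStamper on the wrapped stream and close it, so that the PDF is encoded while it is being written rather than buffered first.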
If you don't need to manipulate the PDF, you don't even need iText. You can simply use the answer to this question: Out of memory when encoding file to base64
UPDATE:
in a comment to this answer, you wrote:
I have this:
byte[] bdata = blob.getBytes(1, (int) blob.length());
InputStream inputStream = blob.getBinaryStream();
String recuperataDaDb = convertStreamToString(inputStream);
byte[] decompilata = Base64.decode(recuperataDaDb);
I want to write this "decompilata" to a PDF file with itext.jar
You have two options:
[1.] You don't need iText. You can simply write the byte[] to a file as described here: byte[] to file in Java
FileOutputStream stream = new FileOutputStream(path);
try {
stream.write(bytes);
} finally {
stream.close();
}
[2.] You use iText after all, following the approach described earlier in this answer:
PdfReader reader = new PdfReader(decompilata);
FileOutputStream fos = new FileOutputStream(pathToFile);
PdfStamper stamper = new PdfStamper(reader, fos);
stamper.close();
I am very surprised by your eagerness to use iText. You really don't need iText to meet your requirement. All you need is an education on how to write Java code to perform some simple I/O.
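Option [1.] can be sketched with the JDK alone. The example below assumes the Base64 text has already been read from the database into a String; the class and method names are hypothetical. It uses the MIME decoder because text stored in a database column may contain line breaks, which is an assumption about the data rather than something stated in the question.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

final class Base64ToFileSketch {

    /** Decodes Base64 text and writes the resulting bytes to target. */
    static void writeDecoded(String base64Text, Path target) {
        // The MIME decoder skips line breaks and other characters
        // that are not part of the Base64 alphabet.
        byte[] pdfBytes = Base64.getMimeDecoder().decode(base64Text);
        try {
            Files.write(target, pdfBytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder content, not a real PDF.
        String stored = Base64.getEncoder()
                .encodeToString("%PDF-1.4".getBytes(StandardCharsets.US_ASCII));
        Path out = Files.createTempFile("decoded", ".pdf");
        writeDecoded(stored, out);
        System.out.println("Wrote " + Files.size(out) + " bytes to " + out);
    }
}
```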

Is it possible to load ansi encoded string using nodejs

I have a large number of HTML files (around 2k).
These HTML files are the result of a conversion from Word documents.
The files have some Hebrew text inside HTML tags. I can see the text perfectly using the VS Code or Notepad++ editors.
My goal is to loop through the folder and insert the contents of the files into a DB.
Since I have a little knowledge of Node.js, I decided to build the loop using Node.
Here is what I have so far:
fs.readdir('./myFolder', function (err, files) {
total = files.length;
let fileArr = []
for(var x=0, l = files.length; x<l; x++) {
const content = fs.readFileSync(`./myFolder/${files[x]}`, 'utf8');
let title = content.match(/<title>(.*?)<\/title>/g).pop()
fileArr.push({id:files[x] , title})
}
});
The problem is: although the text is displayed correctly in the editors, when debugging I can see that the "title" variable gets strings that consist of question marks.
I guess the problem is with the file encoding, am I right?
If so, is there a way to decode the string?
P.S. My OS is Windows 10.
Thanks
There are a couple of possibilities here. Your input files may be in a multibyte encoding (such as UTF-8 or UTF-16), in which case your debugger is simply not showing the correct characters due to font restrictions.
I would try writing the title variable to some test file like so:
fs.writeFileSync(`title-test-${x}.txt`, title, "utf8");
And see if the title looks correct in your text editor.
It is also possible that the files are encoded in an extended ASCII encoding such as Windows-1255 or ISO 8859-8. If this is the case, fs.readFileSync will not decode them correctly, since it does not support these encodings (see the Node.js encoding list).
If the files are encoded using a single-byte extended ASCII encoding, it should be possible to convert them to a more portable encoding (such as UTF-8).
I'd recommend the iconv-lite module for this purpose; you can do a lot with it!
For example, to convert a Windows-1255 file to UTF-8 you could try:
const iconv = require("iconv-lite");
const fs = require("fs");
// Convert from an encoded buffer to JavaScript string.
const fileData = iconv.decode(fs.readFileSync("./hebrew-win1255.txt"), "win1255");
// Convert from JavaScript string to a buffer.
const outputBuffer = iconv.encode(fileData, "utf8");
// Write output file..
fs.writeFileSync("./hebrew-utf8-output.txt", outputBuffer);
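The same decode-then-re-encode idea, shown here in Java as a language-neutral illustration of the two steps. This is a sketch, not part of the Node.js solution: the class and method names are hypothetical, and ISO-8859-1 is used only because every Java runtime is required to provide it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class TranscodeSketch {

    /** Re-encodes bytes from a known legacy charset to UTF-8. */
    static byte[] toUtf8(byte[] legacyBytes, Charset sourceCharset) {
        // Step 1: bytes -> String, using the charset the file was written in.
        String text = new String(legacyBytes, sourceCharset);
        // Step 2: String -> bytes in the target encoding.
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xE9 is "é" in ISO-8859-1; in UTF-8 it becomes the two bytes C3 A9.
        byte[] latin1 = {(byte) 0xE9};
        byte[] utf8 = toUtf8(latin1, StandardCharsets.ISO_8859_1);
        System.out.println(utf8.length); // 2
    }
}
```

For Hebrew files one would pass Charset.forName("windows-1255") instead, assuming the runtime ships that charset; it is not among the charsets every Java platform must support.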

Groovy encodeBase64() returning unexpected result for PNG image file

I am trying to convert a PNG image file to Base64 encoding in Groovy.
Here is my code:
ImageFile = new File("D:/DATA/CustomScript/Logo.png").text;
String encoded = ImageFile.getBytes().encodeBase64().toString();
I get the following as result:
iVBORw0KGgoAAAANSUhEUgAAAIQAAABPCAIAAAClCfqHAAAABGdBTUEAALE/C/xhBQAAAAlwSFlzAAAOwwAADsMBx2+oZAAAAQ1JREFUeF7t1KGRgwAURdFVyHQbSwOkKlrIoECDSwusoYgDcz97396Z/3eGUQxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgzIE2IcxzHP87qu176tJ8T4/X7Lsuz7fu3b6k1BigEpBqQYP2JAigEpBqQYP2JAigEpBqQYP2JAigEpBqQYP2JAigEpBqQYP2JAigEpBqQYP2JAigEpBqQYP2JAnhNj27ZxHN/v9/f7vU5385wYn8/n9XoNwzBN03W6l/P8BwSpsfw4c1/6AAAAAElFTkSuQmCC
The same image when passed through https://www.base64encode.org/ gives this result:
iVBORw0KGgoAAAANSUhEUgAAAIQAAABPCAIAAAClCfqHAAAABGdBTUEAALGPC/xhBQAAAAlwSFlzAAAOwwAADsMBx2+oZAAAAQ1JREFUeF7t1KGRgwAURdFVyHQbSwOkKlrIoECDSwusoYgDc497396Z/3eGUQxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgxIMSDFgBQDUgzIE2IcxzHP87qu176tJ8T4/X7Lsuz7fu3b6k1BigEpBqQYkGJAigEpBqQYkGJAigEpBqQYkGJAigEpBqQYkGJAigEpBqQYkGJAigEpBqQYkGJAigEpBqQYkGJAnhNj27ZxHN/v9/f7vU5385wYn8/n9XoNwzBN03W6l/P8BwSpsfw4c1/6AAAAAElFTkSuQmCC
I have tried to highlight some of the differences. It is clear that both encoded strings are different.
Problem is that I have to pass this image's Base64 encoding to another system and it is accepting the one from https://www.base64encode.org/ but rejecting the one generated by Groovy.
Any ideas what I am doing wrong here?
You are hitting an encoding problem here. Binary data is not character data; character data is affected by encodings. Instead of the text of the file, use its bytes. E.g.
def f = "/tmp/screenshot-000.png" as File
assert f.bytes.encodeBase64().toString()==("/tmp/encoded_20190208131326.txt" as File).text
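Why the text detour corrupts the data can be shown with a few bytes. The Java sketch below pushes the first four bytes of a PNG file through a String and back. Which charset Groovy's .text used on the asker's machine is not known, so UTF-8 here is an assumption chosen only to demonstrate the effect; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

final class BinaryAsTextSketch {

    /** Simulates reading binary data as text and converting it back to bytes. */
    static byte[] throughText(byte[] raw) {
        // Byte sequences that are invalid in the charset are replaced with U+FFFD.
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    /** Encodes the raw bytes directly, with no text step in between. */
    static String encodeRaw(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        // First four bytes of every PNG file; 0x89 is not valid UTF-8 on its own.
        byte[] pngMagic = {(byte) 0x89, 'P', 'N', 'G'};
        System.out.println(Arrays.equals(pngMagic, throughText(pngMagic))); // false
        System.out.println(encodeRaw(pngMagic)); // iVBORw==
    }
}
```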
Answer from user cfrick was extremely helpful. Unfortunately, it didn't solve my problem. I believe the reason was that I was on an older version of Groovy.
This code eventually solved my problem:
String base64Image = "";
File file = new File(imagePath);
FileInputStream imageInFile = new FileInputStream(file);
byte[] imageData = new byte[file.size()];
imageInFile.read(imageData);
base64Image = Base64.getEncoder().encodeToString(imageData);

Decoding Base 64 In Groovy Returns Garbled Characters

I'm using an API which returns a Base64 encoded file that I want to parse and harvest data from. I'm having trouble decoding the Base64, as it comes back with garbled characters. The code I have is below.
Base64 decoder = new Base64()
def jsonSlurper = new JsonSlurper()
def json = jsonSlurper.parseText(Requests.getInventory(app).toString())
String stockB64 = json.getAt("stock")
byte[] decoded = decoder.decode(stockB64)
println(new String(decoded, "US-ASCII"))
I've also tried println(new String(decoded, "UTF-8")) and this returns the same garbled output. I've pasted in an example snippet of the output for reference.
� ���v���
��W`�C�:�f��y�z��A��%J,S���}qF88D q )��'�C�c�X��������+n!��`nn���.��:�g����[��)��f^���c�VK��X�W_����������?4��L���D�������i�9|�X��������\���L�V���gY-K�^����
��b�����~s��;����g���\�ie�Ki}_������
What am I doing wrong here?
You don't need that Base64 class, wherever you took it from. You can simply call stockB64.decodeBase64() to get the decoded byte array. Are you sure that what you have there is actually encoded text? Usually, Base64-encoded content is binary data such as an image. If it were text, it could simply have been put into the JSON as a string. Try saving the resulting byte array to a file and then determining the file type from its content.
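Determining the file type from its content can start with a look at the first few bytes. The Java sketch below checks a handful of well-known signatures. The list is deliberately short, the class and method names are hypothetical, and what the API in the question actually returns is unknown.

```java
import java.util.Base64;

final class MagicBytesSketch {

    /** Guesses a format from well-known leading bytes; returns "unknown" otherwise. */
    static String sniff(byte[] data) {
        if (startsWith(data, 0x89, 'P', 'N', 'G')) return "png";
        if (startsWith(data, '%', 'P', 'D', 'F')) return "pdf";
        if (startsWith(data, 'P', 'K', 0x03, 0x04)) return "zip";
        if (startsWith(data, 0x1F, 0x8B)) return "gzip";
        return "unknown";
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            // Mask to compare as unsigned values; Java bytes are signed.
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Base64 of the eight-byte PNG signature.
        byte[] decoded = Base64.getDecoder().decode("iVBORw0KGgo=");
        System.out.println(sniff(decoded)); // png
    }
}
```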

Converting compressed data in array of bytes to string

Suppose I have an Array[Byte] called cmp. val cmp = Array[Byte](120, -100).
Now, new String(cmp) gives x�, and (new String(cmp)).getBytes gives Array(120, -17, -65, -67) which isn't equal to the original Array[Byte](120, -100). This byte of -100 was part of an Array[Byte] obtained by compressing some string using Zlib.
Note: These operations were done in Scala's repl.
When you've got arbitrary binary data, never ever try to convert it to a string as if it's actually text data which has been encoded into binary data using a normal encoding such as UTF-8. (Even when you do have text data, always specify the encoding when calling the String constructor or getBytes().) Otherwise it's like trying to load an mp3 into an image editor and complaining when it doesn't look like a proper picture...
Basically, you should probably use base64 for this. There are plenty of base64 encoders around; I like this public domain one as it has a pretty sensible interface. Alternatively, you could use hex - that will be more readable if you want to be able to easily understand the original binary content from the text representation manually, but will take more space (2 characters for each original 1 byte, vs base64's 4 characters for each original 3 bytes).
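A minimal Java sketch of that advice, using the JDK's java.util.Base64 rather than the third-party encoder mentioned above: the compressed bytes never pass through a String constructor, and Base64 provides the text form. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

final class CompressedTextSketch {

    /** Zlib-compresses the text and returns the result as a Base64 string. */
    static String pack(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Closing the deflater stream finishes the compressed data.
        try (OutputStream zlib = new DeflaterOutputStream(sink)) {
            zlib.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(sink.toByteArray());
    }

    /** Reverses pack(): Base64-decodes, inflates, and decodes UTF-8. */
    static String unpack(String base64) {
        byte[] compressed = Base64.getDecoder().decode(base64);
        ByteArrayOutputStream text = new ByteArrayOutputStream();
        try (InputStream zlib = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = zlib.read(buffer)) != -1) {
                text.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(text.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String packed = pack("hello hello hello");
        System.out.println(packed);         // safe to store or transmit as text
        System.out.println(unpack(packed)); // hello hello hello
    }
}
```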
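This is more Java than Scala, but java.io.ByteArrayInputStream, java.util.zip.InflaterInputStream and java.io.DataInputStream can be used.
import java.io._
import java.util.zip.{DeflaterOutputStream, InflaterInputStream}
val bis = new ByteArrayInputStream(cmp)
val zis = new InflaterInputStream(bis)   // decompresses the zlib data
val dis = new DataInputStream(zis)
val str = dis.readUTF()                  // expects the length-prefixed format written by writeUTF
To go backwards,
val bos = new ByteArrayOutputStream()
val zos = new DeflaterOutputStream(bos)  // compressing needs a Deflater, not an Inflater
val dos = new DataOutputStream(zos)
dos.writeUTF(str)
dos.close()                              // finishes the zlib stream before the bytes are read
val cmp = bos.toByteArray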

Convert Binary data from file to readable string

I have binary data stored in a file. I am doing this:
byte[] fileBytes = File.ReadAllBytes(@"c:\carlist.dat");
string ascii = Encoding.ASCII.GetString(fileBytes);
This gives me the following result, with a lot of invalid characters. What am I doing wrong?
?D{F ?x#??4????? NBR-OF-CARSNUMBER-OF-CARS!"#??? NBR-OF-CARS$%??1y0#123?G??#$ NBR-OF-CARS%45??1y#  NUMBER-OF-CARSd?
Hmm... it looks like the file was saved from a byte buffer in which some numeric data was written after NBR-OF-CARS. If you have access to the code that saves the file, check whether it writes numbers there and, if so, whether it converts them to strings before writing them into the binary stream.
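When the writing code is not available, a hex dump makes it easier to see where the readable labels end and the raw numeric bytes begin. The sketch below is in Java and uses made-up sample bytes, since the real layout of carlist.dat is unknown; the class and method names are hypothetical.

```java
final class HexDumpSketch {

    /** Formats bytes as "offset  hex  |printable|", 16 bytes per line. */
    static String dump(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (int offset = 0; offset < data.length; offset += 16) {
            StringBuilder hex = new StringBuilder();
            StringBuilder text = new StringBuilder();
            int end = Math.min(offset + 16, data.length);
            for (int i = offset; i < end; i++) {
                int b = data[i] & 0xFF; // treat the byte as unsigned
                hex.append(String.format("%02x ", b));
                // Show printable ASCII as-is and everything else as '.'.
                text.append(b >= 0x20 && b < 0x7F ? (char) b : '.');
            }
            out.append(String.format("%08x  %-48s |%s|\n", offset, hex, text));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up sample: a label followed by a few raw numeric bytes.
        byte[] sample = {'N', 'B', 'R', '-', 'O', 'F', '-', 'C', 'A', 'R', 'S', 0, 0, 0, 42};
        System.out.print(dump(sample));
    }
}
```

In the dump, the label shows up as readable characters while the trailing number appears as dots, which is the same effect that produces the "invalid characters" when the whole file is decoded as ASCII.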
