Writing bytes in a pdf - node.js

I am getting a response from an API in the form of bytes.
let a = "JSHDHHFHFHHFKFLLFLDMDMDMDMMSKKW==";
I want to write this into a PDF file.
The approach I have taken so far is to convert it to binary using the atob library. Then I convert it to a Uint8Array and write it to a file using fs.createWriteStream. When the write completes, the output file is of an unidentified type:
%
1 0 obj
<</Filter/FlateDecode/First 141/N 20/Length 848/Type/ObjStm>>
stream
xUmoâ8þ+óm[U½ø%/ÎiU ÈÂr]º¸ë¢|ð/)`©ý÷7ã´vK·HQ2gÆÏ3À#)ÀcHBd#*P§¤ Î U8GC8q3È_C¦x¦¤øU8Oà3$c¨*Æ/æK?óÝ7÷¸5Á`
íÆ-Ð4?¶Î¬Çï²3Ù=0YØÑ8èm0.ÍÆUî1  ¢ý+3í²©¶Î6"ûº5~º?½»®Fzçôºº¹!ÙO>=¸ÑÜigºù÷¼ÏyDqÇIè?*Ê¥J!âB)ÿ§ ½àøÌÝDÇþ{;ü2¹ê5®¯û¶.'1Í«ÚÊZã#$~4GØÿda0vº®½Íª6H-­àZ&Lo?jõÃáÁIÐ림=¬õªñ+¿T¿oòë³S$±Þ±ð³wzm^ú²ÍNϨJ[Bß6iEði³´eµYíi»þ|#ÜÂþ½©ÐÁx/iãt©ÑÕÌT¯LKNy³µ]½dzHÞÈì¿ì¯Á¢ùï2Ta°<b¬HDNµ¤¹êþø2Äz=WTâ÷hg õr¡æQîI²2xj;÷æÁe[|Ó à±¦#b\:IEÌ,ékvª_]ØÌ´v×,Mû$êCô¯hêgþp»DEäÁ4óàµ#Å¡v$§vDx¤y yR;qè#Q;ByÇíÓ{Z6»£UÛªlsλ
ÜÙ>5»[pÍÎ_§tíO;Û¬u}¢ñm·µYv|mJÓ`)_<×Ç%²½ªZ×<^ôJûÍ\þÒ{£Þð'"u?ÅÅ!\{þÈ~?âEF¡xàBxÏþigX]¿quu&^/ú¶ìŽIüþvZj<§A_½ñ¾ëº5¯ÖÄ.²?ãsÁY_1ñ±Á 1ÚUvó¶
£Ü-Ms1~ÑÛº#Hÿìr$ö¤ÿ²}R
endstream
endobj
22 0 obj
When I try the response in an online editor, it gives me the right result.
The code I have used so far:
let encodedPDF = JSON.parse(d).Resp_Policy_Document.PDF_BYTES;
var bin = atob(encodedPDF);
var binaryLen = bin.length;
var bytes = new Uint8Array(binaryLen);
for (var i = 0; i < binaryLen; i++)
{
var ascii = bin.charCodeAt(i);
bytes[i] = ascii;
}
let writer=fs.createWriteStream('Last.pdf');
writer.write(bin);

The data you get is Base64 encoded, which is a common way for APIs to pass binary data. The giveaways are the equal signs at the end and the alphabet of ASCII uppercase and lowercase letters, digits, +, and /.
So you need to decode it as Base64. Use something like this:
const pdfBinary = Buffer.from(a, 'base64');
The contents of this buffer are, I guess, a PDF document. Write the buffer directly to a file without converting it to a text string first. In your code, writer.write(bin) writes the decoded string, which Node encodes as UTF-8, so every byte above 127 turns into two bytes; the bytes array you built is never used.
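A minimal sketch of the whole round trip, assuming the payload is plain Base64 and that a synchronous write is acceptable. The helper name writeBase64ToFile, its parameters and the demo file name are made up for illustration:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Decode a Base64 string and write the raw bytes to a file.
// `base64Text` and `outPath` are hypothetical parameter names.
function writeBase64ToFile(base64Text, outPath) {
  const bytes = Buffer.from(base64Text, 'base64'); // raw bytes, not a string
  fs.writeFileSync(outPath, bytes);                // written without any text encoding
  return bytes.length;
}

// Demo with a tiny payload: "JVBERi0xLjQK" is Base64 for "%PDF-1.4\n".
const demoPath = path.join(os.tmpdir(), 'demo-base64.pdf');
const written = writeBase64ToFile('JVBERi0xLjQK', demoPath);
console.log(written, JSON.stringify(fs.readFileSync(demoPath).toString('latin1')));
```

With the real response you would pass the PDF_BYTES field instead of the demo string. If you prefer a stream, pass the Buffer to writer.write rather than the string.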

Related

Decode value of base64 string in different language gives different output

I have a base64 string like this
String value = "fefWUeQvPgBe/9QaG/RdPnn9PrzQK2VhVwBzAIr7eei9PQrZA2/sXTA/2SCodnTSJn4Lk+ve5kuPGjco4ljYrjNTsrKBAjN6APSHn0BqBce2lOZbm/z29U6j7j79niPbYl/UIc0VTjc0IgRhmNLn1eVvMTuoaGhlwlxUf/+xenC4NmEM2A6y5/DNRheNw6OrmHik/kowpWGQsRNFyXJ2VtzE54nqs9naePBkRlWna/oqBxzA/txtHXn8h/9xTT2caozcU5/R9JayFZq7ZeclzGs2DAACr1TyQwEb9JJpBXr04Zu4rlWLtnSbyflyK3lnSAocma0L6ENnCZoMiN8gUg=="
I used this method to decode the string in Java:
Base64Utils.decode(value.getBytes())
output:125,-25,-42,81,-28,47,62,0,94,-1,-44,26,27,-12,93,62,121,-3,62,-68,-48,43,101,97,87,0,115,0,-118,-5,121,-24,-67,61,10,-39,3,111,-20,93,48,63,-39,32,-88,118,116,-46,38,126,11,-109,-21,-34,-26,75,-113,26,55,40,-30,88,-40,-82,51,83,-78,-78,-127,2,51,122,0,-12,-121,-97,64,106,5,-57,-74,-108,-26,91,-101,-4,-10,-11,78,-93,-18,62,-3,-98,35,-37,98,95,-44,33,-51,21,78,55,52,34,4,97,-104,-46,-25,-43,-27,111,49,59,-88,104,104,101,-62,92,84,127,-1,-79,122,112,-72,54,97,12,-40,14,-78,-25,-16,-51,70,23,-115,-61,-93,-85,-104,120,-92,-2,74,48,-91,97,-112,-79,19,69,-55,114,118,86,-36,-60,-25,-119,-22,-77,-39,-38,120,-16,100,70,85,-89,107,-6,42,7,28,-64,-2,-36,109,29,121,-4,-121,-1,113,77,61,-100,106,-116,-36,83,-97,-47,-12,-106,-78,21,-102,-69,101,-25,37,-52,107,54,12,0,2,-81,84,-14,67,1,27,-12,-110,105,5,122,-12,-31,-101,-72,-82,85,-117,-74,116,-101,-55,-7,114,43,121,103,72,10,28,-103,-83,11,-24,67,103,9,-102,12,-120,-33,32,82,
Then I used this method to decode the string in Node.js:
Buffer.from(value, 'base64')
output:125,231,214,81,228,47,62,0,94,255,212,26,27,244,93,62,121,253,62,188,208,43,101,97,87,0,115,0,138,251,121,232,189,61,10,217,3,111,236,93,48,63,217,32,168,118,116,210,38,126,11,147,235,222,230,75,143,26,55,40,226,88,216,174,51,83,178,178,129,2,51,122,0,244,135,159,64,106,5,199,182,148,230,91,155,252,246,245,78,163,238,62,253,158,35,219,98,95,212,33,205,21,78,55,52,34,4,97,152,210,231,213,229,111,49,59,168,104,104,101,194,92,84,127,255,177,122,112,184,54,97,12,216,14,178,231,240,205,70,23,141,195,163,171,152,120,164,254,74,48,165,97,144,177,19,69,201,114,118,86,220,196,231,137,234,179,217,218,120,240,100,70,85,167,107,250,42,7,28,192,254,220,109,29,121,252,135,255,113,77,61,156,106,140,220,83,159,209,244,150,178,21,154,187,101,231,37,204,107,54,12,0,2,175,84,242,67,1,27,244,146,105,5,122,244,225,155,184,174,85,139,182,116,155,201,249,114,43,121,103,72,10,28,153,173,11,232,67,103,9,154,12,136,223,32,82
The Java output is what I really want. Why is it different?
How can I get the same decoded values in Node.js?
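Both calls decode to the same bytes; only the interpretation of each byte differs. Java's byte type is signed (-128 to 127), so the values returned by Base64Utils.decode treat the high-order bit as a sign bit. A Node.js Buffer exposes each byte as an unsigned value (0 to 255). To get the Java-style values in Node.js, subtract 256 from every value above 127: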
var value = 'fefWUeQvPgBe/9QaG/RdPnn9PrzQK2VhVwBzAIr\
7eei9PQrZA2/sXTA/2SCodnTSJn4Lk+ve5kuPGj\
co4ljYrjNTsrKBAjN6APSHn0BqBce2lOZbm/z29\
U6j7j79niPbYl/UIc0VTjc0IgRhmNLn1eVvMTuo\
aGhlwlxUf/+xenC4NmEM2A6y5/DNRheNw6OrmHi\
k/kowpWGQsRNFyXJ2VtzE54nqs9naePBkRlWna/\
oqBxzA/txtHXn8h/9xTT2caozcU5/R9JayFZq7Z\
eclzGs2DAACr1TyQwEb9JJpBXr04Zu4rlWLtnSb\
yflyK3lnSAocma0L6ENnCZoMiN8gUg=='
const bufferValue = Buffer.from(value, 'base64');
for (let i = 0; i < bufferValue.length; i++) {
  let y = bufferValue[i];
  // Bytes above 127 have the high bit set; shift them into Java's signed range.
  if (y > 127) {
    y -= 256;
  }
  console.log(y);
}
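As an alternative to the manual loop, the same bytes can be viewed through an Int8Array, which reads each byte as a signed 8-bit integer. This is only a sketch; toSignedBytes is a hypothetical helper name, and the short input encodes just the first four bytes of the value above:

```javascript
// Reinterpret the bytes of a Buffer as signed 8-bit integers (Java-style).
// `toSignedBytes` is a hypothetical helper name.
function toSignedBytes(buf) {
  // Int8Array view over the same memory; no copy is made.
  return new Int8Array(buf.buffer, buf.byteOffset, buf.length);
}

const decoded = Buffer.from('fefWUQ==', 'base64');
console.log(Array.from(decoded));                // [ 125, 231, 214, 81 ]
console.log(Array.from(toSignedBytes(decoded))); // [ 125, -25, -42, 81 ]
```

The byteOffset and length arguments matter because small Buffers may share one underlying ArrayBuffer.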

How to get the right values when converting hex to ASCII characters

I'm trying to convert a hex file to text using JavaScript (Node.js).
function hex_to_ascii(str1){
var hex = str1.toString();
var str = '';
for (var n = 0; n < hex.length; n += 2) {
str += String.fromCharCode(parseInt(hex.substr(n, 2), 16));
}
return str;
}
I have a problem with the extended ASCII characters. For example, when I convert 93 I get “ instead of ô, and when I convert FF I get ÿ instead of a non-breaking space (nbsp).
I want to get the same extended characters as in this table: https://www.rapidtables.com/code/text/ascii-table.html
This problem is slightly more complex than it seems at first, since you need to specify an encoding when converting extended ASCII to a string, for example Windows-1252 or ISO-8859-1. Since you want to match the linked table, I'm assuming you want the CP437 encoding.
To convert a buffer to a string you need a module that will do this for you. Converting from a buffer (in a given encoding) to a string is not trivial unless the buffer is in an encoding that Node.js supports natively, e.g. UTF-8, ASCII (7-bit only!), Latin1 etc.
I would suggest using the iconv-lite package, which handles many encodings. Once it is installed, the code should look as follows (this takes each byte value from 0x00 to 0xFF and prints the decoded character):
const iconv = require('iconv-lite');

// Decode a string of hex digits into text using the given encoding.
function hex_to_ascii(hexData, encoding) {
  const buffer = Buffer.from(hexData, "hex");
  return iconv.decode(buffer, encoding);
}

const testInputs = [...Array(256).keys()];
const encoding = "CP437";
console.log("Decimal\tHex\tCharacter");
for (const input of testInputs) {
  // Pad to two hex digits; Buffer.from("a", "hex") would yield an empty buffer.
  const hex = input.toString(16).padStart(2, "0");
  console.log([input, hex, hex_to_ascii(hex, encoding)].join("\t"));
}
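For comparison, here is a small sketch of the natively supported case mentioned above, where no extra module is needed. hexToText is a hypothetical helper name. It also shows why 0x93 does not come out as ô without CP437: in Latin-1 that byte is the control character U+0093.

```javascript
// Decode hex digits with an encoding that Buffer supports natively.
// `hexToText` is a hypothetical helper name.
function hexToText(hexData, encoding) {
  return Buffer.from(hexData, 'hex').toString(encoding);
}

console.log(hexToText('f4', 'latin1')); // 0xF4 is ô in Latin-1
console.log(hexToText('c3b4', 'utf8')); // the same character as two UTF-8 bytes
// In Latin-1, 0x93 maps to the control character U+0093, not to ô as in CP437.
console.log(hexToText('93', 'latin1').charCodeAt(0).toString(16)); // 93
```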

How to get byte length of a base64 string in node.js?

I'd like to calculate the size of an image file received as base64 encoded string like:
'data:image/png;base64,aBdiVBORw0fKGgoAAA...'
in order to make sure that the file is not larger than certain size, say 5MB.
How can I achieve this in Node.js?
I've seen a similar question here but could not apply the answer in my Node app, as I get:
SyntaxError: Unexpected token :
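You need to remove the data:... prefix and decode the rest as Base64:
const img = 'data:image/png;base64,aBdiVBORw0fKGgoAAA';
// Without the 'base64' argument the buffer would hold the Base64 text itself, not the image bytes.
const buffer = Buffer.from(img.substring(img.indexOf(',') + 1), 'base64');
console.log("Byte length: " + buffer.length);
console.log("MB: " + buffer.length / 1e+6);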
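Actually, there is not much to it.
If you know the size of the Base64 text, all you have to do is divide it by about 1.37. Base64 itself turns every 3 bytes into 4 characters (a factor of about 1.33); the 1.37 figure also allows for the line breaks and headers of MIME-encoded data, so the result is an estimate.
Because the Base64 algorithm is linear, the estimate scales linearly too.
To calculate the byte size of a string that you already have, you can use this solution: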
function byteCount(s) {
return encodeURI(s).split(/%..|./).length - 1;
}
and divide the result by 1.37.
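If an exact figure is needed rather than an estimate, the decoded size follows from the 3-bytes-to-4-characters rule. The sketch below assumes standard padded Base64 with no whitespace or line breaks; both function names are made up for illustration:

```javascript
// Exact decoded size of a Base64 string, computed from its length alone.
// Assumes standard padded Base64 without whitespace or line breaks.
function base64DecodedLength(base64Text) {
  let padding = 0;
  if (base64Text.endsWith('==')) padding = 2;
  else if (base64Text.endsWith('=')) padding = 1;
  return (base64Text.length / 4) * 3 - padding;
}

// Strip a "data:...;base64," prefix if present (hypothetical helper).
function dataUrlByteLength(dataUrl) {
  const base64Text = dataUrl.substring(dataUrl.indexOf(',') + 1);
  return base64DecodedLength(base64Text);
}

const sample = 'data:text/plain;base64,aGVsbG8=';
console.log(dataUrlByteLength(sample)); // 5, the length of "hello"
```

This avoids allocating the decoded data just to measure it, which is handy when the point is to reject oversized uploads.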
var src = "data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7";
var base64str = src.substring(src.indexOf(',') + 1)
var decoded = atob(base64str);
console.log("FileSize: " + decoded.length);

How to convert saved text file encoding to UTF8?

Recently I saved a text file on my computer, but when I opened it again I saw strings like:
"˜ÌÇí ÍÑÝã ÚÌíÈå¿"
Now I want to know whether it is possible to convert it back to the original text (UTF8).
I tried this code but it doesn't work:
string tempStr="˜ÌÇí ÍÑÝã ÚÌíÈå¿";
Encoding ANSI = Encoding.GetEncoding(1256);
byte[] ansiBytes = ANSI.GetBytes(tempStr);
byte[] utf8Bytes = Encoding.Convert(ANSI, Encoding.UTF8, ansiBytes);
String utf8String = Encoding.UTF8.GetString(utf8Bytes);
You can use something like:
string str = Encoding.GetEncoding(1256).GetString(Encoding.GetEncoding("iso-8859-1").GetBytes(tempStr))
The string wasn't really decoded. Its bytes were simply "enlarged" to chars, with something like:
byte[] bytes = ...
char[] chars = new char[bytes.Length];
for (int i = 0; i < bytes.Length; i++)
{
chars[i] = (char)bytes[i]; // explicit cast: C# has no implicit byte-to-char conversion
}
string str = new string(chars);
Now, this transformation is the same one that the ISO-8859-1 codepage performs. So I could either reverse it by hand or use that codepage to do it for me; I chose the latter:
Encoding.GetEncoding("iso-8859-1").GetBytes(tempStr)
This gave me the original byte[].
Then I did some tests, and it seems that the original text wasn't UTF8; it was in codepage 1256, which is an Arabic codepage. So I used:
string str = Encoding.GetEncoding(1256).GetString(...);
The only catch: the ˜ doesn't seem to be part of the original string.
There is another possibility:
string str = Encoding.GetEncoding(1256).GetString(Encoding.GetEncoding(1252).GetBytes(tempStr));
Codepage 1252 is the codepage used in the USA and in much of Europe. If your Windows is configured for English, there is a good chance it uses 1252 as the default codepage. The result is slightly different from the one you get with iso-8859-1.
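The same two-step repair can be sketched in Node.js, the language used elsewhere in this thread. The function names are hypothetical, and the second step assumes a Node.js build with full ICU data so that TextDecoder accepts the 'windows-1256' label:

```javascript
// Step 1: undo the byte-to-char widening. Encoding as Latin-1 (ISO-8859-1)
// maps each char U+0000..U+00FF back to the single byte it came from.
function recoverBytes(garbled) {
  return Buffer.from(garbled, 'latin1');
}

// Step 2: decode the recovered bytes with the codepage the file was really in.
// Assumption: this Node.js build ships full ICU data, so TextDecoder
// accepts the 'windows-1256' label.
function repair(garbled, codepage = 'windows-1256') {
  return new TextDecoder(codepage).decode(recoverBytes(garbled));
}

// '\u00cc\u00c7\u00ed' is the garbled fragment "ÌÇí".
console.log(recoverBytes('\u00cc\u00c7\u00ed')); // <Buffer cc c7 ed>
```

Calling repair on the garbled text should then give the Arabic-script original, provided the guess about codepage 1256 is right.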

Problems when processing bytes in perl

I am working on a script that will encrypt a file using RSA. It reads and encrypts 1 byte at a time. Encrypting and decrypting normal .txt files works well, but when encrypting and then decrypting binary files (e.g. .gif) they come out corrupted.
This is done by the encryptFile and decryptFile sub.
I am using the IO::File module to open files and add binmode(":raw") so that when a byte is read it is interpreted as a byte and not text, so this can't be the problem.
When encrypting the bytes I first use my bytesToBigint() sub to translate the byte into an integer. Then I use my rsa::encrypt() sub to encrypt the integer. Now the encrypted integer will be much larger than 1 byte so I have to represent it in multiple bytes.
I do this in the sub bigintToBytes() which basically splits an integer into multiple bytes and stores them in a string. The string is returned and then written to the file.
For example: if bigintToBytes(16739) is called then the string 'Ac' is returned because
Starting with 16739 (dec)
→ 0100000101100011 (binary)
→ 01000001|01100011
→ 65|99 (dec)
and chr(65) = 'A' and chr(99) = 'c' → 'Ac'.
This sub may be the cause, but why? Is it because I am storing bytes in a string? Calling read($buf, $bufsize) on an open file through IO::File also stores bytes in a string, and that works.
I would be really thankful if you could point out what I am missing.
Here is the bigintToBytes sub
sub bigintToBytes {
use bignum;
use bytes;
my $bigint = shift;
my $bits_in_int = length_in_bytes($bigint)*8;
my $bytes = '';
my $new_byte = 0;
my $count = 0;
while ($bits_in_int > 0){
# add first bit in bigint to the new byte
$new_byte = ($new_byte << 1)|($bigint >> ($bits_in_int-1));
# remove the first bit in bigint
$bigint = $bigint & (2**($bits_in_int)-1);
$bits_in_int--;
$count++;
if ($count == 8){
$bytes = $bytes.chr($new_byte);
$new_byte = 0;
$count = 0;
}
}
return $bytes;
}
I have also added use bytes; to the encryptFile and decryptFile subs, in case it matters.
I don't feel it is necessary to post the bytesToBigint, encryptFile and decryptFile subs, because bytesToBigint basically does the reverse, and encryptFile and decryptFile just read the bytes and process them using these functions.
Edit:
Here is the relevant snippet from the encryptFile sub. This is where bytesToBigint, rsa::encrypt and bigintToBytes are called and the result is written to the file.
my $bufsize = 1;
my $buf = '';
while ( $file->read($buf, $bufsize)){
my $msg = myMath::bytesToBigint($buf);
my $enc_msg = rsa::encrypt($n, $enc_key, $msg);
my $enc_msg_as_chars = myMath::bigintToBytes($enc_msg);
# in case the encrypted unit is too small
while (length($enc_msg_as_chars) < $min_bytes){
$enc_msg_as_chars = chr(0).$enc_msg_as_chars;
}
$newFile->print($enc_msg_as_chars);
}
The problem in bigintToBytes is that it returns an empty string when the input integer is zero. So I added:
if($x == 0){
return chr(0);
}
and the problem is solved!
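To illustrate the zero case outside Perl, here is a rough Node.js sketch of the same big-endian splitting using BigInt. It is not a port of the sub above, and the function name is reused only for readability. The do/while loop guarantees at least one byte, so zero becomes a single 0x00 byte instead of an empty result:

```javascript
// Split a non-negative BigInt into big-endian bytes.
// Zero yields a single 0x00 byte rather than an empty result.
function bigintToBytes(value) {
  if (value < 0n) throw new RangeError('value must be non-negative');
  const bytes = [];
  do {
    bytes.unshift(Number(value & 0xffn)); // take the lowest 8 bits
    value >>= 8n;                         // drop them
  } while (value > 0n);
  return Buffer.from(bytes);
}

console.log(bigintToBytes(16739n).toString('latin1')); // Ac
console.log(bigintToBytes(0n));                        // <Buffer 00>
```

The first call reproduces the 16739 example from the question: 65 and 99, i.e. 'A' and 'c'.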

Resources