How to get byte length of a base64 string in node.js? - node.js

I'd like to calculate the size of an image file received as base64 encoded string like:
'...'
in order to make sure that the file is not larger than certain size, say 5MB.
How can I achieve this in Node.js?
I've seen a similar question, but I could not apply its answer in my Node app because I get:
SyntaxError: Unexpected token :

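You need to remove the data... part and decode the rest as Base64. Without the 'base64' argument, Buffer.from treats the text as UTF-8 and reports the length of the encoded string, not of the file:
const img = '';
const buffer = Buffer.from(img.substring(img.indexOf(',') + 1), 'base64');
console.log("Byte length: " + buffer.length);
console.log("MB: " + buffer.length / 1e+6);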

Actually, there is not much to it.
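If you know the length of the Base64 string, dividing it by 1.37 gives an estimate of the decoded size. (Plain Base64 expands data by a factor of 4/3, about 1.33; the 1.37 figure also allows for MIME line breaks.)
Because the Base64 algorithm is linear, the result is linear too.
To calculate the byte size of the string that you already have, you can use this solution: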
function byteCount(s) {
return encodeURI(s).split(/%..|./).length - 1;
}
and divide the result by 1.37.
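Since the 1.37 ratio is only an estimate, the decoded size can also be computed exactly from the payload length and its padding. The following is a minimal sketch, assuming a well-formed, padded Base64 string with an optional data URL prefix; the names `base64ByteLength` and `MAX_BYTES` are illustrative and do not come from any library.

```javascript
// Hypothetical helper: exact decoded size of a padded Base64 payload.
function base64ByteLength(dataUrlOrBase64) {
  // Drop an optional "data:...;base64," prefix (indexOf returns -1 if there is none).
  const payload = dataUrlOrBase64.substring(dataUrlOrBase64.indexOf(',') + 1);
  // Every 4 characters encode 3 bytes; each trailing '=' means one byte less.
  const padding = payload.endsWith('==') ? 2 : payload.endsWith('=') ? 1 : 0;
  return (payload.length / 4) * 3 - padding;
}

const MAX_BYTES = 5 * 1024 * 1024; // assumed limit: 5 MiB
const sample = 'data:text/plain;base64,' + Buffer.from('hello').toString('base64');

console.log(base64ByteLength(sample)); // 5
console.log(base64ByteLength(sample) <= MAX_BYTES); // true
```

This avoids allocating a buffer for a payload that may turn out to be too large, at the cost of trusting that the input really is valid Base64.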

var src = "";
var base64str = src.substring(src.indexOf(',') + 1)
var decoded = atob(base64str);
console.log("FileSize: " + decoded.length);

Related

Writing bytes in a pdf

I am getting a response for an API in form of bytes.
let a = "JSHDHHFHFHHFKFLLFLDMDMDMDMMSKKW==";
I want to write this into a pdf file.
The approach I have taken so far is to convert it to binary using the atob library. Then I convert it to a Uint8Array and write it to a file using fs.createWriteStream. When the file write completes, the output is of an unidentified type.
%
1 0 obj
<</Filter/FlateDecode/First 141/N 20/Length 848/Type/ObjStm>>
stream
xUmoâ8þ+óm[U½ø%/ÎiU ÈÂr]º¸ë¢|ð/)`©ý÷7ã´vK·HQ2gÆÏ3À#)ÀcHBd#*P§¤ Î U8GC8q3È_C¦x¦¤øU8Oà3$c¨*Æ/æK?óÝ7÷¸5Á`
íÆ-Ð4?¶Î¬Çï²3Ù=0YØÑ8èm0.ÍÆUî1  ¢ý+3í²©¶Î6"ûº5~º?½»®Fzçôºº¹!ÙO>=¸ÑÜigºù÷¼ÏyDqÇIè?*Ê¥J!âB)ÿ§ ½àøÌÝDÇþ{;ü2¹ê5®¯û¶.'1Í«ÚÊZã#$~4GØÿda0vº®½Íª6H-­àZ&Lo?jõÃáÁIÐ림=¬õªñ+¿T¿oòë³S$±Þ±ð³wzm^ú²ÍNϨJ[Bß6iEði³´eµYíi»þ|#ÜÂþ½©ÐÁx/iãt©ÑÕÌT¯LKNy³µ]½dzHÞÈì¿ì¯Á¢ùï2Ta°<b¬HDNµ¤¹êþø2Äz=WTâ÷hg õr¡æQîI²2xj;÷æÁe[|Ó à±¦#b\:IEÌ,ékvª_]ØÌ´v×,Mû$êCô¯hêgþp»DEäÁ4óàµ#Å¡v$§vDx¤y yR;qè#Q;ByÇíÓ{Z6»£UÛªlsλ
ÜÙ>5»[pÍÎ_§tíO;Û¬u}¢ñm·µYv|mJÓ`)_<×Ç%²½ªZ×<^ôJûÍ\þÒ{£Þð'"u?ÅÅ!\{þÈ~?âEF¡xàBxÏþigX]¿quu&^/ú¶ìŽIüþvZj<§A_½ñ¾ëº5¯ÖÄ.²?ãsÁY_1ñ±Á 1ÚUvó¶
£Ü-Ms1~ÑÛº#Hÿìr$ö¤ÿ²}R
endstream
endobj
22 0 obj
When I try the response in an online editor, it gives me the right response.
The code I have used so far:
let encodedPDF = JSON.parse(d).Resp_Policy_Document.PDF_BYTES;
var bin = atob(encodedPDF);
var binaryLen = bin.length;
var bytes = new Uint8Array(binaryLen);
for (var i = 0; i < binaryLen; i++)
{
var ascii = bin.charCodeAt(i);
bytes[i] = ascii;
}
let writer=fs.createWriteStream('Last.pdf');
writer.write(bin);
The data you get is Base64 encoded. That's a pretty common way for APIs to pass information. The giveaways? The equal signs at the end, and the use of ASCII uppercase and lowercase letters, numbers, +, and /.
So you need to decode it as Base64. Use something like this:
const pdfBinary = Buffer.from(a, 'base64');
The contents of this buffer, I guess, are a pdf document. You should write it directly to a file without trying to convert it to a text string.
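As a rough sketch of that last step, the decoded buffer can be handed to `fs.writeFileSync` unchanged. The payload below is a placeholder (the Base64 of the text "%PDF-1.4"), not the real API response, and the output path in the temp directory is an assumption made for the example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Placeholder payload standing in for the PDF_BYTES field of the API response.
const encodedPdf = Buffer.from('%PDF-1.4\n', 'latin1').toString('base64');

// Decode straight to bytes: no atob and no intermediate string.
const pdfBinary = Buffer.from(encodedPdf, 'base64');

// Hypothetical output location.
const outPath = path.join(os.tmpdir(), 'Last.pdf');
fs.writeFileSync(outPath, pdfBinary);

console.log(`Wrote ${pdfBinary.length} bytes to ${outPath}`);
```

Writing the buffer itself matters: passing the atob string to a write stream re-encodes it as UTF-8, which turns every byte above 0x7F into two bytes and corrupts the file.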

How to get the right values when I convert hex to ascii character

I'm trying to convert a hex file to text using JavaScript in Node.js.
function hex_to_ascii(str1){
var hex = str1.toString();
var str = '';
for (var n = 0; n < hex.length; n += 2) {
str += String.fromCharCode(parseInt(hex.substr(n, 2), 16));
}
return str;
}
I have a problem concerning the extended ASCII characters: for example, when I convert 93 I get “ instead of ô, and when I convert FF I get ÿ instead of a non-breaking space (nbsp).
I want to get the same extended characters as this table: https://www.rapidtables.com/code/text/ascii-table.html
This problem is slightly more complex than it seems at first, since you need to specify an encoding when converting extended ASCII bytes to a string, for example Windows-1252 or ISO-8859-1. Since you wish to match the linked table, I'm assuming you want the CP437 encoding.
Converting a buffer to a string is only trivial when the buffer is in an encoding that Node.js supports natively, e.g. UTF-8, ASCII (7-bit only!) or Latin1. For anything else you need a module that does the conversion for you.
I would suggest the iconv-lite package, which converts between many encodings. Once it is installed, the code should look as follows (this takes each byte value from 0x00 to 0xFF and prints the decoded character):
const iconv = require('iconv-lite');
function hex_to_ascii(hexData, encoding) {
const buffer = Buffer.from(hexData, "hex");
return iconv.decode(buffer, encoding);
}
const testInputs = [...Array(256).keys()];
const encoding = "CP437";
console.log("Decimal\tHex\tCharacter")
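for (const input of testInputs) {
// Pad to two hex digits; Buffer.from(..., "hex") drops a lone trailing digit.
const hex = input.toString(16).padStart(2, "0");
console.log([input, hex, hex_to_ascii(hex, encoding)].join("\t"));
}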
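For comparison, here is a small sketch of what a natively supported encoding does without any extra package. Node's built-in `latin1` maps each byte to the code point with the same value, so 0xFF becomes ÿ as the question observed; reproducing the CP437 table still needs a library such as iconv-lite. The sample bytes are arbitrary.

```javascript
// Three sample bytes: 0x41, 0xE9, 0xFF.
const bytes = Buffer.from('41e9ff', 'hex');

// latin1 decoding: byte value N becomes code point U+00NN.
const text = bytes.toString('latin1');

console.log(text); // Aéÿ
console.log(text.codePointAt(2).toString(16)); // ff
```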

How to shorten UUID V4 without making it non-unique/guessable

I have to generate a unique URL part which will be "unguessable" and "resistant" to brute-force attacks. It also has to be as short as possible :) and all generated values have to be of the same length. I was thinking about using UUID V4, which can be represented by 32 hex chars (without hyphens), e.g. de305d5475b4431badb2eb6b9e546014, but it's a bit too long. So my question is how to generate something unique that can be represented with URL characters, with the same length for each generated value, and shorter than 32 chars. (In node.js or plpgsql)
v4() will generate a large number which is translated into a hexadecimal string. In Node.js you can use Buffer to convert the string into a smaller base64 encoding:
import { v4 } from 'uuid';
function getRandomName() {
let hexString = v4();
console.log("hex: ", hexString);
// remove decoration
hexString = hexString.replace(/-/g, "");
let base64String = Buffer.from(hexString, 'hex').toString('base64')
console.log("base64:", base64String);
return base64String;
}
Which produces:
hex: 6fa1ca99-a92b-4d2a-aac2-7c7977119ebc
base64: b6HKmakr
hex: bd23c8fd-0f62-49f4-9e51-8b5c97601a16
base64: vSPI/Q9i
UUID v4 itself does not actually guarantee uniqueness. It's just very, very unlikely that two randomly generated UUIDs will clash. That's why they need to be so long - that reduces the clashing chance.
So you can make it shorter, but the shorter you make it, the more likely that it won't actually be unique. UUID v4 is 128 bit long because that is commonly considered "unique enough".
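To put rough numbers on that trade-off, the standard birthday-bound approximation p ≈ 1 - exp(-n² / 2^(bits+1)) can be evaluated directly. This is only a sketch: it assumes uniformly random identifiers, the function name `collisionProbability` is made up for the example, and 122 is the number of random bits in a version 4 UUID (the other 6 bits are fixed by the version and variant fields).

```javascript
// Approximate probability that at least two of n random IDs collide,
// when each ID carries `bits` bits of randomness (birthday bound).
function collisionProbability(n, bits) {
  const exponent = -(n * n) / Math.pow(2, bits + 1);
  // expm1 keeps precision when the probability is extremely small.
  return -Math.expm1(exponent);
}

// One billion UUID v4 values (122 random bits): vanishingly small.
console.log(collisionProbability(1e9, 122));

// One hundred thousand 32-bit IDs: a collision is more likely than not.
console.log(collisionProbability(1e5, 32));
```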
The short-uuid module does just that.
"Generate and translate standard UUIDs into shorter - or just different - formats and back."
It accepts custom character sets (and offers a few) to translate the UUID to and fro.
You can also base64 the uuid which shortens it a bit to 22. Here's a playground.
It all depends on how guessable/unique it has to be.
My suggestion would be to generate 128 random bits and then encode it using base36. That would give you a "nice" URL and it would be unique and probably unguessable enough.
If you want it even shorter you can use base64, but base64 needs to contain two non alphanumeric characters.
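One way this suggestion could look in Node.js is sketched below. It assumes `crypto.randomBytes` as the source of the 128 bits and pads the result so that every value has the same length; the helper name `randomBase36Id` is hypothetical.

```javascript
const crypto = require('crypto');

// Hypothetical helper: 128 random bits as a fixed-length base36 string.
function randomBase36Id() {
  const hex = crypto.randomBytes(16).toString('hex');
  // 36^25 > 2^128, so 25 characters always suffice; pad so every ID has the same length.
  return BigInt('0x' + hex).toString(36).padStart(25, '0');
}

console.log(randomBase36Id());
console.log(randomBase36Id().length); // 25
```

The output uses only digits and lowercase letters, so it needs no URL escaping.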
This is a fairly old thread, but I'd like to point out that the top answer does not produce the results it claims. Base64-encoding the 16 bytes of a UUID yields a 24-character string (22 characters plus == padding), not the 8 characters shown in the examples. If you want more compression, convert the UUID to base 90 using this function.
Base64 takes 4 characters for every 3 bytes, and hex (Base16) takes 2 characters for each byte. This means Base64 output is about 67% the size of hex (roughly 33% shorter), and if we can pack more bits into each character we can get even better compression. Base90 is slightly more compact still because of this.
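The effect of the alphabet size on length can be checked with a little arithmetic: a 128-bit value needs the smallest number of characters n such that base^n ≥ 2^128. The sketch below does this with BigInt to avoid floating-point rounding; `charsNeeded` is an illustrative name, not part of any library.

```javascript
// Smallest number of characters that can represent every `bits`-bit value in a given base.
function charsNeeded(bits, base) {
  const target = 2n ** BigInt(bits);
  let capacity = 1n; // number of distinct values representable so far
  let n = 0;
  while (capacity < target) {
    capacity *= BigInt(base);
    n++;
  }
  return n;
}

for (const base of [16, 36, 64, 90]) {
  console.log(`base ${base}: ${charsNeeded(128, base)} characters`);
}
// base 16: 32 characters
// base 36: 25 characters
// base 64: 22 characters
// base 90: 20 characters
```

Going from Base64 to Base90 therefore saves two characters per UUID, at the price of an alphabet that contains characters needing escaping in URLs.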
const hex = "0123456789abcdef";
const base90 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!#$%&'()*+-./:<=>?@[]^_`{|}~";
/**
* Converts a Base16 (Hex) represented string to a Base 90 String.
* @param {String} number Hex represented string
* @returns A Base90 representation of the hex string
*/
function convertToBase90(number) {
var i,
divide,
newlen,
numberMap = {},
fromBase = hex.length,
toBase = base90.length,
length = number.length,
result = typeof number === "string" ? "" : [];
for (i = 0; i < length; i++) {
numberMap[i] = hex.indexOf(number[i]);
}
do {
divide = 0;
newlen = 0;
for (i = 0; i < length; i++) {
divide = divide * fromBase + numberMap[i];
if (divide >= toBase) {
numberMap[newlen++] = parseInt(divide / toBase, 10);
divide = divide % toBase;
} else if (newlen > 0) {
numberMap[newlen++] = 0;
}
}
length = newlen;
result = base90.slice(divide, divide + 1).concat(result);
} while (newlen !== 0);
return result;
}
/**
* Compresses a UUID String to base 90 resulting in a shorter UUID String
* @param {String} uuid The UUID string to compress
* @returns A compressed UUID String.
*/
function compressUUID(uuid) {
uuid = uuid.replace(/-/g, "");
return convertToBase90(uuid);
}
Over a few million random uuids this generates no duplicates and the following output:
Lengths:
Avg: 19.959995 Max: 20 Min: 17
Examples:
Hex: 68f75ee7-deb6-4c5c-b315-3cc6bd7ca0fd
Base 90: atP!.AcGJh1(eW]1LfAh
Hex: 91fb8316-f033-40d1-974d-20751b831c4e
Base 90: ew-}Kv&nK?y#~xip5/0e
Hex: 4cb167ee-eb4b-4a76-90f2-6ced439d5ca5
Base 90: 7Ng]V/:0$PeS-K?!uTed
A UUID is 36 characters long, and you can shorten it to 22 characters (about 39% shorter) if you want to keep the ability to convert it back and need it to be URL-safe.
Here is a pure Node solution for a Base64 URL-safe string:
type UUID = string;
type Base64UUID = string;
/**
* Convert uuid to base64url
*
* @example in: `f32a91da-c799-4e13-aa17-8c4d9e0323c9` out: `8yqR2seZThOqF4xNngMjyQ`
*/
export function uuidToBase64(uuid: UUID): Base64UUID {
return Buffer.from(uuid.replace(/-/g, ''), 'hex').toString('base64url');
}
/**
* Convert base64url to uuid
*
* @example in: `8yqR2seZThOqF4xNngMjyQ` out: `f32a91da-c799-4e13-aa17-8c4d9e0323c9`
*/
export function base64toUUID(base64: Base64UUID): UUID {
const hex = Buffer.from(base64, 'base64url').toString('hex');
return `${hex.substring(0, 8)}-${hex.substring(8, 12)}-${hex.substring(
12,
16,
)}-${hex.substring(16, 20)}-${hex.substring(20)}`;
}
Test:
import { randomUUID } from "crypto";
// f32a91da-c799-4e13-aa17-8c4d9e0323c9
const uuid = randomUUID();
// 8yqR2seZThOqF4xNngMjyQ
const base64 = uuidToBase64(uuid);
// f32a91da-c799-4e13-aa17-8c4d9e0323c9
const uuidFromBase64 = base64toUUID(base64);

Is it possible to calculate the size of a file created from a base64 string only?

Is it possible to calculate the size of a file created from a base64 string? The file type varies. I know it can't be exact but an approximate size would be enough.
I have only been provided the base64 string.
From Wikipedia
the size of the decoded data can be approximated with this formula:
bytes = (string_length(encoded_string) - 814) / 1.37
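const base64String = "data:image/jpeg;base64......";
// Count only the payload after the comma, so the MIME type in the prefix does not matter.
const payload = base64String.substring(base64String.indexOf(',') + 1);
// Each '=' padding character at the end stands for one byte less.
const padding = payload.endsWith('==') ? 2 : payload.endsWith('=') ? 1 : 0;
const sizeInBytes = 3 * Math.ceil(payload.length / 4) - padding;
const sizeInKb = sizeInBytes / 1000;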

Checksum Algorithm Producing Unpredictable Results

I'm working on a checksum algorithm, and I'm having some issues. The kicker is, when I hand craft a "fake" message, that is substantially smaller than the "real" data I'm receiving, I get a correct checksum. However, against the real data - the checksum does not work properly.
Here's some information on the incoming data/environment:
This is a groovy project (see code below)
All bytes are to be treated as unsigned integers for the purpose of checksum calculation
You'll notice some finagling with shorts and longs in order to make that work.
The size of the real data is 491 bytes.
The size of my sample data (which appears to add correctly) is 26 bytes
None of my hex-to-decimal conversions are producing a negative number, as best I can tell
Some bytes in the file are not added to the checksum. I've verified that the switch for these is working properly, and when it is supposed to - so that's not the issue.
My calculated checksum, and the checksum packaged with the real transmission always differ by the same amount.
I have manually verified that the checksum packaged with the real data is correct.
Here is the code:
// add bytes to checksum
public void addToChecksum( byte[] bytes) {
//if the checksum isn't enabled, don't add
if(!checksumEnabled) {
return;
}
long previouschecksum = this.checksum;
for(int i = 0; i < bytes.length; i++) {
byte[] tmpBytes = new byte[2];
tmpBytes[0] = 0x00;
tmpBytes[1] = bytes[i];
ByteBuffer tmpBuf = ByteBuffer.wrap(tmpBytes);
long computedBytes = tmpBuf.getShort();
logger.info(getHex(bytes[i]) + " = " + computedBytes);
this.checksum += computedBytes;
}
if(this.checksum < previouschecksum) {
logger.error("Checksum DECREASED: " + this.checksum);
}
//logger.info("Checksum: " + this.checksum);
}
If anyone can find anything in this algorithm that could be causing drift from the expected result, I would greatly appreciate your help in tracking this down.
I don't see a line in your code where you reset this.checksum.
This way, you should always get this.checksum > previouschecksum, right? Is this intended?
Otherwise I can't find a flaw in the code above. Maybe your 'this.checksum' is of the wrong type (short, for instance). That could roll over so that you get negative values.
Here is an example of such behaviour:
import java.nio.ByteBuffer
short checksum = 0
byte[] bytes = new byte[491]
def count = 260
for (def i=0;i<count;i++) {
bytes[i]=255
}
bytes.each { b ->
byte[] tmpBytes = new byte[2];
tmpBytes[0] = 0x00;
tmpBytes[1] = b;
ByteBuffer tmpBuf = ByteBuffer.wrap(tmpBytes);
long computedBytes = tmpBuf.getShort();
checksum += computedBytes
println "${b} : ${computedBytes}"
}
println checksum +"!=" + 255*count
Just play around with the value of the 'count' variable, which roughly corresponds to the length of your input.
Your checksum will keep incrementing until it rolls over to being negative (as it is a signed long integer)
You can also shorten your method to:
public void addToChecksum( byte[] bytes) {
//if the checksum isn't enabled, don't add
if(!checksumEnabled) {
return;
}
long previouschecksum = this.checksum;
this.checksum += bytes.inject( 0L ) { tot, it -> tot += it & 0xFF }
if(this.checksum < previouschecksum) {
logger.error("Checksum DECREASED: " + this.checksum);
}
//logger.info("Checksum: " + this.checksum);
}
But that won't stop it rolling over to being negative. For the sake of saving 12 bytes per item that you are generating a hash for, I would still suggest that something like MD5, which is known to work, is probably better than rolling your own. However, I understand that sometimes there are crazy requirements you have to stick to.
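The rollover discussed in this thread can be reproduced outside Groovy. The sketch below uses JavaScript typed arrays purely as an illustration (the thread's own code is Groovy): `Int8Array` stands in for a signed byte array and a one-element `Int16Array` for a `short` accumulator, with the same 260 bytes of 255 as the Groovy example above.

```javascript
// Int8Array stores signed bytes, like a Java/Groovy byte: 255 is stored as -1.
const bytes = new Int8Array(260).fill(255);

// Int16Array emulates a `short` accumulator; a plain Number is wide enough not to wrap here.
const shortSum = new Int16Array(1);
let wideSum = 0;

for (const b of bytes) {
  const unsigned = b & 0xff; // treat the byte as unsigned (0..255)
  shortSum[0] += unsigned;   // wraps around once the total passes 32767
  wideSum += unsigned;
}

console.log(wideSum);     // 66300
console.log(shortSum[0]); // 764
```

The two totals differ by exactly 65536, which matches the symptom of a computed checksum that is always off by the same amount when the accumulator is too narrow.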
