baseurl64 buffer decoding

baseurl64 buffer decoding - node.js

Can someone explain this behavior?
Buffer.from('5d9RAjZ2GCob-86_Ql', 'base64url').toString('base64url')
// 5d9RAjZ2GCob-86_Qg
Please take a close look at the last character l - g

Your string is 18 characters long, With 6 bits encoded in each character it means the first 16 characters represent 96 bits (12 bytes) and the last two represent one byte plus 4 unused bits. Only the first two bits of the last character are significant here. g is 100000, l is 100101. As the last 4 characters are not used, g is just the first choice for the two bits 1 0.
So for any character in the range between g and v, you would get a g when you convert it back to Base64Url.
See https://en.wikipedia.org/wiki/Base64#Base64_table_from_RFC_4648

Related

How does encode and decode 64 figure out that the last few zeros are mere padding?

https://learn.microsoft.com/en-us/dotnet/api/system.convert.tobase64string?view=net-5.0
It says
If an integral number of 3-byte groups does not exist, the remaining
bytes are effectively padded with zeros to form a complete group. In
this example, the value of the last byte is hexadecimal FF. The first
6 bits are equal to decimal 63, which corresponds to the base-64 digit
"/" at the end of the output, and the next 2 bits are padded with
zeros to yield decimal 48, which corresponds to the base-64 digit,
"w". The last two 6-bit values are padding and correspond to the
valueless padding character, "=".
Now,
Imagine that the byte array I send is
0
So, only one byte, namely 0
That one byte will be padded right into 000 right?
So now, we will have something like 0=== as the encoding because it takes 4 characters in base 64 encoding to encode 3 bytes.
Now, we gonna decode that.
How do we know that the original byte isn't 00, or 000, but just 0?
I must be missing something here.

So now, we will have something like 0=== as the encoding
3 padding characters is illegal. This would mean 6 bit plus padding.
And then 0 as a byte value is A in Base64, so it would be AA==.
So the first A has the first 6 bits of the 0 byte, the second A contributes the 2 remaining 0 bits for your byte, and then there are just 4 0 bits plus the padding left, not enough for a second byte.
How do we know that the original byte isn't 00, or 000, but just 0?
AA== has only 12 bits (6 bits per character) so it can only encode 1 Byte => 0
AAA= has 18 bits, enough for 2 bytes => 00
AAAA has 24 bits = 3 bytes => 000

what will be the dp and transitions in this problem

Vasya has a string s of length n consisting only of digits 0 and 1. Also he has an array a of length n.
Vasya performs the following operation until the string becomes empty: choose some consecutive substring of equal characters, erase it from the string and glue together the remaining parts (any of them can be empty). For example, if he erases substring 111 from string 111110 he will get the string 110. Vasya gets ax points for erasing substring of length x.
Vasya wants to maximize his total points, so help him with this!
https://codeforces.com/problemset/problem/1107/E
i was trying to get my head around the editorial,but couldn't understand it... can anyone tell an easy way to do it?
input:
7
1101001
3 4 9 100 1 2 3
output:
109
Explanation
the optimal sequence of erasings is: 1101001 → 111001 → 11101 → 1111 → ∅.

Here, we consider removing prefixes instead of substrings. Why?
We try to remove a consecutive prefix of a particular state which is actually a substring in the main string. So, our DP states will be start index, end index, prefix length.
Let's consider an example str = "1010110". Here, initially start=0, end=7, and prefix=1(the first '1' will be the only prefix now). we iterate over all the indices in the current state except the starting index and check if str[i]==str[start]. Here, for example, str[4]==str[0]. Now we divide the string into "010" with prefix=1(010) && "110" with prefix=2(1010110). These two are now two individual subproblems. So, when there remains a string with length 1, we return aprefix.
Here is my code.

How to count binary sequence in binary number in Python?

I would like to count '01' sequence in 5760 binary bits.
First, I would like to combine several binary numbers then count # of '01' occurrences.
For example, I have 64 bits integer. Say, 6291456. Then I convert it into binary. Most significant 4 bits are not used. So I'll get 60 bits binary 000...000011000000000000000000000
Then I need to combine(just put bits together since I only need to count '01') first 60 bits + second 60 bits + ...so 96 of 60 bits are stitched together.
Finally, I want to count how many '01' appears.
s = binToString(5760 binary bits)
cnt = s.count('01');

num = 6291226
binary = format(num, 'b')
print(binary)
print(binary.count('01'))
If I use number given by you i.e 6291456 it's binary representation is 11000000000000000000000 which gives 0 occurrences of '01'.
If you always want your number to be 60 bits in length you can use
binary = format(num,'060b')
It will add leading 0 to make it of given length

Say that nums is your list of 96 numbers, each of which can be stored in 64 bits. Since you want to throw away the most 4 significant bits, you are really taking the number modulo 2**60. Thus, to count the number of 01 in the resulting string, using the idea of #ShrikantShete to use the format function, you can do it all in one line:
''.join(format(n%2**60,'060b') for n in nums).count('01')

Explain the number of bits in a hash value that features both numbers and letters

I need some help understanding this concept:
If I have a 256-bit hash, the value is essentially a 64-character long string. This is because each character is 4-bits long (64*4 = 256), correct? However, along with numbers letters are also used in hash values, and letters are 8-bits long. Doesn't a 64-character long hash key that features letters along with numbers ultimately create a hash value that is greater than 256-bits?
Take this hash value for example: 7833dc6e82e9378117bcb03128ac8fdd95d9073161ebc963783b3010dd847ff3
It is 64-characters long, but the letter d is 8-bits long rather than 4. So how does this hash count as 256-bits?
Thank you for your help!

The letters aren't really letters. You've probably noticed that the only included alphabet characters are A-F. This is because the hash is using base 16 (hexadecimal) numbering.
Unlike base 10 where the valid characters are 0-9, in base 16, there are sixteen valid characters: 0 1 2 3 4 5 6 7 8 9 A B C D E F. 16 = 2^4, so you need 4 bits for each character.

Subsequences whose sum of digits is divisible by 6

Say I have a string whose characters are nothing but digits in [0 - 9] range. E.g: "2486". Now I want to find out all the subsequences whose sum of digits is divisible by 6. E.g: in "2486", the subsequences are - "6", "246" ( 2+ 4 + 6 = 12 is divisible by 6 ), "486" (4 + 8 + 6 = 18 is divisible by 6 ) etc. I know generating all 2^n combinations we can do this. But that's very costly. What is the most efficient way to do this?
Edit:
I found the following solution somewhere in quora.
int len,ar[MAXLEN],dp[MAXLEN][MAXN];
int fun(int idx,int m)
{
if(idx==len)
return (m==0);
if(dp[idx][m]!=-1)
return dp[idx][m];
int ans=fun(idx+1,m);
ans+=fun(idx+1,(m*10+ar[idx])%n);
return dp[idx][m]=ans;
}
int main()
{
// input len , n , array
memset(dp,-1,sizeof(dp));
printf("%d\n",fun(0,0));
return 0;
}
Can someone please explain what is the logic behind the code - 'm*10+ar[idx])%n' ? Why is m multiplied by 10 here?

Say you have a sequence of 16 digits You could generate all 216 subsequences and test them, which is 65536 operations.
Or you could take the first 8 digits and generate the 28 possible subsequences, and sort them based on the result of their sum modulo 6, and do the same for the last 8 digits. This is only 512 operations.
Then you can generate all subsequences of the original 16 digit string that are divisible by 6 by taking each subsequence of the first list with a modulo value equal to 0 (including the empty subsquence) and concatenating it with each subsequence of the last list with a modulo value equal to 0.
Then take each subsequence of the first list with a modulo value equal to 1 and concatenate it with each subsequence of the last list with a modulo value equal to 5. Then 2 with 4, 3 with 3, 4 with 2 and 5 with 1.
So after an initial cost of 512 operations you can generate just those subsequences whose sum is divisible by 6. You can apply this algorithm recursively for larger sequences.

Create an array with a 6-bit bitmap for each position in the string. Work from right to left and set the array of bitmaps so that bitmaps have bits set in the array when there is some subsequence starting from just after the array which sums up to that position in the bitmap. You can do this from right to left using the bitmap just after the current position. If you see a 3 and the bitmap just after the current position is 010001 then sums 1 and 5 are already accessible by just skipping the 3. Using the 3 sums 4 and 2 are now available, so the new bitmap is 011011.
Now do a depth first search for subsequences from left to right, with the choice at each character being either to take that character or not. As you do this keep track of the mod 6 sum of the characters taken so far. Use the bitmaps to work out whether there is a subsequence to the right of that position that, added to the sum so far, yields zero. Carry on as long as you can see that the current sum leads to a subsequence of sum zero, otherwise stop and recurse.
The first stage has cost linear in the size of the input (for fixed values of 6). The second stage has cost linear in the number of subsequences produced. In fact, if you have to actually write out the subsequences visited (E.g. by maintaining an explicit stack and writing out the contents of the stack) THAT will be the most expensive part of the program.
The worst case is of course input 000000...0000 when all 2^n subsequences are valid.

I'm pretty sure a user named, amit, recently answered a similar question for combinations rather than subsequences where the divisor is 4, although I can't find it right now. His answer was to create, in this case, five arrays (call them Array_i) in O(n) where each array contains the array elements with a modular relationship i with 6. With subsequences we also need a way to record element order. For example, in your case of 2486, our arrays could be:
Array_0 = [null,null,null,6]
Array_1 = []
Array_2 = [null,4,null,null]
Array_3 = []
Array_4 = [2,null,8,null]
Array_5 = []
Now just cross-combine the appropriate arrays, maintaining element order: Array_0, Array_2 & Array_4, Array_0 & any other combination of arrays:
6, 24, 48, 246, 486

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

baseurl64 buffer decoding - node.js

Can someone explain this behavior? Buffer.from('5d9RAjZ2GCob-86_Ql', 'base64url').toString('base64url') // 5d9RAjZ2GCob-86_Qg Please take a close look at the last character l - g

Related

How does encode and decode 64 figure out that the last few zeros are mere padding?

what will be the dp and transitions in this problem

How to count binary sequence in binary number in Python?

Explain the number of bits in a hash value that features both numbers and letters

Subsequences whose sum of digits is divisible by 6

Categories

Resources