I need to build an XMPP client. The server uses the PLAIN mechanism for authentication (with zlib compression, I think).
I captured traffic from another XMPP system that uses the PLAIN mechanism, and the text appears to be Base64 (id + token). ADc1Y2M2OWY0MzQwMTUwMjgyOWIwMWY2MDAyN2E0NDE2ADE1YTk0NzM3NTRiYjY2MGExMGYzYTA5MzA5NWQxMmY3 is what the client returns. I put that into a Base64 decoder and it gives me this: 75cc69f43401502829b01f60027a441615a9473754bb660a10f3a093095d12f7.
When I encode this with a Base64 encoder, it gives me something different from the first Base64 string (NzVjYzY5ZjQzNDAxNTAyODI5YjAxZjYwMDI3YTQ0MTYxNWE5NDczNzU0YmI2NjBhMTBmM2EwOTMwOTVkMTJmNw)
Can someone explain this? I couldn't find anything on Google.
Edit:
https://xmpp.org/extensions/xep-0034.html#example-3
The result of your decoding is not correct. In fact, the decoded value
contains two binary values that can't be displayed as characters
(here substituted by a �):
�75cc69f43401502829b01f60027a4416�15a9473754bb660a10f3a093095d12f7.
What you then encoded is based on a string in which the two binary
values are not present, so you basically encoded something different
and, of course, got a different result.
From jps
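The point above can be checked with a few lines of Python. This is a minimal sketch that assumes the two undisplayable values are the NUL separators of the SASL PLAIN layout (authorization id, NUL, authentication id, NUL, password); the names authzid, authcid, password and encode_plain are illustrative and do not come from the question, which only speaks of "id + token".

```python
import base64

# The Base64 string captured from the client (taken from the question).
captured = (
    "ADc1Y2M2OWY0MzQwMTUwMjgyOWIwMWY2MDAyN2E0NDE2"
    "ADE1YTk0NzM3NTRiYjY2MGExMGYzYTA5MzA5NWQxMmY3"
)

# Decode to raw bytes; the result is binary data, not printable text.
raw = base64.b64decode(captured)

# Assumed SASL PLAIN layout: [authzid] NUL authcid NUL password
authzid, authcid, password = (part.decode("ascii") for part in raw.split(b"\x00"))


def encode_plain(authcid: str, password: str, authzid: str = "") -> str:
    """Build a PLAIN payload, keeping the NUL separators, and Base64-encode it."""
    message = "\x00".join([authzid, authcid, password]).encode("utf-8")
    return base64.b64encode(message).decode("ascii")


print(repr(raw))                                    # the NUL bytes show up as \x00
print(encode_plain(authcid, password) == captured)  # round trip with separators kept
```

Re-encoding only the visible 64 characters drops both separators, which is why the second Base64 string in the question differs from the captured one.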
Related
I am using the GitHub API to download a file from GitHub. I have been able to authenticate successfully, get a response from GitHub, and see a base64-encoded string representing the file contents.
Unfortunately, I get an unusual error (string length is not a multiple of 4) when decoding the base64 string.
The HTTP request is illustrated below:
GET /repos/:owner/:repo/contents/:path
The (partial) response is illustrated below:
{
"name":....,
"download_url":...",
"type":"file",
"content":"ewogICAgInN3YWdnZXIiOiAiM...
}
The issue I am encountering is that the length of the string is 15263 bytes, and I get an error in decoding the string (string length is not a multiple of 4). I am using node.js and the 'base64-js' npm module to decode the string. Code to execute the decoding is illustrated below:
var base64 = require('base64-js');
var contents = base64.toByteArray(fileContent);
The decoding causes an exception:
Error: Invalid string. Length must be a multiple of 4
at placeHoldersCount (.../node_modules/base64-js/index.js:23:11)
at Object.toByteArray (...node_modules/base64-js/index.js:42:18)
:
:
I would think that the GitHub API is sending me the correct data, so I figure that is not the issue.
Am I performing the decoding improperly or is there another problem I am overlooking?
Any help is appreciated.
I experimented a bit and found a solution by using a different base64 decoding library as follows:
var base64 = require('js-base64').Base64;
var contents = base64.decode(res.content);
I am not sure if it is mandatory to have an encoded string length divisible by 4 (clearly my 15263 character length string is not divisible by 4) but the alternate library decoded the string properly.
A second solution which I also found to work is specific to how to use the GitHub API. By adding the following to the GitHub API call header, I was also able to get the decoded file contents:
'accept': 'application/vnd.github.VERSION.raw'
After much experimenting, I think I nailed down the difference between the working and broken base64 decoding.
It appears GitHub Base-64 encodes with:
UTF-8 charset
Base 64 MIME encoder (RFC2045)
As opposed to a "basic" (RFC4648) Base64 encoder. Several languages seem to default to the basic encoder (including Java, which I was using). When I switched to a MIME encoder, I got the full contents of the file un-garbled. This would explain why switching libraries in some cases fixed the issue.
I will note the contents field contained newline characters - decoders are supposed to ignore them, but not all do, so if you still get errors, you may need to try removing them.
The media-type header will do the job better. However, in my case I am trying to use the API via a GitHub App. At the time of writing, GitHub requires a specific media type to be used when doing that, and it returns the JSON response.
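As a rough illustration of the newline issue described above, here is a sketch in Python. The payload is a made-up stand-in for the API's content field (it is not real GitHub output), and decode_wrapped_base64 is a hypothetical helper name. It strips all whitespace first and then decodes strictly, so any remaining invalid character still raises an error.

```python
import base64
import binascii


def decode_wrapped_base64(content: str) -> bytes:
    """Decode Base64 text that may contain line breaks or other whitespace."""
    compact = "".join(content.split())  # drop every whitespace character
    return base64.b64decode(compact, validate=True)


# Hypothetical stand-in for the "content" field: MIME-style Base64,
# wrapped into lines of at most 76 characters.
original = b'{\n    "swagger": "2.0"\n}\n' * 10
wrapped = base64.encodebytes(original).decode("ascii")

try:
    # A strict decoder rejects the embedded newlines.
    base64.b64decode(wrapped, validate=True)
    strict_ok = True
except binascii.Error:
    strict_ok = False

decoded = decode_wrapped_base64(wrapped)
print(strict_ok, decoded == original)
```

Counting the newlines as part of the string is also a plausible reason for a length that is not a multiple of 4, since the Base64 text itself is padded to a multiple of 4 once the line breaks are removed.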
For some reason, the GitHub API's base64-encoded content doesn't decode properly in any of the online base64 decoders I've tried from the front page of Google.
Python works however:
import base64
base64.b64decode("ewogICAgInN3YWdnZXIiOiAiM...")
I want to check whether the string passed to my function is plain text or base64. I am able to encode plain text to base64 and reverse it.
But I could not figure out any way to validate whether the string is already base64. Can anyone please suggest the correct way of doing this in Node.js? Is there an API for this already available in Node.js?
Valid base64 strings are a subset of all plain-text strings. Assuming we have a character string, the question is whether it belongs to that subset. One way is what Basit Anwer suggests. Those libraries require installing libicu though. A more portable way is to use the built-in Buffer:
Buffer.from(str, 'base64')
Unfortunately, this decoding function will not complain about non-Base64 characters. It will just ignore non-base64 characters. So, it alone will not help. But you can try encoding it back to base64 and compare the result with the original string:
Buffer.from(str, 'base64').toString('base64') === str
This check will tell whether str is pure base64 or not.
Encoding is byte level.
If you're dealing with strings, all you can do is guess, or keep metadata with the string to identify its encoding.
But you can check these libraries out:
https://www.npmjs.com/package/detect-encoding
https://github.com/mooz/node-icu-charset-detector
Having searched around for a while now, I believe my problem may not be directly related to what others had. I am using unicode chars in forms (using angularjs for client-side) and noticed that the UTF8 strings didn't display on the server logs properly. Thus I decided to base64.encode all strings on the client side before submitting to the server (nodejs/express4). The JSON data arrives properly to the server, but when I try to convert it from base64 to UTF8 using a buffer I'm getting different symbols. I tested the strings on http://www.base64decode.org/ and they decode fine. Can anyone suggest what I might be doing wrong?
Example char: σ, base64="z4M=".
On the server this line decodes all JSON values to UTF8:
Object.keys(req.body).forEach(function(key) { req.body[key] = new Buffer(req.body[key], 'base64').toString('utf8'); });
And the "σ" char becomes "Ο" on the server. Can anyone assist?
Thus I decided to base64.encode all strings on the client side before submitting to the server (nodejs/express4).
No need to, really. Probably the thing you were doing wrong with utf-8 json is also wrong now.
Try to debug that.
noticed that the UTF8 strings didn't display on the server logs properly.
What do they display?
And on what OS are you?
Did you look at the logs with a hex viewer?
To me this looks like a typical "I have a problem X, thought my solution through halfway, but I am stuck on a sub-problem Y". Go back to X and attack it the right way (no base64).
Someone has been using my Tornado application and making POST requests that contain this character: ¡
Tornado was unable to decode the value and ended up with this error: HTTP 400: Bad Request (Invalid unicode in PARAMNAME: b'DATAHERE')
So I investigated and learned that in the request body I was receiving %A1 for the corresponding character, which Python's decode method had no difficulty decoding as UTF-8.
But after URL-decoding this value, Tornado ended up with \xa1 for the character, tried to decode it as UTF-8, and failed, because it was actually ISO-8859-1 encoded.
So, what would be the appropriate way to fix this? Because the user is sending valid data, I don't want to lose it.
The best answer is to make sure the client always sends utf8 instead of iso8859-1 (this used to require weird tricks like the rails snowman; I'm not sure about the current state of the art). If you cannot do that, override RequestHandler.decode_argument (http://www.tornadoweb.org/en/stable/web.html#tornado.web.RequestHandler.decode_argument), which can see the raw bytes and decide how to decode them (or pass them through unchanged if you don't want to decode at this point).
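The fallback idea can be sketched as a plain function, kept free of Tornado imports so it runs on its own. The name decode_with_fallback and the choice to try UTF-8 first and fall back to ISO-8859-1 are assumptions for illustration, not something the answer prescribes.

```python
def decode_with_fallback(value: bytes) -> str:
    """Decode raw argument bytes as UTF-8, falling back to ISO-8859-1.

    ISO-8859-1 maps every byte to a character, so the fallback never fails;
    the trade-off is that genuinely corrupt input is accepted silently.
    """
    try:
        return value.decode("utf-8")
    except UnicodeDecodeError:
        return value.decode("iso-8859-1")


# The byte Tornado ends up with after URL-decoding %A1.
print(decode_with_fallback(b"\xa1"))
# The same character sent as proper UTF-8 decodes the usual way.
print(decode_with_fallback("¡".encode("utf-8")))
```

Inside a handler, an overridden decode_argument(self, value, name=None) could simply return decode_with_fallback(value), since that hook receives the raw bytes.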
I'm adding some capabilities to an api to allow third parties to store user data. I know some users may already base64 encode their user ids before submitting them through the api, others might not.
I've done some checking on double encoding (encoding base64 of an already base64 encoded string), and it doesn't SEEM to be causing any problems.
From my understanding, it isn't possible to check if a string is base64 encoded.
Is there something here I should be looking out for down the line? Is there another way I should be doing this, or is it safe?
Also, I'm cleaning the data like this.
$eid = preg_replace("/[^a-zA-Z0-9\/=+]/", "", base64_encode(@$_GET['eid']));
That should be safe to store in the database, as it strips out any suspect characters after the string is encoded. But I'll need to return the non-encoded string through the API.
So at some point I'll need to do echo base64_decode($eid); And it seems to me that this could be an opportunity for a hacker to run malicious code through my server.
Is that right?