Encode variables with special characters for Azure

Very similar problem to "AADSTS50012: Invalid client secret is provided when moving from a Test App to Production".
The top answer says to encode your secret, e.g. replace + with %2B and = with %3D, etc. How would I replace the special character tilde (~)?

As suggested by juunas, and as per the documentation, yes, you can replace the special character: the tilde (~) is percent-encoded as %7E.
URL encoding converts characters into a format that can be transmitted over the Internet.
Here is the link for complete information regarding encoding techniques.
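A minimal sketch (Java, with a made-up secret value) showing how the whole secret can be percent-encoded in one call. java.net.URLEncoder leaves letters, digits and "." "-" "*" "_" untouched and encodes everything else, so + becomes %2B, = becomes %3D and ~ becomes %7E:

import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EncodeSecret {
    public static void main(String[] args) {
        String clientSecret = "abc+def=gh~i"; // hypothetical secret value
        // Percent-encode the secret before placing it in a URL or form body.
        String encoded = URLEncoder.encode(clientSecret, StandardCharsets.UTF_8);
        System.out.println(encoded); // abc%2Bdef%3Dgh%7Ei
    }
}

Note that URLEncoder encodes a space as "+" rather than "%20"; client secrets normally contain no spaces, so this rarely matters here.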

Related

Cannot find a urldecode function in terraform

I see there is a urlencode in terraform but no urldecode. Is there any reason why it is not there? What would be the workaround to achieve that in terraform?
Thanks,
Ram
The functions built in to the Terraform language tend to (with some notable historical exceptions) focus on solving problems that seem to arise very commonly in typical definitions of infrastructure. URL encoding arises in such situations as building URLs for API calls, whereas URL decoding seems to come up less frequently in the typical scope where Terraform is used and so there haven't been sufficient real-world examples of the need to justify making it a built-in.
The Terraform language does include some features that allow for basic manual string tokenization and transforming, but a key missing piece for fully-general URL decoding is that Terraform does not have a function which can take a number and return the corresponding character as defined by a specific character lookup table, such as ASCII or Unicode.
If you know that in practice your inputs will only use URL escaping sequences for specific reserved symbols then you can approximate URL decoding with a lookup table combined with a tokenizing regular expression:
locals {
  input = "foo%3Fx%3Dtest"

  tokens = regexall("(?:%[0-9a-fA-F]{2}|[^%]+)", local.input)

  replacements = tomap({
    "%3f" = "?"
    "%3d" = "="
    "%25" = "%"
  })

  result = join("", [
    for token in local.tokens : (
      substr(token, 0, 1) == "%" ?
      local.replacements[lower(token)] :
      token
    )
  ])
}
The above works for the limited encoding vocabulary of only ?, =, and %, but will fail if there are any encoded characters other than those. You can of course expand this vocabulary with any additional characters you need, and you could potentially expand the table to cover all 128 ASCII characters if you like.
It would not be possible to decode non-ASCII (i.e. Unicode-only) characters with this simplistic strategy because URL encoding of those involves encoding first as UTF-8 and then encoding the individual bytes that result, and the Terraform language does not include any facilities for working with raw bytes: it works only with unicode strings.
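To illustrate that last point outside of Terraform (a tiny Java demonstration, not part of the workaround above): a single non-ASCII character expands to one %XX escape per UTF-8 byte, which is exactly what a per-escape lookup table cannot reassemble without byte-level string support.

import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class NonAsciiEscapes {
    public static void main(String[] args) {
        // "é" is one character but two UTF-8 bytes, so it becomes two percent escapes.
        System.out.println(URLEncoder.encode("é", StandardCharsets.UTF_8)); // %C3%A9
    }
}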

Is there any reason why okhttp3.Credentials use ISO_8859_1 instead UTF-8?

I am using okhttp3.Credentials to get a Base64 string in my current project. I spotted an issue with Cyrillic symbols that I passed to the server as a Base64 string, and eventually found out that the current implementation of okhttp3.Credentials uses ISO_8859_1.
Is there a deliberate intent to go with ISO_8859_1 here instead of the more universal UTF-8?
Update from the answer:
From the referenced spec:
The original definition of this authentication scheme failed to specify the character encoding scheme used to convert the user-pass into an octet sequence. In practice, most implementations chose either a locale-specific encoding such as ISO-8859-1 ([ISO-8859-1]), or UTF-8 ([RFC3629]). For backwards compatibility reasons, this specification continues to leave the default encoding undefined, as long as it is compatible with US-ASCII (mapping any US-ASCII character to a single octet matching the US-ASCII character code).

B.3. Why not simply switch the default encoding to UTF-8?

There are sites in use today that default to a local character encoding scheme, such as ISO-8859-1 ([ISO-8859-1]), and expect user agents to use that encoding. Authentication on these sites will stop working if the user agent switches to a different encoding, such as UTF-8.

Note that sites might even inspect the User-Agent header field ([RFC7231], Section 5.5.3) to decide which character encoding scheme to expect from the client. Therefore, they might support UTF-8 for some user agents, but default to something else for others. User agents in the latter group will have to continue to do what they do today until the majority of these servers have been upgraded to always use UTF-8.
Discussion here: https://github.com/square/okhttp/pull/3134
It's legacy behavior, and you can override it with the optional charset parameter.
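A minimal sketch of that override (assuming OkHttp 3.x, where Credentials.basic accepts an optional Charset argument, as discussed in the pull request above):

import java.nio.charset.StandardCharsets;
import okhttp3.Credentials;

public class BasicAuthUtf8 {
    public static void main(String[] args) {
        // Encode the user:password pair as UTF-8 instead of the legacy ISO-8859-1 default.
        String credential = Credentials.basic("пользователь", "пароль", StandardCharsets.UTF_8);
        System.out.println(credential); // "Basic " followed by Base64 of the UTF-8 bytes
    }
}

The resulting value is what goes into the Authorization header of the request.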

Azure CosmosDB: illegal characters in Document Id

I have an issue where the ids generated from certain input contain the character "/". This leads to an error during the Upsert operation, as "/" is not allowed in the document id.
Which characters are not allowed beside that?
What are ways to handle such a situation?
The illegal characters are /, \, ?, and # (see https://learn.microsoft.com/en-us/dotnet/api/microsoft.azure.documents.resource.id?view=azure-dotnet).
Ways to deal with such a situation:
Remove the character already in the input used for generating the id
Replace the character in the id with another character / set of characters
Use Hashing / Encoding (e.g. Base64)
If you know a better way please share. Thanks
I'm base64-encoding the plaintext, then replacing the '/' and '=' that can still be there from base64 with '-' and '_' respectively. This seems to be working well. I ran into other issues when I just tried to UrlEncode the value.
Roughly, in C# (using System.Text for Encoding):
var encoded = Convert.ToBase64String(Encoding.UTF8.GetBytes(plainTextId));
var preppedId = encoded.Replace('/', '-').Replace('=', '_');

Azure Table Storage RowKey restricted Character Patterns?

Are there restricted character patterns within Azure Table Storage RowKeys? I've not been able to find any documented via numerous searches. However, I'm getting behavior that implies such in some performance testing.
I've got some odd behavior with RowKeys consisting of random characters (the test driver prevents the restricted characters (/ \ # ?) and single quotes from occurring in the RowKey). The result is that I've got a RowKey that will insert fine into the table but cannot be queried (the result is InvalidInput). For example:
RowKey: 9}5O0J=5Z,4,D,{!IKPE,~M]%54+9G0ZQ&G34!G+
Attempting to query by this RowKey (equality) results in an error (both within our app, using Azure Storage Explorer, and Cloud Storage Studio 2). I took a look at the request being sent via Fiddler:
GET /foo()?$filter=RowKey%20eq%20'9%7D5O0J=5Z,4,D,%7B!IKPE,~M%5D%54+9G0ZQ&G34!G+' HTTP/1.1
It appears the %54 in the RowKey is not escaped in the filter. Interestingly, I get similar behavior for batch requests to table storage with URIs in the batch XML that include this RowKey. I've also seen similar behavior for RowKeys with embedded double quotes, though I have not isolated that pattern yet.
Has anyone come across this behavior? I can easily restrict additional characters from occurring in RowKeys, but would really like to know the 'rules'.
The following characters are not allowed in PartitionKey and RowKey fields:
The forward slash (/) character
The backslash (\) character
The number sign (#) character
The question mark (?) character
Further Reading: Azure Docs > Understanding the Table service data model
public static readonly Regex DisallowedCharsInTableKeys = new Regex(@"[\\#%+/?\u0000-\u001F\u007F-\u009F]");
Detection of Invalid Table Partition and Row Keys:
bool invalidKey = DisallowedCharsInTableKeys.IsMatch(tableKey);
Sanitizing the Invalid Partition or Row Key:
string sanitizedKey = DisallowedCharsInTableKeys.Replace(tableKey, disallowedCharReplacement);
At this stage you may also want to prefix the sanitized key (PartitionKey or RowKey) with a hash of the original key, to avoid different invalid keys colliding on the same sanitized value.
Do not use string.GetHashCode() for this, though, since it may produce a different hash code for the same string; it should not be used to establish uniqueness and should not be persisted.
I use SHA256 (https://msdn.microsoft.com/en-us/library/s02tk69a(v=vs.110).aspx) to create a byte-array hash of the invalid key, convert the byte array to a hex string, and prefix the sanitized table key with that.
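A rough sketch of that idea (written in Java here rather than the C# used above, with an arbitrary replacement character; treat it as an illustration, not an exact implementation):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HexFormat;
import java.util.regex.Pattern;

public class TableKeySanitizer {
    // Same character class as the regex above: \ # % + / ? and the control-character ranges.
    private static final Pattern DISALLOWED =
        Pattern.compile("[\\\\#%+/?\\u0000-\\u001F\\u007F-\\u009F]");

    public static String sanitize(String key) throws Exception {
        String replaced = DISALLOWED.matcher(key).replaceAll("_"); // arbitrary replacement character
        // Prefix with a hex SHA-256 of the original key so different invalid keys cannot collide.
        byte[] hash = MessageDigest.getInstance("SHA-256").digest(key.getBytes(StandardCharsets.UTF_8));
        return HexFormat.of().formatHex(hash) + replaced; // HexFormat requires Java 17+
    }
}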
Also see related MSDN Documentation:
https://msdn.microsoft.com/en-us/library/azure/dd179338.aspx
Related Section from the link:
Characters Disallowed in Key Fields
The following characters are not allowed in values for the PartitionKey and RowKey properties:
The forward slash (/) character
The backslash (\) character
The number sign (#) character
The question mark (?) character
Control characters from U+0000 to U+001F, including:
The horizontal tab (\t) character
The linefeed (\n) character
The carriage return (\r) character
Control characters from U+007F to U+009F
Note that in addition to the chars mentioned in the MSDN article, I also added the % char to the pattern, since I saw a few places where people mentioned it being problematic. I guess some of this also depends on the language and the technology you are using to access the table storage.
If you detect additional problematic chars in your case, then you can add those to the regex pattern, nothing else needs to change.
I just found out (the hard way) that the '+' sign is allowed, but not possible to query in PartitionKey.
I found that in addition to the characters listed in Igorek's answer, these also can cause problems (e.g. inserts will fail):
|
[]
{}
<>
$^&
Tested with the Azure Node.js SDK.
I transform the key using this function:
private static string EncodeKey(string key)
{
    return HttpUtility.UrlEncode(key);
}
This needs to be done for the insert and for the retrieve of course.

What is the smallest URL friendly encoding?

I am looking for a URL encoding method that is most efficient in terms of space. Raw binary (base2) could be represented in base16, which is smaller and URL-safe, but base64 is even more efficient. However, the usual base64 encoding isn't URL-safe...
So what is the smallest encoding method that is also safe for URLs?
This is what the Base64 URL encoding variant is for.
It uses the same standard Base64 alphabet except that + is changed to - and / is changed to _.
Most modern Base64 implementations will support this alternate encoding. If yours doesn't, it's usually just a matter of doing a search/replace on the Base64 input prior to decoding, or on the output prior to sending it to a browser.
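For example, java.util.Base64 exposes the URL-safe variant directly (a small illustrative sketch; the byte values are arbitrary):

import java.util.Base64;

public class UrlSafeBase64 {
    public static void main(String[] args) {
        byte[] raw = {(byte) 0xFB, (byte) 0xEF, (byte) 0xFF, 0x01};
        // Standard alphabet produces '+' and '/' plus '=' padding.
        System.out.println(Base64.getEncoder().encodeToString(raw));                     // ++//AQ==
        // URL-safe alphabet swaps in '-' and '_', and the padding can be dropped too.
        System.out.println(Base64.getUrlEncoder().withoutPadding().encodeToString(raw)); // --__AQ
    }
}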
You can use a 62 character representation instead of the usual base 64. This will give you URLs like the youtube ones:
http://www.youtube.com/watch?v=0JD55e5h5JM
You can use the PHP functions provided on this page if you need to map strings to a database numerical ID:
http://bsd-noobz.com/blog/how-to-create-url-shortening-service-using-simple-php
Or this one if you need to directly convert a numerical ID to a short URL string:
http://kevin.vanzonneveld.net/techblog/article/create_short_ids_with_php_like_youtube_or_tinyurl/
"base66" (theoretical, according to spec)
As far as I can tell, the optimal encoding for URLs is a "base66" encoding into the following alphabet:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
0123456789-_.~
These are all the "unreserved characters" according to the URI specification RFC 3986 (section 2.3), so they will appear as-is in the URL. Using this "base66" encoding could give a URL like:
https://example.org/articles/.3Ja~jkWe
The question is then whether you want . and ~ in your URLs.
On some older servers (ancient by now, I guess) ~joe would mean the "www directory" of the user joe on this server. And thus a user might be confused as to what the ~ character is doing in the middle of your URL.
This is common for academic websites, especially CS professors (e.g. Donald Knuth's website https://www-cs-faculty.stanford.edu/~knuth/)
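As a toy sketch of the idea (Java, plain base conversion; the alphabet order is arbitrary), a non-negative number can be mapped onto this 66-character alphabet by repeated division:

public class Base66 {
    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~";

    public static String encode(long n) {
        if (n == 0) return "A"; // index 0 of the alphabet
        StringBuilder sb = new StringBuilder();
        while (n > 0) {
            sb.append(ALPHABET.charAt((int) (n % 66)));
            n /= 66;
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(123456789L)); // a short, URL-safe token
    }
}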
"base80" (in practice, but not battle-tested)
However, in my own testing the following 14 other symbols also do not get percent-encoded (in Chrome 95 and Firefox 93):
!$'()*+,:;=#[]
(see also this StackOverflow answer)
leaving a "base80" URL encoding possible. Some of these (notably + and =) would not work in the query string part of the URL, only in the path part. All in all, this ends up giving you beautiful, hyper-compressed URLs like:
https://example.org/articles/1OWG,HmpkySCbBy#RG6_,
https://example.org/articles/21Cq-b6Ud)txMEW$,hc4K
https://example.org/articles/:3Tx**U9X'd;tl~rR]q+
There's a plethora of reasons why you might not want all of those symbols in your URLs. One example is that StackOverflow's own "linkifier" won't include that ending comma in the link it generates (I've manually made it a part of the link here).
Also the percent encoding seems to be quite finicky. In some cases Firefox would initially percent-encode ' and ~] but on later requests would not.
