is there special characters or strings cannot be used as key in rapidjson? - rapidjson

can special characters be used as a part of key?
For example :
{
"+new":"addnew.png"
"":"empty.png"
}
Is this format of rapidjson valid?
Also, is there any special strings that is not valid to use as key?
(I think the earlier question cannot fully answer my question because it does not cover the case of empty string, e.g.:"":"empty.png")

Yes. Any valid string can be used as key.
But the shown JSON misses a comma between two members.

By http://www.json.org/:
A string is a sequence of zero or more Unicode characters, wrapped in double quotes, using backslash escapes.
You can use JSONLint to validate the json string.

Related

Python interpret bytearray as bytes

I have a question regarding the interpretation of a string as bytes
Within python, I have the situation that one variable contains e.g. this value
"bytearray(b'\x13\x02US')"
this is unfortunately due to the behavior of a module I am using. My question is, how could i get this string into bytes?
I have tried stripping away the "bytearray(b'" and the "')" at the end, and use .encode() as a function, but the result then is:
b'\\x13\\x02US'
Which clearly escapes the \ in order to prevent the interpretation as bytes.
How could i get this converted into
b'\x13\x02US'
instead though?
Thank you very much!
You could use .decode().replace('\\', '\'), this way it replaces the double slashes with single ones. Either attache it after your .encode() function or do it on your string seperately.

Fullname multilingual Regexp

Currently the validation of fullname looks like:
/^[a-zA-Z ]{2,30}$/
But that regexp validates only latin alphabet names. This should be changed in order to handle multilingual characters also. I have tried:
/^(\p{L}\p{M}*){2,30}$/u
But it validates numbers within names also, which is not correct.
As in the first case, use a character class with Unicode as well:
/^[\p{L}\p{M}\p{Zs}]{2,30}$/u
The \p{Zs} denotes a space char, such as regular space and Japanese space char  .
In case you want to prevent space at the start and end, use these negative lookaheads:
/^(?!\p{Zs})(?!.*\p{Zs}$)[\p{L}\p{M}\p{Zs}]{2,30}$/u
See a demo on regex101.com.

JSONFormat.print() method encoding special characters and also adding extra slash

I need to convert a protobuf message to JSON string in java. For this I am using the below API as recommended by the docs (https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/util/JsonFormat.Printer.html)
String jsonString = JsonFormat.printer().includingDefaultValueFields().print(protobufMessage);
This is working fine for a simple string, however, when my string contains special characters like &, single quote etc. the gson.toJson() method inside JsonFormat is converting special characters to octal format. For example "A&BC" is converted to "A\u0026BC". Also, the resultant string has an extra backslash appended.
So finally "A&BC" is converted to the string "A\\u0026BC".
If it were "A\u0026BC" then I could have converted to a byte array and formed a string with it. But because of the additional backslash I am not able to do so.
Currently I am using protobuf version 3.7.1 and I tried to upgrade and check if any latest API is available, but it did not help. I searched online but did not find any references (a similar issue was reported for JSONFormat.printToString but this API is removed in a later version. https://github.com/carlomedas/protobuf-java-format/issues/16). Can someone please help here if you have come across this issue.
I think the problem might be that you're using that string to pass along, and it's getting parsed a 2nd time. If you use the printer, it will convert "A&BC" to "A\u0026BC". Then when Jackson parses that, it will append the 2nd backslash. To avoid this, you can use #JsonRawValue annotation to avoid being parsed with the 2nd backslash.

problem with caret and dollar sign regex in nodejs

in node js when i try to check for validation of incoming string using express-validator it doesn't match using
check('firstName').matches('^[a-zA-Z\s\'\-$]')
to parse firstName of incoming request body
Note I've edited the question to be like
check('firstName').matches('^[a-zA-Z\s\'\-]$')
I see two issues here:
The range issue because of \-. You should use double escaping character instead.
The given regex will only match the first character because the quantifier is missing. You should use the + (one or more characters) quantifier at the end of the regex for full match.
The correct regex for your case would be:
check('firstName').matches('^[a-zA-Z\s\'\\-$]+')
express was treating the string in different way than /regex/ this was the issue.

Azure Table Storage RowKey restricted Character Patterns?

Are there restricted character patterns within Azure TableStorage RowKeys? I've not been able to find any documented via numerous searches. However, I'm getting behavior that implies such in some performance testing.
I've got some odd behavior with RowKeys consisting on random characters (the test driver does prevent the restricted characters (/ \ # ?) plus blocking single quotes from occurring in the RowKey). The result is I've got a RowKey that will insert fine into the table, but cannot be queried (the result is InvalidInput). For example:
RowKey: 9}5O0J=5Z,4,D,{!IKPE,~M]%54+9G0ZQ&G34!G+
Attempting to query by this RowKwy (equality) will result in an error (both within our app, using Azure Storage Explorer, and Cloud Storage Studio 2). I took a look at the request being sent via Fiddler:
GET /foo()?$filter=RowKey%20eq%20'9%7D5O0J=5Z,4,D,%7B!IKPE,~M%5D%54+9G0ZQ&G34!G+' HTTP/1.1
It appears the %54 in the RowKey is not escaped in the filter. Interestingly, I get similar behavior for batch requests to table storage with URIs in the batch XML that include this RowKey. I've also seen similar behavior for RowKeys with embedded double quotes, though I have not isolated that pattern yet.
Has anyone co me across this behavior? I can easily restrict additional characters from occurring in RowKeys, but would really like to know the 'rules'.
The following characters are not allowed in PartitionKey and RowKey fields:
The forward slash (/) character
The backslash (\) character
The number sign (#) character
The question mark (?) character
Further Reading: Azure Docs > Understanding the Table service data model
public static readonly Regex DisallowedCharsInTableKeys = new Regex(#"[\\\\#%+/?\u0000-\u001F\u007F-\u009F]");
Detection of Invalid Table Partition and Row Keys:
bool invalidKey = DisallowedCharsInTableKeys.IsMatch(tableKey);
Sanitizing the Invalid Partition or Row Key:
string sanitizedKey = DisallowedCharsInTableKeys.Replace(tableKey, disallowedCharReplacement);
At this stage you may also want to prefix the sanitized key (Partition Key or Row Key) with the hash of the original key to avoid false collisions of different invalid keys having the same sanitized value.
Do not use the string.GetHashCode() though since it may produce different hash code for the same string and shall not be used to identify uniqueness and shall not be persisted.
I use SHA256: https://msdn.microsoft.com/en-us/library/s02tk69a(v=vs.110).aspx
to create the byte array hash of the invalid key, convert the byte array to hex string and prefix the sanitized table key with that.
Also see related MSDN Documentation:
https://msdn.microsoft.com/en-us/library/azure/dd179338.aspx
Related Section from the link:
Characters Disallowed in Key Fields
The following characters are not allowed in values for the PartitionKey and RowKey properties:
The forward slash (/) character
The backslash (\) character
The number sign (#) character
The question mark (?) character
Control characters from U+0000 to U+001F, including:
The horizontal tab (\t) character
The linefeed (\n) character
The carriage return (\r) character
Control characters from U+007F to U+009F
Note that in addition to the mentioned chars in the MSDN article, I also added the % char to the pattern since I saw in a few places where people mention it being problematic. I guess some of this also depends on the language and the tech you are using to access the table storage.
If you detect additional problematic chars in your case, then you can add those to the regex pattern, nothing else needs to change.
I just found out (the hard way) that the '+' sign is allowed, but not possible to query in PartitionKey.
I found that in addition to the characters listed in Igorek's answer, these also can cause problems (e.g. inserts will fail):
|
[]
{}
<>
$^&
Tested with the Azure Node.js SDK.
I transform the key using this function:
private static string EncodeKey(string key)
{
return HttpUtility.UrlEncode(key);
}
This needs to be done for the insert and for the retrieve of course.

Resources