How to deal with special characters in a string - string

I have a PHP script that creates an encoded value, for example:
m>^æ–S[J¯vÖ_ÕÚuÍÔ'´äœÈ‘ ®#M©t²#÷[Éå¹UçfU5T°äÙ“©”ˆÇVÝ] [’e™a«Ã°7#dÉJ>
I then need to decode this in a VB.net application.
The problem is that the value above can contain any characters, and VB.net can't handle it:
dim strCryptedString As String = 'm>^æ–S[J¯vÖ_ÕÚuÍÔ'´äœÈ‘ ®#M©t²#÷[Éå¹UçfU5T°äÙ“©”ˆÇVÝ] [’e™a«Ã°7#dÉJ>"
So any suggestions how to deal with that value?

Try base64encode and base64decode. That may be all that you need!

If you actually need to have it written out in your VB.net source code, you could try base64 encoding it:
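Dim strCryptedString As String = Encoding.UTF8.GetString(Convert.FromBase64String("bT5ew6bigJNTW0rCr3bDll/DlcOadcONw5QnwrTDpMWTw4jigJggwq5ATcKpdMKyI8O3W8OJw6XCuVXDp2ZVNVTCsMOkw5nigJzCqeKAncuGw4dWw51dIFvigJll4oSiYcKPwqvDg8KwNyNkw4lKPg=="))
In .NET the decoding function is Convert.FromBase64String, which returns a byte array. Encoding.UTF8.GetString (from System.Text) turns that back into a string, assuming the value was UTF-8 encoded before it was base64 encoded.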

When you read the string, read it into a byte array instead of a string. Then use the numeric value for the characters when you do the decoding.
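The two suggestions above (base64 for transport, a byte array for processing) can be sketched as follows. This is only an illustration in Python rather than PHP or VB.net, and the payload bytes are made up for the example; the real encoded value would come from the PHP script.

```python
import base64

# Hypothetical payload: arbitrary bytes that are not valid text in any encoding.
raw = bytes([0x6D, 0x3E, 0x5E, 0xE6, 0x96, 0x00, 0xFF, 0x27, 0x22])

# Producer side: base64 turns the bytes into plain ASCII that is safe to
# paste into source code or send over a text channel.
encoded = base64.b64encode(raw).decode("ascii")

# Consumer side: decode back into bytes and work with the numeric values,
# never treating the payload as text.
decoded = base64.b64decode(encoded)
values = list(decoded)  # one integer in 0..255 per byte
```

The same split applies in any language: only the base64 form is handled as a string, and the decoding step works on bytes.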

Related

Converting a List of String to List of Double without losing information (VB.net)

I got a List of String. I am losing information (the dot) when I try to convert an entry to type Double. What am I doing wrong?
Dim list As New List(Of String)
Dim a As Double
list.Add("309.69686")
a = CDbl(list(0))
MsgBox(a)
'Output: 30969686
This happens because in your locale the decimal separator is probably not a point but something else (usually a comma).
You are using the old VB6-style conversion function (CDbl), which always uses the current locale and gives you no way to specify a different one.
So, in the most basic form, you need to switch to the native .NET methods:
a = Double.Parse(list(0), CultureInfo.InvariantCulture)
Here we tell Parse which culture to use when converting the input string to a Double. InvariantCulture uses the point as the decimal separator.
Of course, if the input string comes from user input, you could face other problems (such as invalid numeric strings). In that case you should use Double.TryParse rather than Double.Parse.
If you have a German Windows, the dot is interpreted as the thousands separator. You must specify the culture explicitly if you need different behaviour.
Dim d = Double.Parse("309.69686", CultureInfo.InvariantCulture)
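To see why the dot disappears, here is a rough sketch in Python of separator-aware parsing. The helper parse_number is hypothetical and only mimics the idea; it is not how .NET implements Double.Parse.

```python
def parse_number(text: str, decimal_sep: str, group_sep: str) -> float:
    """Hypothetical helper: parse `text` using the given separators."""
    # Grouping separators carry no value, so they are dropped first;
    # then the decimal separator is normalised to "." for float().
    cleaned = text.replace(group_sep, "").replace(decimal_sep, ".")
    return float(cleaned)

# German-style conventions: "." groups thousands, "," marks decimals.
german = parse_number("309.69686", decimal_sep=",", group_sep=".")

# Invariant-style conventions: "," groups thousands, "." marks decimals.
invariant = parse_number("309.69686", decimal_sep=".", group_sep=",")

print(german)     # the dot was treated as a grouping separator and removed
print(invariant)  # the dot was treated as the decimal separator
```

With the German-style separators the same input becomes 30969686, which matches the output reported in the question.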

In Python 3, how can I convert ascii to string, *without encoding/decoding*

Python 3.6
I converted a string from utf8 to this:
b'\xe6\x88\x91\xe6\xb2\xa1\xe6\x9c\x89\xe7\x94\xb5#xn--ssdcsrs-2e1xt16k.com.au'
I now want that chunk of ascii back into string form, so there is no longer the little b for bytes at the beginning.
BUT I don't want it converted back to UTF-8; I want the same sequence of characters that you see above in my Python string.
How can I do so? All I can find are ways of converting bytes to string along with encoding or decoding.
The (wrong) answer is quite simple:
chr(asciiCode)
In your special case:
myString = ""
for char in b'\xe6\x88\x91\xe6\xb2\xa1\xe6\x9c\x89\xe7\x94\xb5#xn--ssdcsrs-2e1xt16k.com.au':
    myString += chr(char)
print(myString)
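gives (the bytes \x88, \x91, \x9c, \x89 and \x94 become non-printable control characters, so they are invisible here):
ææ²¡æçµ#xn--ssdcsrs-2e1xt16k.com.au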
Maybe you are also interested in the right answer? It will probably not please you, because it says you ALWAYS have to deal with encoding/decoding ... because myString is now both UTF-8 and ASCII at the same time (exactly as it already was before you "converted" it to ASCII).
Notice that how myString shows up when you print it depends on the implicit encoding/decoding used by print.
In other words ...
there is NO WAY to avoid encoding/decoding,
but there is a way of doing it implicitly.
I suppose that reading my answer to "Converting UTF-8 (in literal) to Umlaute" will help you a lot in understanding the whole encoding/decoding thing.
What you have there is not ASCII, as it contains, for instance, the byte \xe6, which is higher than 127. It's still UTF-8.
The representation of the string (with the 'b' at the start, then a ', then a '\', ...) is ASCII. You get it with repr(yourstring). But the contents of the string that you're printing are UTF-8.
I don't think you need to turn that back into a UTF-8 string, though that may depend on the rest of your code.
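The options discussed in these answers can be compared side by side. The sketch below assumes the bytes really are UTF-8, as the second answer argues; the variable names are only illustrative.

```python
data = b'\xe6\x88\x91\xe6\xb2\xa1\xe6\x9c\x89\xe7\x94\xb5#xn--ssdcsrs-2e1xt16k.com.au'

# The chr() loop maps each byte to the code point with the same number.
via_chr = "".join(chr(byte) for byte in data)

# Decoding as Latin-1 performs exactly the same byte-to-code-point mapping,
# so the loop is an implicit decode, not a way of avoiding one.
via_latin1 = data.decode("latin-1")

# Decoding as UTF-8 recovers the original text.
as_text = data.decode("utf-8")

# The ASCII representation "as seen on screen" (backslash escapes included)
# is the repr() of the bytes object with the leading b' and trailing ' removed.
literal = repr(data)[2:-1]
```

Which of these is wanted depends on what the string is used for afterwards: as_text is the readable text, while literal is the escaped form shown in the question.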

How to check if a string is plaintext or base64 format in Node.js

I want to check if the given string to my function is plain text or base64 format. I am able to encode a plain text to base64 format and reverse it.
But I could not figure out any way to validate whether the string is already in base64 format. Can anyone please suggest the correct way of doing this in Node.js? Is there an API for this already available in Node.js?
Valid base64 strings are a subset of all plain-text strings. Assuming we have a character string, the question is whether it belongs to that subset. One way is what Basit Anwer suggests. Those libraries require installing libicu though. A more portable way is to use the built-in Buffer:
Buffer.from(str, 'base64')
Unfortunately, this decoding function will not complain about non-Base64 characters. It will just ignore non-base64 characters. So, it alone will not help. But you can try encoding it back to base64 and compare the result with the original string:
Buffer.from(str, 'base64').toString('base64') === str
This check will tell whether str is pure base64 or not.
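For comparison, the same round-trip idea can be sketched in Python. The function name is_base64 is made up for this example, and the check only accepts canonical, padded base64 without whitespace.

```python
import base64

def is_base64(text: str) -> bool:
    """Hypothetical helper: True if `text` is canonical, padded base64."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently skipping them.
        decoded = base64.b64decode(text, validate=True)
    except ValueError:
        # Covers binascii.Error (bad characters or padding) and
        # non-ASCII input strings.
        return False
    # Re-encode and compare, mirroring the Buffer round trip above.
    return base64.b64encode(decoded).decode("ascii") == text

print(is_base64("aGVsbG8="))      # base64 of "hello"
print(is_base64("hello world!"))  # contains characters outside the alphabet
print(is_base64("test"))          # a plain word that is also valid base64
```

The last call shows the limit of any such check: a plain word can also be valid base64, so a positive result does not prove that the string was meant as base64.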
Encoding is byte-level.
If you're dealing with strings, all you can do is guess, or keep metadata with your string to identify its encoding.
But you can check these libraries out:
https://www.npmjs.com/package/detect-encoding
https://github.com/mooz/node-icu-charset-detector

python 3: how to make strip() work for bytes

I've converted some Python 2 code to Python 3.
In doing so, I've changed
print 'String: ' + somestring
into
print(b'String: '+somestring)
because I was getting the following error:
Can't convert 'bytes' object to str implicitly
But now I can't use string methods such as strip(), because the values are no longer treated as strings...
global name 'strip' is not defined
for
if strip(somestring)=="":
How should I solve this dilemma between switching string to bytes and being able to use string attributes? Is there a workaround?
Please help me out and thank you in advance..
There are two issues here: one is the actual issue, the other is confusing you but is not an actual issue. Firstly:
Your string is a bytes object, i.e. a string of 8-bit bytes. Python 3 handles this differently from text, which is Unicode. Where do you get the string from? Since you want to treat it as text, you should probably convert it to a str object, which is used to handle text. This is typically done with the .decode() method, i.e.:
somestring.decode('UTF-8')
Although calling str() also works:
str(somestring, 'UTF8')
(Note that your encoding might be something other than UTF-8.)
However, this is not your actual question. Your actual question is how to strip a bytes string. And the answer is that you do that the same way as you strip a text string:
somestring.strip()
There is no strip() builtin in either Python 2 or Python 3. There is a strip-function in the string module in Python 2:
from string import strip
But it hasn't been good practice to use that since strings got a strip() method, which is like ten years or so now. So in Python 3 it is gone.
>>> b'foo '.strip()
b'foo'
Works just fine.
If what you're dealing with is text, though, you probably should just have an actual str object, not a bytes object.
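A short sketch of both routes described above, assuming the data is UTF-8 encoded (the sample value is made up):

```python
somestring = b"  hello \n"

# bytes objects have their own strip() method; the result is still bytes.
stripped_bytes = somestring.strip()

# Decoding first gives a str, and strip() then works on text.
stripped_text = somestring.decode("utf-8").strip()

# The emptiness test from the question, written as a method call.
# Note the comparison is against b"", not "".
is_blank = b"   ".strip() == b""

# When stripping specific characters from bytes, the argument must be bytes too.
trimmed = b"xxhixx".strip(b"x")
```

Comparing a bytes result with a str such as "" is always False in Python 3, which is an easy mistake to make when porting Python 2 code.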
I believe you can use the "str" function to convert it to a string. In Python 3 you need to pass the encoding, otherwise you get the "b'...'" representation rather than the text:
print(str(somestring, "utf-8").strip())
However, if the object is already a string, you don't get a new one. So if you're not sure whether an object is a string and call str(obj) to make sure, no new object is created if it's already a string.
>>> x = '123'
>>> id(x)
2075707536496
>>> y = str(x)
>>> id(y)
2075707536496

How to convert a string to a byte array which is compiled with a given charset in Go?

In Java, we can use the String method byte[] getBytes(Charset charset).
This method encodes a String into a sequence of bytes using the given charset, storing the result in a new byte array.
But how do I do this in Go?
Is there a similar way to do this in Go?
Please let me know.
The standard Go library only supports Unicode (UTF-8, UTF-16, UTF-32) and ASCII encoding. ASCII is a subset of UTF-8.
The go-charset package supports conversion to and from UTF-8, and it also links to the GNU iconv library.
See also field CharsetReader in encoding/xml.Decoder.
I believe here is an answer: https://stackoverflow.com/a/6933412/1315563
There is no way to do it without writing the conversion yourself or using a third-party package. You could try using this:
http://code.google.com/p/go-charset
