Does MATLAB provide a lossless conversion function from double to string?

tl;dr
I'm just looking for two functions, f from double to string and g from string to double, such that g(f(d)) == d for any double d (scalar and real double).
Original question
How do I convert a double to a string or char array in a reversible way? I mean, in such a way that afterward I can convert that string/char array back to double retrieving the original result.
I've found formattedDisplayText, and in some situations it works:
>> x = eps
x =
2.220446049250313e-16
>> double(formattedDisplayText(x, 'NumericFormat', 'long')) - x
ans =
0
But in others it doesn't:
>> x = rand(1)
x =
0.546881519204984
>> double(formattedDisplayText(x, 'NumericFormat', 'long')) - x
ans =
1.110223024625157e-16
As regards this and other tools like num2str, mat2str, at the end they all require me to decide a precision, whereas I would like to express the idea of "use whatever precision is needed for you (MATLAB) to be able to read back your own number".

Here are two simpler solutions to convert a single double value to a string and back without loss.
I want the string to be a human-readable representation of the number
Use mat2str to obtain 17 significant decimal digits in string form, and str2double to convert back:
>> s = mat2str(x,17)
s =
'2.2204460492503131e-16'
>> y = str2double(s);
>> y==x
ans =
logical
1
Note that 17 digits are always enough to represent any IEEE double-precision floating-point number.
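The 17-digit claim can be checked outside MATLAB as well. The following is a minimal sketch in Python, not MATLAB, assuming Python floats are IEEE double-precision values (true on common platforms). The helper names double_to_str and str_to_double are made up for this illustration.

```python
import math
import random

def double_to_str(x: float) -> str:
    """Format x with 17 significant digits (printf-style '%.17g')."""
    return format(x, ".17g")

def str_to_double(s: str) -> float:
    """Parse the decimal string back into a double."""
    return float(s)

# Spot-check the round trip on a few fixed values and some random ones.
samples = [2.0 ** -52, math.sqrt(math.pi), 0.1, 1e308, 5e-324]
samples += [random.random() for _ in range(1000)]
assert all(str_to_double(double_to_str(x)) == x for x in samples)

print(double_to_str(2.0 ** -52))  # 2.2204460492503131e-16
```

This is only a spot check on finite values, not a proof; NaN would need separate handling because NaN never compares equal to itself.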
I want a more compact string representation of the number
Use matlab.net.base64encode to encode the 8 bytes of the number. Unfortunately you can only encode strings and integer arrays, so we type cast to some integer array (we use uint8 here, but uint64 would work too). We reverse the process to get the same double value back:
>> s = matlab.net.base64encode(typecast(x,'uint8'))
s =
'AAAAAAAAsDw='
>> y = typecast(matlab.net.base64decode(s),'double');
>> x==y
ans =
logical
1
Base64 encodes every 3 bytes in 4 characters, so this is the most compact representation you can easily create. A more complex algorithm could likely convert into a smaller UTF-8-encoded string (which uses more than 6 bits per displayable character).
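For comparison, the same byte-level encoding can be sketched in Python with the standard struct and base64 modules. This assumes the little-endian byte order that typecast produces on typical x86 machines. The function names are hypothetical.

```python
import base64
import struct

def double_to_b64(x: float) -> str:
    """Pack x into its 8 raw bytes (little-endian) and Base64-encode them."""
    raw = struct.pack("<d", x)
    return base64.b64encode(raw).decode("ascii")

def b64_to_double(s: str) -> float:
    """Decode the Base64 text and reinterpret the 8 bytes as a double."""
    return struct.unpack("<d", base64.b64decode(s))[0]

eps = 2.0 ** -52  # same value as MATLAB's eps
print(double_to_b64(eps))  # AAAAAAAAsDw=
```

Eight bytes always encode to 12 Base64 characters (11 significant characters plus one '=' padding character).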

Function f: from double real-valued scalar x to char vector str
str = num2str(typecast(x, 'uint8'));
str is built as a string containing 8 numbers, which correspond to the bytes in the internal representation of x. The function typecast extracts the bytes as a numerical vector, and num2str converts to a char vector with numbers separated by spaces.
Function g: from char vector str to double real-valued scalar y
y = typecast(uint8(str2double(strsplit(str))), 'double');
The char vector is split at spaces using strsplit. The result is a cell array of char vectors, each of which is then interpreted as a number by str2double, which produces a numerical vector. The numbers are cast to uint8 and then typecast interprets them as the internal representation of a double real-valued scalar.
Note that str2double(strsplit(str)) is preferred over the simpler str2num(str), because str2num internally calls eval, which is considered bad practice.
Example
>> format long
>> x = sqrt(pi)
x =
1.772453850905516
>> str = num2str(typecast(x, 'uint8'))
str =
'106 239 180 145 248 91 252 63'
>> y = typecast(uint8(str2double(strsplit(str))), 'double')
y =
1.772453850905516
>> x==y
ans =
logical
1
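The same pair of functions can be sketched in Python for readers who want to compare the byte values. This is an illustration rather than a translation of the MATLAB internals: it assumes little-endian byte order, and the function names are invented for the example.

```python
import math
import struct

def double_to_byte_str(x: float) -> str:
    """Write the 8 bytes of x as decimal numbers separated by spaces."""
    return " ".join(str(b) for b in struct.pack("<d", x))

def byte_str_to_double(s: str) -> float:
    """Split at whitespace, rebuild the 8 bytes, reinterpret as a double."""
    raw = bytes(int(token) for token in s.split())
    return struct.unpack("<d", raw)[0]

x = math.sqrt(math.pi)
s = double_to_byte_str(x)
assert byte_str_to_double(s) == x  # lossless round trip
```

Because split() with no argument collapses runs of whitespace, the decoder also accepts strings in which the numbers are padded with extra spaces.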

Related

Understanding Python sequence

I am doing a HackerRank exercise called Flipping bits: given a list of 32-bit unsigned integers, flip all the bits (1->0 and 0->1) and return the result as an unsigned integer.
The correct code is:
def flippingBits(n):
    seq = format(n, '032b')
    return int(''.join(['0' if bit == '1' else '1' for bit in seq]), 2)
I don't understand the last line: what does the ''. part do? And why is there a ,2 at the end?
I have understood most of the code but need help in understanding the last part.
what does the ''. part do
'' represents an empty string, which is used as the separator when joining the elements of a collection into a single string (some examples can be found here)
and why is there a ,2 at the end?
from int docs:
class int(x=0)
class int(x, base=10)
Return an integer object constructed from a number or string x
In this case it will parse the string provided in binary format (i.e. with base 2) into int.
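A small sketch of both pieces in isolation may help. The variable names are arbitrary.

```python
bits = ["1", "0", "1"]

joined = "".join(bits)   # empty separator: the pieces are glued together
dashed = "-".join(bits)  # any other string can serve as the separator

as_base2 = int(joined, 2)  # read '101' as a binary number
as_base10 = int(joined)    # default base 10: read '101' as one hundred and one

print(joined, dashed, as_base2, as_base10)  # 101 1-0-1 5 101
```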
I hope the below explanation helps:
def flippingBits(n):
    # format n as a zero-padded, 32-character binary (base 2) string
    seq = format(n, '032b')
    return int(''.join(['0' if bit == '1' else '1' for bit in seq]), 2)
    # ['0' if bit == '1' else '1' for bit in seq] => build a list of characters from the "seq" string,
    #     turning every '1' into '0' and every other character into '1'; then
    # ''.join(char_list) => build a string by joining the characters in char_list
    #     with nothing between them ('' means empty delimiter); then
    # int(num_string, 2) => convert num_string from a string to an integer, reading it in base 2
Notice that you can do the bit flipping by using bit-wise operations without converting to string back and forth.
def flippingBits(n):
    inverted_n = ~n  # flip all bits; Python ints are signed, so this equals -n - 1
    return inverted_n + 2**32  # shift the negative result back into the unsigned 32-bit range
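Another common way to write the bit-wise version is to XOR with a 32-bit mask, which avoids the negative intermediate value. This is a sketch, assuming the input is always in the range 0 to 2**32 - 1. The names MASK_32, flip_bits_xor and flip_bits_str are made up here.

```python
MASK_32 = 0xFFFFFFFF  # 32 one-bits

def flip_bits_xor(n: int) -> int:
    """XOR with all ones flips each of the low 32 bits."""
    return n ^ MASK_32

def flip_bits_str(n: int) -> int:
    """String-based version from the question, kept for comparison."""
    seq = format(n, "032b")
    return int("".join("0" if bit == "1" else "1" for bit in seq), 2)

# Both approaches agree on a few sample inputs.
for value in (0, 1, 9, 2147483647, 4294967295):
    assert flip_bits_xor(value) == flip_bits_str(value)
```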

Converting int to string then back to int

How do I call out a particular digit from a number. For example: bringing out 6 from 768, then using 6 to multiply 3. I've tried using the code below, but it does not work.
digits = []
digits = str(input("no:"))
print (int(digits[1] * 5))
If my input is 234, the value at index [1] is 3; how can I multiply that 3 by 5?
input() returns a string (whether or not you explicitly convert it to str() again), so digits[1] is still a single character string.
You need to convert that single digit to an integer with int(), not the result of the multiplication:
print (int(digits[1]) * 5)
All I did was move a ) parenthesis there.
Your mistake was to multiply the single-character string; multiplying a string by n produces that string repeated n times.
digits[1] = '3' so digits[1] * 5 = '33333'. You want int(digits[1]) * 5.
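The difference between the two placements of the parenthesis can be seen without calling input(). In this sketch the hypothetical helper digit_times takes the text as an argument instead.

```python
def digit_times(number_text: str, index: int, factor: int) -> int:
    """Convert the chosen digit to int first, then multiply."""
    return int(number_text[index]) * factor

text = "234"
repeated = text[1] * 5            # string repetition
product = digit_times(text, 1, 5) # integer multiplication

print(repeated, product)  # 33333 15
```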

Matlab matrix string manipulation

I have a matrix whose values are written like 5.34000E+5. When I create a string variable from mat(1,1), which contains 5.34000E+5, MATLAB creates a string containing 534000. How can I create a string like 5.34000E+5?
Thanks
You need to specify the formatting while converting:
>> number = 534000
number = 534000
>> s = num2str(number,'%10.5e\n')
s =
5.34000e+05
>> class(s)
ans = char
You can use sprintf
num = 534000;
str = sprintf('%.0f',num);
str2 = sprintf('%e',num);
disp(str);
disp(str2);
Here, % introduces a format specifier, f means fixed-point notation, .0 means no digits after the decimal point, and e means exponential notation. For more info on this see sprintf format specifiers.
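These specifiers follow the C printf conventions, so their effect can also be tried in Python, whose % operator uses the same notation. This is only an illustration of the specifiers, not of MATLAB's sprintf itself.

```python
num = 534000

fixed = "%.0f" % num        # fixed-point, no digits after the decimal point
default_exp = "%e" % num    # exponential, default of 6 digits after the point
five_exp = "%.5e" % num     # exponential, 5 digits after the point

print(fixed)        # 534000
print(default_exp)  # 5.340000e+05
print(five_exp)     # 5.34000e+05
```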

How to compute word scores in Scrabble using MATLAB

I have a homework program I have run into a problem with. We basically have to take a word (such as MATLAB) and have the function give us the correct score value for it using the rules of Scrabble. There are other things involved such as double word and double point values, but what I'm struggling with is converting to ASCII. I need to get my string into ASCII form and then sum up those values. We only know the bare basics of strings and our teacher is pretty useless. I've tried converting the string into numbers, but that's not exactly working out. Any suggestions?
function[score] = scrabble(word, letterPoints)
doubleword = '#';
doubleletter = '!';
doublew = [findstr(word, doubleword)]
trouble = [findstr(word, doubleletter)]
word = char(word)
gameplay = word;
ASCII = double(gameplay)
score = lower(sum(ASCII));
Building on Francis's post, what I would recommend you do is create a lookup array. You can certainly convert each character into its ASCII equivalent, but then what I would do is have an array where the input is the ASCII code of the character you want (with a bit of modification), and the output will be the point value of the character. Once you find this, you can sum over the points to get your final point score.
I'm going to leave out double points, double letters, blank tiles and that whole gamut of fun stuff in Scrabble for now in order to get what you want working. By consulting Wikipedia, this is the point distribution for each letter encountered in Scrabble.
1 point: A, E, I, O, N, R, T, L, S, U
2 points: D, G
3 points: B, C, M, P
4 points: F, H, V, W, Y
5 points: K
8 points: J, X
10 points: Q, Z
What we're going to do is convert your word into lower case to ensure consistency. Now, if you take a look at the letter a, this corresponds to ASCII code 97. You can verify that by using the double function we talked about earlier:
>> double('a')
97
As there are 26 letters in the alphabet, this means that going from a to z should go from 97 to 122. Because MATLAB starts indexing arrays at 1, what we can do is subtract 96 from each character code so that we'll be able to figure out the numerical position of these characters from 1 to 26.
Let's start by building our lookup table. First, I'm going to define a whole bunch of strings. Each string denotes the letters that are associated with each point in Scrabble:
string1point = 'aeionrtlsu';
string2point = 'dg';
string3point = 'bcmp';
string4point = 'fhvwy';
string5point = 'k';
string8point = 'jx';
string10point = 'qz';
Now, we can use each of the strings, convert to double, subtract by 96 then assign each of the corresponding locations to the points for each letter. Let's create our lookup table like so:
lookup = zeros(1,26);
lookup(double(string1point) - 96) = 1;
lookup(double(string2point) - 96) = 2;
lookup(double(string3point) - 96) = 3;
lookup(double(string4point) - 96) = 4;
lookup(double(string5point) - 96) = 5;
lookup(double(string8point) - 96) = 8;
lookup(double(string10point) - 96) = 10;
I first create an array of length 26 through the zeros function. I then figure out where each letter goes and assign to each letter their point values.
Now, the last thing you need to do is take a string, convert it to lower case to be sure, then convert each character into its ASCII equivalent, subtract 96, look up the point value of each letter and sum up the values. If we are given... say... MATLAB:
stringToConvert = 'MATLAB';
stringToConvert = lower(stringToConvert);
ASCII = double(stringToConvert) - 96;
value = sum(lookup(ASCII));
Lo and behold... we get:
value =
10
The last line of the above code is crucial. Basically, ASCII will contain a bunch of indexing locations where each number corresponds to the numerical position of where the letter occurs in the alphabet. We use these positions to look up what point / score each letter gives us, and we sum over all of these values.
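The lookup-table idea carries over to other languages. Here is a rough Python sketch of the same approach, assuming only the letters a to z appear in the word. Since Python indexes from 0, the offset is ord('a') rather than 96. The names POINT_GROUPS, LOOKUP and scrabble_score are invented for this example.

```python
# Letters grouped by point value, as in the table above.
POINT_GROUPS = {
    1: "aeionrtlsu", 2: "dg", 3: "bcmp", 4: "fhvwy",
    5: "k", 8: "jx", 10: "qz",
}

# LOOKUP[i] holds the points for the i-th letter of the alphabet (0-based).
LOOKUP = [0] * 26
for points, letters in POINT_GROUPS.items():
    for ch in letters:
        LOOKUP[ord(ch) - ord("a")] = points

def scrabble_score(word: str) -> int:
    """Sum the point values of the letters, ignoring case."""
    return sum(LOOKUP[ord(ch) - ord("a")] for ch in word.lower())

print(scrabble_score("MATLAB"))  # 10
```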
Part #2
The next part where double point values and double words come to play can be found in my other StackOverflow post here:
Calculate Scrabble word scores for double letters and double words MATLAB
Convert from string to ASCII:
>> myString = 'hello, world';
>> ASCII = double(myString)
ASCII =
104 101 108 108 111 44 32 119 111 114 108 100
Sum up the values:
>> total = sum(ASCII)
total =
1160
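For reference, the equivalent of double(myString) in Python is ord applied to each character. A short sketch with a hypothetical helper name:

```python
def ascii_codes(text: str) -> list:
    """Return the character code of each character in text."""
    return [ord(ch) for ch in text]

codes = ascii_codes("hello, world")
total = sum(codes)

print(codes)  # [104, 101, 108, 108, 111, 44, 32, 119, 111, 114, 108, 100]
print(total)  # 1160
```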
The MATLAB help for char() says (emphasis added):
S = char(X) converts array X of nonnegative integer codes into a character array. Valid codes range from 0 to 65535, where codes 0 through 127 correspond to 7-bit ASCII characters. The characters that MATLAB® can process (other than 7-bit ASCII characters) depend upon your current locale setting. To convert characters into a numeric array, use the double function.
ASCII chart here.

Explain the use of "str - '0'" when converting a string to an integer.

I have noticed that a really cool method to convert a string, say
str = '1234'
to a vector is to use this trick.
vec = str - '0'
= [1 2 3 4]
My question is why does this method work?
Further, something like:
vec1 = str -'1'
= [0 1 2 3]
but
vec2 = str - '10'
Error using -
Matrix dimensions must agree.
What is taking place here?
When you use arithmetic operators with strings, MATLAB casts the strings to doubles, which converts each character to its ASCII value:
>> double('1')
ans =
49
Thus, subtraction will work just fine, though addition will give weird results
>> '1'+'1'
ans =
98
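Converting a multi-character string to double results in an array of doubles, one per character. '10' therefore becomes a 1-by-2 array, which cannot be subtracted from the 1-by-4 array for '1234', hence the "matrix dimensions must agree" error: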
>> double('10')
ans =
49 48
While subtracting '0' is a cool shortcut, I suggest you use STR2DOUBLE instead to avoid confusion.
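The arithmetic behind the trick is the same in any language that exposes character codes. As an illustration in Python, where the conversion must be spelled out with ord because strings do not support subtraction, a sketch with a hypothetical helper digits_of:

```python
def digits_of(text: str, base_char: str = "0") -> list:
    """Subtract the code of base_char from the code of each character."""
    return [ord(ch) - ord(base_char) for ch in text]

print(digits_of("1234"))       # [1, 2, 3, 4]  ('0' has code 48)
print(digits_of("1234", "1"))  # [0, 1, 2, 3]  ('1' has code 49)
print(ord("1") + ord("1"))     # 98, the "weird" result of '1'+'1'
```

The digit characters '0' to '9' have consecutive codes 48 to 57, which is why subtracting the code of '0' yields the digit values.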
