Parsing a string in MATLAB

I have a 32-bit string that I want to split into groups of 8 bits. Then I want to convert each 8-bit binary group into a single integer, for example:
str = '00000001000000100000001100000100'
output = '1 2 3 4'
I know to use bin2dec, but I'm having difficulty parsing the string.

In MATLAB, every string (char array) is a matrix, so you can exploit that property. If 8 bits belong to one byte, reshape your data to have one byte per row:
reshape(str,8,[]).'
Doing so, you can apply bin2dec to get the output:
output=bin2dec(reshape(str,8,[]).')
This returns a column vector [1;2;3;4]; use num2str(output.') if you want a char array instead.
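For comparison outside MATLAB, here is a minimal Python sketch of the same idea: cut the bit string into fixed-width groups and parse each group as a base-2 number. The helper name bits_to_bytes and its width parameter are made up for this illustration.

```python
def bits_to_bytes(bits: str, width: int = 8) -> list:
    """Split a bit string into fixed-width groups and convert each to an integer."""
    if len(bits) % width != 0:
        raise ValueError("length of bit string must be a multiple of the group width")
    # Slice out consecutive groups of `width` characters, then parse each as base 2.
    return [int(bits[i:i + width], 2) for i in range(0, len(bits), width)]


bit_string = "00000001000000100000001100000100"
values = bits_to_bytes(bit_string)
print(" ".join(str(v) for v in values))  # prints: 1 2 3 4
```

The length check plays the role of reshape's requirement that the number of characters be divisible by 8.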

Another possibility:
>> 2.^(7:-1:0)*reshape(str-'0',8,[])
ans =
1 2 3 4
Of course, apply num2str if you need the output in the form of a string.
A more esoteric way:
>> fliplr(typecast(uint32(2.^(31:-1:0)*(str-'0').'),'uint8'))
ans =
1 2 3 4
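The typecast variant amounts to reading all 32 bits as one unsigned integer and then looking at its four bytes. A rough Python equivalent is sketched below. Note the assumption that the fliplr in the MATLAB version is there because typecast returns bytes in the machine's (typically little-endian) order, whereas here the byte order is stated explicitly.

```python
bit_string = "00000001000000100000001100000100"

# Parse all 32 bits as one unsigned integer, then take its four bytes,
# most significant byte first.
word = int(bit_string, 2)
byte_values = list(word.to_bytes(4, byteorder="big"))
print(byte_values)  # prints: [1, 2, 3, 4]
```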

Seeing as how Ben Voigt didn't provide an answer to this question even though he pretty much answered the question in the comments, I will provide one for closure. As he said, you can segment your string into 8 characters each. Strings are essentially an array of characters. As such, split up your string into 8 characters each, then apply bin2dec on each of the strings.
str = '00000001000000100000001100000100';
byte1 = bin2dec(str(1:8));
byte2 = bin2dec(str(9:16));
byte3 = bin2dec(str(17:24));
byte4 = bin2dec(str(25:32));
output = num2str([byte1 byte2 byte3 byte4]);
>> output
output =
1 2 3 4
Looking at your example output, you desire output to be a string, and thus the num2str call in the last line of the code.

Related

Does MATLAB provide a lossless conversion function from double to string?

tl;dr
I'm just looking for two functions, f from double to string and g from string to double, such that g(f(d)) == d for any double d (scalar and real double).
Original question
How do I convert a double to a string or char array in a reversible way? I mean, in such a way that afterward I can convert that string/char array back to double retrieving the original result.
I've found formattedDisplayText, and in some situations it works:
>> x = eps
x =
2.220446049250313e-16
>> double(formattedDisplayText(x, 'NumericFormat', 'long')) - x
ans =
0
But in others it doesn't
x = rand(1)
x =
0.546881519204984
>> double(formattedDisplayText(x, 'NumericFormat', 'long')) - x
ans =
1.110223024625157e-16
As regards this and other tools like num2str, mat2str, at the end they all require me to decide a precision, whereas I would like to express the idea of "use whatever precision is needed for you (MATLAB) to be able to read back your own number".
Here are two simpler solutions to convert a single double value to a string and back without loss.
I want the string to be a human-readable representation of the number
Use mat2str to obtain 17 significant decimal digits in string form, and str2double to convert back:
>> s = mat2str(x,17)
s =
'2.2204460492503131e-16'
>> y = str2double(s);
>> y==x
ans =
logical
1
Note that 17 significant decimal digits are always enough to uniquely identify any IEEE double-precision floating-point number.
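The 17-digit round trip is not specific to MATLAB. As a small cross-check, here is a Python sketch that formats a few doubles with 17 significant digits and parses them back. The function names to_text and from_text are placeholders for the f and g in the question, and the sample values are arbitrary.

```python
import math
import sys

def to_text(x: float) -> str:
    # 17 significant decimal digits are enough to identify any finite double.
    return format(x, ".17g")

def from_text(s: str) -> float:
    return float(s)

samples = [sys.float_info.epsilon, 0.1, math.sqrt(math.pi), 1e300, -5e-324]
for value in samples:
    assert from_text(to_text(value)) == value

print(to_text(sys.float_info.epsilon))  # prints: 2.2204460492503131e-16
```

In Python specifically, repr(x) already produces the shortest string that reads back to the same double, which is close to the "use whatever precision is needed" behaviour asked for.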
I want a more compact string representation of the number
Use matlab.net.base64encode to encode the 8 bytes of the number. Unfortunately you can only encode strings and integer arrays, so we type cast to some integer array (we use uint8 here, but uint64 would work too). We reverse the process to get the same double value back:
>> s = matlab.net.base64encode(typecast(x,'uint8'))
s =
'AAAAAAAAsDw='
>> y = typecast(matlab.net.base64decode(s),'double');
>> x==y
ans =
logical
1
Base64 encodes every 3 bytes in 4 characters (6 bits per character); this is the most compact representation you can easily create. A more complex algorithm could likely convert into a smaller UTF-8-encoded string (which uses more than 6 bits per displayable character).
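The same base64 round trip can be sketched in Python with the standard struct and base64 modules. The helper names are invented here, and the little-endian format "<d" is an assumption chosen to match the byte order that typecast produces on common hardware.

```python
import base64
import struct
import sys

def encode_double(x: float) -> str:
    # Pack the double as 8 little-endian bytes, then base64-encode them.
    return base64.b64encode(struct.pack("<d", x)).decode("ascii")

def decode_double(s: str) -> float:
    # Reverse the steps: base64-decode to 8 bytes, then unpack as a double.
    return struct.unpack("<d", base64.b64decode(s))[0]

text = encode_double(sys.float_info.epsilon)
print(text)  # prints: AAAAAAAAsDw=
assert decode_double(text) == sys.float_info.epsilon
```

Eight bytes always encode to 12 characters, including one padding character.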
Function f: from double real-valued scalar x to char vector str
str = num2str(typecast(x, 'uint8'));
str is built as a string containing 8 numbers, which correspond to the bytes in the internal representation of x. The function typecast extracts the bytes as a numerical vector, and num2str converts to a char vector with numbers separated by spaces.
Function g: from char vector str to double real-valued scalar y
y = typecast(uint8(str2double(strsplit(str))), 'double');
The char vector is split at spaces using strsplit. The result is a cell array of char vectors, each of which is then interpreted as a number by str2double, which produces a numerical vector. The numbers are cast to uint8 and then typecast interprets them as the internal representation of a double real-valued scalar.
Note that str2double(strsplit(str)) is preferred over the simpler str2num(str), because str2num internally calls eval, which is considered bad practice.
Example
>> format long
>> x = sqrt(pi)
x =
1.772453850905516
>> str = num2str(typecast(x, 'uint8'))
str =
'106 239 180 145 248 91 252 63'
>> y = typecast(uint8(str2double(strsplit(str))), 'double')
y =
1.772453850905516
>> x==y
ans =
logical
1
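For readers who want to try the byte-list idea elsewhere, a minimal Python sketch follows. It writes the eight bytes of the double as decimal numbers separated by spaces and reads them back. The function names and the explicit little-endian byte order are assumptions made for the example.

```python
import math
import struct

def double_to_byte_text(x: float) -> str:
    # Eight byte values in little-endian order, written as decimal numbers.
    return " ".join(str(b) for b in struct.pack("<d", x))

def byte_text_to_double(s: str) -> float:
    # split() with no argument tolerates runs of spaces between the numbers.
    return struct.unpack("<d", bytes(int(tok) for tok in s.split()))[0]

x = math.sqrt(math.pi)
text = double_to_byte_text(x)
assert byte_text_to_double(text) == x
```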

Making one string the anagram of other

I have a problem where two strings of same length are given, and I have to tell how many letters I have to change in the first string to make it an anagram of the second.
Here is what I did:
count = 0
Mutable_str = ''.join(sorted("hhpddlnnsjfoyxpci"))
Ref_str = ''.join(sorted("ioigvjqzfbpllssuj"))
i = 0
while i < len(Mutable_str):
    if Mutable_str[i] != Ref_str[i]:
        count += 1
    i += 1
print(count)
My algorithm in this case returned 16 as result. But the correct answer is 10. Can someone tell me what is wrong in my code?
Thank you very much!
You need to use str.count
So you need to add up the differences between the number of occurrences of each character in the two strings. This can be done with str.count(c), where c is each distinct character in the second string (obtained with set()). We then need to use max() on the difference with 0, so that a negative difference doesn't affect the total.
So as you can see, it boils down to one neat little one-liner:
def changes(s1, s2):
    return sum(max(0, s2.count(c) - s1.count(c)) for c in set(s2))
and some tests:
>>> changes("hhpddlnnsjfoyxpci", "ioigvjqzfbpllssuj")
10
>>> changes("abc", "bcd")
1
>>> changes("jimmy", "bobby")
4
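It may help to see why comparing the sorted strings position by position overcounts: one extra letter early in the sorted order shifts every later position, so letters that both strings share no longer line up. Counting letters avoids that. Below is an alternative sketch of the same calculation using collections.Counter. The length check is an addition based on the problem statement, not part of the one-liner above.

```python
from collections import Counter

def changes(s1: str, s2: str) -> int:
    """Number of letters in s1 that must change to make it an anagram of s2."""
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    # Counter subtraction drops zero and negative counts, so what remains is
    # the multiset of letters that s2 needs and s1 cannot supply.
    missing = Counter(s2) - Counter(s1)
    return sum(missing.values())

print(changes("hhpddlnnsjfoyxpci", "ioigvjqzfbpllssuj"))  # prints: 10
```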

Pattern Matching BASIC programming Language and Universe Database

I need to identify the following patterns in a string.
- "2N':'2N':'2N"
- "2N'-'2N'-'2N"
- "2N'/'2N'/'2N"
- "2N'/'2N'-'2N"
and so on.
Basically, I want this pattern, written in plain language:
2 NUMBERS [: / -] 2 NUMBERS [: / -] 2 NUMBERS
Is there any way I could write one pattern that covers all the possible scenarios? Otherwise I have to write 9 patterns in total and match each of them against the string. And that is not even the real scenario in my code: there I have to match four 2-digit numbers separated by [: / -], for which I would have to write 27 patterns in total. For the purpose of explanation I have used the scenario with three 2-digit numbers.
Please help me. Thank you.
Maybe you could try something like (Pick R83 style)
OK = X MATCH "2N1X2N1X2N" AND X[3,1]=X[6,1] AND INDEX(":/-",X[3,1],1) > 0
Here the variable X is some input string such as 12-34-56.
This should set the variable OK to 1 if validation passes, and to 0 for any invalid format.
This seems to get all your required validation into a single statement. I have assumed that the non-numeric characters have to be the same. If this is not true, the check could be changed to something like:
OK = X MATCH "2N1X2N1X2N" AND INDEX(":/-",X[3,1],1) > 0 AND INDEX(":/-",X[6,1],1) > 0
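To make the two validation rules concrete outside Pick BASIC, here is a rough Python sketch using regular expressions. It is only an analogue: the pattern names and the is_valid helper are invented for this example, and MATCH itself does not use this syntax.

```python
import re

# Two digits, a separator, two digits, a separator, two digits.
ANY_SEPARATORS = re.compile(r"[0-9]{2}[:/-][0-9]{2}[:/-][0-9]{2}")
# Same shape, but the second separator must repeat the first (backreference \1).
SAME_SEPARATOR = re.compile(r"[0-9]{2}([:/-])[0-9]{2}\1[0-9]{2}")

def is_valid(text: str, same_separator: bool = False) -> bool:
    pattern = SAME_SEPARATOR if same_separator else ANY_SEPARATORS
    # fullmatch requires the whole string to fit the pattern.
    return pattern.fullmatch(text) is not None

print(is_valid("12-34-56"))                       # prints: True
print(is_valid("12/34-56"))                       # prints: True
print(is_valid("12/34-56", same_separator=True))  # prints: False
```

A character class such as [:/-] is what removes the need for 9 (or 27) separate patterns.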
Ok, I guess the requirement of surrounding characters was not obvious to me. Still, it does not make it much harder. You just need to 'parse' the string looking for the first (I assume) such pattern (if any) in the input string. This can be done in a couple of lines of code. Here is a (rather untested ) R83 style test program:
PROMPT ":"
LOOP
   LOOP
      CRT 'Enter test string':
      INPUT S
   WHILE S # "" AND LEN(S) < 8 DO
      CRT "Invalid input! Hit RETURN to exit, or enter a string with >= 8 chars!"
   REPEAT
UNTIL S = "" DO
   *
   * Look for 1st occurrence of pattern in string..
   CARDNUM = ""
   FOR I = 1 TO LEN(S)-7 WHILE CARDNUM = ""
      IF S[I,8] MATCH "2N1X2N1X2N" THEN
         IF INDEX(":/-",S[I+2,1],1) > 0 AND INDEX(":/-",S[I+5,1],1) > 0 THEN
            CARDNUM = S[I,8]  ;* Found it!
         END ELSE I = I + 8
      END
   NEXT I
   *
   CRT CARDNUM
REPEAT
There are only 7 or 8 lines here that actually look for the card number pattern in the source/test string.
Not quite perfect, but how about 2N1X2N1X2N? This gets you 2 numbers followed by 1 character of any kind, followed by 2 numbers, and so on.
This might help:
BIG.STRING ="HELLO TILDE ~ CARD 12:34:56 IS IN THIS STRING"
TEMP.STRING = BIG.STRING
CONVERT "~:/-" TO "*~~~" IN TEMP.STRING
IF TEMP.STRING MATCHES '0X2N"~"2N"~"2N0X' THEN
   FIRST.TILDE.POSN = INDEX(TEMP.STRING,"~",1)
   CARD.STRING = BIG.STRING[FIRST.TILDE.POSN-2,8]
   PRINT CARD.STRING
END
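The "find the first occurrence inside a longer string" step from the answers above can also be sketched in Python for comparison. The names CARD_PATTERN and find_first are hypothetical, and this is a regular-expression analogue rather than a translation of the BASIC code.

```python
import re

CARD_PATTERN = re.compile(r"[0-9]{2}[:/-][0-9]{2}[:/-][0-9]{2}")

def find_first(text: str) -> str:
    """Return the first substring matching the pattern, or '' if there is none."""
    match = CARD_PATTERN.search(text)
    return match.group(0) if match else ""

print(find_first("HELLO TILDE ~ CARD 12:34:56 IS IN THIS STRING"))  # prints: 12:34:56
```

Because the search works on the original text, a literal tilde in the input needs no special handling here.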

Converting int to string then back to int

How do I pull a particular digit out of a number? For example: getting 6 out of 768, then multiplying that 6 by 3. I've tried using the code below, but it does not work.
digits = []
digits = str(input("no:"))
print (int(digits[1] * 5))
If my input is 234, the value at index [1] is 3. How can I multiply that 3 by 5?
input() returns a string (whether or not you explicitly pass it through str() again), so digits[1] is still a single-character string.
You need to convert that single digit to an integer with int(), not the result of the multiplication:
print (int(digits[1]) * 5)
All I did was move a ) parenthesis there.
Your mistake was to multiply the single-character string; multiplying a string by n produces that string repeated n times.
digits[1] = '3' so digits[1] * 5 = '33333'. You want int(digits[1]) * 5.
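A short side-by-side makes the difference visible. The literal "234" stands in for whatever input() would return, so the snippet runs without a prompt.

```python
digits = "234"  # stands in for the string returned by input("no:")

repeated = digits[1] * 5       # string repetition: '3' five times
product = int(digits[1]) * 5   # integer multiplication: 3 * 5

print(repeated)  # prints: 33333
print(product)   # prints: 15
```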

how to get the number from a string in matlab

I want to know how to get certain numbers from a string in MATLAB. For example, I have a string:
'ABCD_01 36_00 3 .txt', (there is spacing between 01 and 36)
What I need is to get the numbers 36 and 3. How can I do it in MATLAB? I've tried finding the answer in previous posts but cannot find one that fits this purpose. Thanks for the help.
Regular expressions:
>> str = 'ABCD_01 36_00 3 .txt';
>> t = str2double( regexp(str,'.* (\d+)_.* (\d+)','tokens','once') )
t =
36 3
If the filenames always start with four characters you can do:
>> filename = 'ABCD_01 36_00 3 .txt';
>> sscanf(filename, '%*4c_%*u %u_%*u %u.txt')
ans =
36
3
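The regular expression carries over to other languages almost unchanged. As an illustration, here is a Python sketch with the same two capture groups. The function name extract_numbers is made up, and [0-9] is used in place of \d to restrict matches to ASCII digits.

```python
import re

def extract_numbers(filename: str) -> tuple:
    # Each capture group grabs the digits that follow a space; the first
    # group must also be followed by an underscore.
    match = re.search(r".* ([0-9]+)_.* ([0-9]+)", filename)
    if match is None:
        raise ValueError(f"no match in {filename!r}")
    return int(match.group(1)), int(match.group(2))

print(extract_numbers("ABCD_01 36_00 3 .txt"))  # prints: (36, 3)
```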

Resources