Converting string to num with a specific format in Matlab

>> str = '0009.51998'
>> str2num(str)
or
>> sscanf(str,'%f')
ans = 9.5200
I want to get this instead:
ans = 9.51998

You are getting that. It's just being rounded off to four decimal places when it's displayed. Do format long to see more precision.
>> str = '0009.51998';
>> x = sscanf(str, '%f')
x =
9.5200
>> format long
>> x
x =
9.519980000000000
>>
You can also use str2double as an alternative to sscanf. It's safer and more flexible than str2num. That is because str2num uses the eval command. For example, try the following:
str2num(' figure();imshow(''peppers.png'')')
You might be surprised at the results.
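A minimal sketch of the str2double route, using the string from the question. fprintf is used here only to show that the stored value keeps all its digits regardless of the current display format; the variable name bad is just for illustration.

```matlab
str = '0009.51998';
x = str2double(str);            % parses the text only, never evaluates it
fprintf('%.5f\n', x);           % explicit format, independent of "format long"
bad = str2double('figure()');   % not a number, so the result is NaN (no figure opens)
```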

Related

Does MATLAB provide a lossless conversion function from double to string?

tl;dr
I'm just looking for two functions, f from double to string and g from string to double, such that g(f(d)) == d for any double d (scalar and real double).
Original question
How do I convert a double to a string or char array in a reversible way? I mean, in such a way that afterward I can convert that string/char array back to double retrieving the original result.
I've found formattedDisplayText, and in some situations it works:
>> x = eps
x =
2.220446049250313e-16
>> double(formattedDisplayText(x, 'NumericFormat', 'long')) - x
ans =
0
But in others it doesn't
x = rand(1)
x =
0.546881519204984
>> double(formattedDisplayText(x, 'NumericFormat', 'long')) - x
ans =
1.110223024625157e-16
As regards this and other tools like num2str, mat2str, at the end they all require me to decide a precision, whereas I would like to express the idea of "use whatever precision is needed for you (MATLAB) to be able to read back your own number".
Here are two simpler solutions to convert a single double value to a string and back without loss.
I want the string to be a human-readable representation of the number
Use mat2str to obtain 17 significant decimal digits in string form, and str2double to convert back:
>> s = mat2str(x,17)
s =
'2.2204460492503131e-16'
>> y = str2double(s);
>> y==x
ans =
logical
1
Note that 17 digits are always enough to represent any IEEE double-precision floating-point number.
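As a rough sketch, the same 17-digit idea can be wrapped as the f and g asked for in the question. Here sprintf with '%.17g' is swapped in for mat2str (an assumption on my part, both request 17 significant digits), and the test values are arbitrary.

```matlab
f = @(d) sprintf('%.17g', d);   % double -> char, 17 significant digits
g = @(s) str2double(s);         % char -> double

vals = [eps, pi, 0.1, realmax, realmin, -1/3];   % arbitrary test values
ok = arrayfun(@(d) g(f(d)) == d, vals);          % true where the round trip is exact
```

This does not cover NaN, since NaN == NaN is false by definition; use isequaln if you need that case.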
I want a more compact string representation of the number
Use matlab.net.base64encode to encode the 8 bytes of the number. Unfortunately you can only encode strings and integer arrays, so we type cast to some integer array (we use uint8 here, but uint64 would work too). We reverse the process to get the same double value back:
>> s = matlab.net.base64encode(typecast(x,'uint8'))
s =
'AAAAAAAAsDw='
>> y = typecast(matlab.net.base64decode(s),'double');
>> x==y
ans =
logical
1
Base64 encodes every 3 bytes in 4 characters, so this is the most compact representation you can easily create. A more complex algorithm could likely convert into a smaller UTF-8-encoded string (which can carry more than 6 bits per displayable character).
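For comparison, here is a sketch of the size arithmetic, together with a different lossless pair that MATLAB also ships: num2hex and hex2num. These are not part of the answer above; they give 16 hexadecimal characters, so they are less compact than Base64 but need no typecast.

```matlab
nBytes = 8;                          % a double occupies 8 bytes
nBase64 = 4 * ceil(nBytes / 3);      % Base64 length with padding: 12 characters

x = eps;
h = num2hex(x);                      % '3cb0000000000000', 16 hex characters
y = hex2num(h);                      % back to the identical double
```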
Function f: from double real-valued scalar x to char vector str
str = num2str(typecast(x, 'uint8'));
str is built as a string containing 8 numbers, which correspond to the bytes in the internal representation of x. The function typecast extracts the bytes as a numerical vector, and num2str converts to a char vector with numbers separated by spaces.
Function g: from char vector str to double real-valued scalar y
y = typecast(uint8(str2double(strsplit(str))), 'double');
The char vector is split at spaces using strsplit. The result is a cell array of char vectors, each of which is then interpreted as a number by str2double, which produces a numerical vector. The numbers are cast to uint8 and then typecast interprets them as the internal representation of a double real-valued scalar.
Note that str2double(strsplit(str)) is preferred over the simpler str2num(str), because str2num internally calls eval, which is considered bad practice.
Example
>> format long
>> x = sqrt(pi)
x =
1.772453850905516
>> str = num2str(typecast(x, 'uint8'))
str =
'106 239 180 145 248 91 252 63'
>> y = typecast(uint8(str2double(strsplit(str))), 'double')
y =
1.772453850905516
>> x==y
ans =
logical
1
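The two steps can be packaged as anonymous functions, roughly as follows. The strtrim call is an extra safeguard I added (an assumption, in case the string picks up leading or trailing blanks on the way); the test values are arbitrary.

```matlab
% f: double scalar -> char vector of 8 byte values separated by blanks
f = @(x) num2str(typecast(x, 'uint8'));
% g: char vector -> double scalar (strtrim guards against stray blanks)
g = @(str) typecast(uint8(str2double(strsplit(strtrim(str)))), 'double');

vals = [sqrt(pi), realmin, -1/3];
ok = arrayfun(@(d) g(f(d)) == d, vals);
```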

Formatting %e output of sprintf

Is it possible to format the output of sprintf as shown below, or should I use another function?
Say I have a variable dt = 9.765625e-05 and I want to use sprintf to build a string for use when saving, say, a figure:
fig = figure(nfig);
plot(x,y);
figStr = sprintf('NS2d_dt%e',dt);
saveas(fig,figStr,'pdf')
The punctuation mark dot presents me with problems, some systems mistake the format of the file.
using
figStr = sprintf('NS2d_dt%.2e',dt);
then
figStr = NS2d_dt9.77e-05
using
figStr = sprintf('NS2d_dt%.e',dt);
then
figStr = NS2d_dt1e-04
which is not precise enough. I would like something like this
using
figStr = sprintf('NS2d_dt%{??}e',dt);
then
figStr = NS2d_dt9765e-08
Essentially, the only way to get your desired output is to manipulate either the string or the value. Here are two solutions: the first uses string manipulation and the second manipulates the value. Hopefully these two approaches will help you reason out solutions to other problems, particularly the number manipulation.
String Manipulation
Solution
fmt = @(x) sprintf('%d%.0fe%03d', (sscanf(sprintf('%.4e', x), '%d.%de%d').' .* [1 0.1 1]) - [0 0.5 3]);
Explanation
First I use sprintf to print the number in a defined format
>> sprintf('%.4e', dt)
ans =
9.7656e-05
then sscanf to read it back in making sure to remove the . and e
>> sscanf(sprintf('%.4e', dt), '%d.%de%d').'
ans =
9 7656 -5
before printing it back we perform some manipulation of the data to get the correct values for printing
>> (sscanf(sprintf('%.4e', dt), '%d.%de%d').' .* [1 0.1 1]) - [0 0.5 3]
ans =
9 765.1 -8
and now we print
>> sprintf('%d%.0fe%03d', (sscanf(sprintf('%.4e', dt), '%d.%de%d').' .* [1 0.1 1]) - [0 0.5 3])
ans =
9765e-08
Number Manipulation
Solution
orderof = @(x) floor(log10(abs(x)));
fmt = @(x) sprintf('%.0fe%03d', x*(10^(abs(orderof(x))+3))-0.5, orderof(x)-3);
Explanation
First I create an anonymous orderof function which tells me the order (the number after e) of the input value. So
>> dt = 9.765625e-05;
>> orderof(dt)
ans =
-5
Next we manipulate the number to convert it to a 4 digit integer, this is the effect of adding 3 in
>> floor(dt*(10^(abs(orderof(dt))+3)))
ans =
9765
finally before printing the value we need to figure out the new exponent with
>> orderof(dt)-3
ans =
-8
and printing will give us
>> sprintf('%.0fe%03d', floor(dt*(10^(abs(orderof(dt))+3))), orderof(dt)-3)
ans =
9765e-08
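Note that abs(orderof(x))+3 only gives the right scaling when the exponent is negative, as it is for dt. A hedged variant that divides by 10^(orderof(x)-3) instead should work for either sign of the exponent; fmt4 is a hypothetical name, and the sketch assumes x > 0 and away from exact powers of ten, where log10 rounding could shift the order by one.

```matlab
dt = 9.765625e-05;
orderof = @(x) floor(log10(abs(x)));              % exponent as printed after "e"
% fmt4 (hypothetical helper): four-digit integer mantissa plus shifted exponent
fmt4 = @(x) sprintf('%de%03d', floor(x / 10^(orderof(x) - 3)), orderof(x) - 3);

figStr = ['NS2d_dt' fmt4(dt)];                    % 'NS2d_dt9765e-08'
```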
Reading your question,
The punctuation mark dot presents me with problems, some systems mistake the format of the file.
it seems to me that your actual problem is that when you build the file name using, for example
figStr = sprintf('NS2d_dt%.2e',dt);
you get
figStr = NS2d_dt9.77e-05
and then, when you use that string as the filename, the . is interpreted as the start of the extension and .pdf is not appended, so in Explorer you cannot open the file by double-clicking on it.
Considering that changing the representation of the number dt from 9.765e-05 to 9765e-08 seems quite weird, you can try the following approach:
use the print function to save your figure in .pdf
add .pdf in the format specifier
This should allow you to have both the right file extension and the right format for the dt value.
peaks
figStr = sprintf('NS2d_dt_%.2e.pdf',dt);
print(gcf,'-dpdf', figStr )
Hope this helps.
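To see why the dot matters, here is a small sketch using fileparts (no figure needed), which splits a name at its last dot in much the same way the file system tools do. The variable names are only for illustration.

```matlab
dt = 9.765625e-05;

noExt = sprintf('NS2d_dt%.2e', dt);          % 'NS2d_dt9.77e-05'
[~, ~, ext1] = fileparts(noExt);             % ext1 is '.77e-05', not a real extension

withExt = sprintf('NS2d_dt_%.2e.pdf', dt);   % explicit extension in the format
[~, name2, ext2] = fileparts(withExt);       % ext2 is '.pdf'
```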
figStr = sprintf('NS2d_dt%1.4e',dt)
figStr =
NS2d_dt9.7656e-05
In %1.4e, the 1 is the minimum field width (total number of characters) and the 4 is the number of digits printed after the decimal point.
Regarding your request:
A = num2str(dt); %// convert to string
B = A([1 3 4 5]); %// extract first four digits
C = A(end-2:end); %// extract power
fspec = 'NS2d_dt%de%d'; %// format spec
sprintf(fspec ,str2num(B),str2num(C)-3)
NS2d_dt9765e-8

Matlab matrix string manipulation

I have a matrix whose values are written like 5.34000E+5. When I try to create a string variable from mat(1,1), which contains 5.34000E+5, Matlab creates the string 534000. How can I create a string like 5.34000E+5 instead?
Thanks
You need to specify the formatting while converting:
>> number = 534000
number = 534000
>> s = num2str(number,'%10.5e\n')
s =
5.34000e+05
>> class(s)
ans = char
You can use sprintf
num = 534000;
str = sprintf('%.0f',num);
str2 = sprintf('%e',num);
disp(str);
disp(str2);
Here, % means that you want to specify a format, f means fixed-point notation, .0 means that you want no decimals, and e means that you want exponential notation. For more info on this, see the sprintf format specifiers.
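Putting the specifiers side by side, as a small sketch: the uppercase %E conversion gets closest to the 5.34000E+5 form in the question, although sprintf pads the exponent to at least two digits.

```matlab
num = 534000;
s1 = sprintf('%.0f', num);   % fixed-point, no decimals: '534000'
s2 = sprintf('%.5e', num);   % exponential, 5 decimals, lowercase e
s3 = sprintf('%.5E', num);   % same, with uppercase E: '5.34000E+05'
```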

Matlab: Convert cell string (comma separated) to vector

I have a huge csv file (as in: more than a few gigs) and would like to read it in Matlab and process each line. Reading the file in its entirety is impossible so I use this code to read in each line:
fileName = 'input.txt';
inputfile = fopen(fileName);
while 1
tline = fgetl(inputfile);
if ~ischar(tline)
break
end
end
fclose(inputfile);
This yields a cell array of size(1,1) with the line as a string. What I would like is to convert this cell to a normal array with just the numbers.
For example:
input.csv:
0.0,0.0,3.201,0.192
2.0,3.56,0.0,1.192
0.223,0.13,3.201,4.018
End result in Matlab for the first line:
A = [0.0,0.0,3.201,0.192]
I tried converting tline with double(tline) but this yields completely different results. Also tried using a regex but got stuck there. I got to the point where I split up all values into a different cell in one array. But converting to double with str2double yields only NaNs...
Any tips? Preferably without any loops since it already takes a while to read the entire file.
You are looking for str2num
>> A = '0.0,0.0,3.201,0.192';
>> str2num(A)
ans =
0 0 3.2010 0.1920
>> A = '0.0 0.0 3.201 0.192';
>> str2num(A)
ans =
0 0 3.2010 0.1920
>> A = '0.0 0.0 , 3.201 , 0.192';
>> str2num(A)
ans =
0 0 3.2010 0.1920
i.e., it's quite agnostic to input format.
However, I will not advise this for your use case. For your problem, I'd do
C = dlmread('input.txt',',', [0 0 0 3]) % for first line (range is zero-based: [R1 C1 R2 C2])
C = dlmread('input.txt',',') % for entire file
or
[a,b,c,d] = textread('input.txt','%f,%f,%f,%f',1) % for first line
[a,b,c,d] = textread('input.txt','%f,%f,%f,%f') % for entire file
if you want all columns in separate variables:
a = 0
b = 0
c = 3.201
d = 0.192
or
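fid = fopen('input.txt','r');
% Use ONE of the calls below; each call starts reading at the current file position.
C = textscan(fid, '%f %f %f %f', 1, 'Delimiter', ','); % for first line only
C = textscan(fid, '%f %f %f %f', N, 'Delimiter', ','); % for first N lines
C = textscan(fid, '%f %f %f %f', 1, 'Delimiter', ',', 'HeaderLines', N-1); % for Nth line only
fclose(fid);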
all of which are much more easily expandable (things like this, whatever they are, tend to grow bigger over time :). Especially dlmread is much less prone to errors than writing your own clauses is, for empty lines, missing values and other great nuisances very common in most data sets.
Try
data = dlmread('input.txt',',')
It will do exactly what you want to do.
If you still want to convert string to a vector:
line_data = sscanf(tline,'%g,',inf)
This code will read the entire comma-separated string in tline (the variable from your fgetl loop) and convert each number.
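Here is a sketch of how the sscanf call might slot into the fgetl loop from the question. It writes a small temporary file so the example is self-contained; the temporary file and the rowSums accumulator are assumptions for illustration only. It also shows why str2double returned NaN: it expects one number per string, so the line has to be split first.

```matlab
% Build a tiny sample file (stand-in for the real multi-gigabyte input)
fileName = tempname;
fid = fopen(fileName, 'w');
fprintf(fid, '0.0,0.0,3.201,0.192\n2.0,3.56,0.0,1.192\n');
fclose(fid);

fid = fopen(fileName, 'r');
rowSums = [];                              % example per-line processing result
tline = fgetl(fid);
while ischar(tline)                        % fgetl returns -1 at end of file
    values = sscanf(tline, '%g,').';       % row vector of the numbers on this line
    rowSums(end+1) = sum(values);          %#ok<SAGROW>
    tline = fgetl(fid);
end
fclose(fid);
delete(fileName);

wholeLine = str2double('0.0,0.0,3.201,0.192');             % NaN: not a single number
perField = str2double(strsplit('0.0,0.0,3.201,0.192', ','));  % 1-by-4 numeric vector
```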

MATLAB generate combination from a string

I have a string like "FBECGHD" and I need to generate all of its possible permutations in MATLAB. Is there a specific MATLAB function that does this, or should I define a custom function to perform the task?
Use the perms function. A string in matlab is a list of characters, so it will permute them:
A = 'FBECGHD';
perms(A)
You can also store the output (e.g. P = perms(A)), and, if A is an N-character string, P is a N!-by-N array, where each row corresponds to a permutation.
If you are interested in unique permutations, you can use:
unique(perms(A), 'rows')
to remove duplicates (otherwise something like 'ABB' would give 6 results, instead of the 3 that you might expect).
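A quick check of the 'ABB' case, as a sketch (P and U are just illustrative names):

```matlab
P = perms('ABB');            % 3! = 6 rows, some of them repeated
U = unique(P, 'rows');       % distinct rows only, sorted
nAll = size(P, 1);           % 6
nUnique = size(U, 1);        % 3
```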
As Richante answered, P = perms(A) is very handy for this. You may also notice that P is of type char and it's not convenient to subset/select individual permutation. Below worked for me:
str = 'FBECGHD';
A = perms(str);
B = cellstr(reshape(A,7,[])');
C = unique(B);
It also appears that unique(A, 'rows') is not removing duplicate values:
>> A=[11, 11];
>> unique(A, 'rows')
ans =
11 11
However, unique(A) would:
>> unique(A)
ans =
11
I am not a matlab pro by any means and I didn't investigate this exhaustively but at least in some cases it appears that reshape is not what you want. Notice that below gives 999 and 191 as permutations of 199 which isn't true. The reshape function as written appears to operate "column-wise" on A:
>> str = '199';
A = perms(str);
B = cellstr(reshape(A,3,[])');
C = unique(B);
>> C
C =
'191'
'199'
'911'
'919'
'999'
Below does not produce 999 or 191:
B = {};
index = 1;
while true
try
substring = A(index,:);
B{index}=substring;
index = index + 1;
catch
break
end
end
C = unique(B)
C =
'199' '919' '991'
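For what it's worth, the loop can probably be avoided. perms already returns one permutation per row, and cellstr converts a char matrix row by row, so no reshape is needed (reshape works in column-major order, which is what scrambles the characters). A minimal sketch:

```matlab
str = '199';
A = perms(str);              % 6-by-3 char matrix, one permutation per row
B = cellstr(A);              % 6-by-1 cell array, one row per cell
C = unique(B);               % {'199'; '919'; '991'}
```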
