I'm reading fixed-width (9 characters) data from a text file using textscan. Textscan fails at a certain line containing the string:
'   9574865.0E+10  '
I would like to read two numbers from this:
957486 5.0E+10
The problem can be replicated like this:
dat = textscan('   9574865.0E+10  ','%9f %9f','Delimiter','','CollectOutput',true,'ReturnOnError',false);
The following error is returned:
Error using textscan
Mismatch between file and format string.
Trouble reading floating point number from file (row 1u, field 2u) ==> E+10
Surprisingly, if we add a minus, we don't get an error, but a wrong result:
dat = textscan('  -9574865.0E+10  ','%9f %9f','Delimiter','','CollectOutput',true,'ReturnOnError',false);
Now dat{1} is:
-9574865 0
Obviously, I need both cases to work. My current workaround is to add commas between the fields and use commas as a delimiter in textscan, but that's slow and not a nice solution. Is there any way I can read this string correctly using textscan or another built-in (for performance reasons) MATLAB function?
I suspect textscan first trims leading whitespace and only then applies the format string. I think so because if you change your format string from
'%9f%9f'
to
'%6f%9f'
your one-liner suddenly works. Also, if you try
'%9s%9s'
you'll see that the first string has its leading whitespace removed (and therefore has 3 characters "too many"), but for some reason, the last string keeps its trailing whitespace.
Obviously, this means you'd have to know exactly how many digits there are in both numbers. I'm guessing this is not desirable.
A workaround could be something like the following:
% Split string on the "dot"
dat = textscan(<your data>,'%9s%9s',...
'Delimiter' , '.',...
'CollectOutput' , true,...
'ReturnOnError' , false);
% Correct the strings; move the last digit of the first string to the
% front of the second string, and put the dot back
dat = cellfun(@(x,y) str2double({y(1:end-1), [y(end) '.' x]}), dat{1}(:,2), dat{1}(:,1), 'UniformOutput', false);
% Cast to regular array
dat = cat(1, dat{:})
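If the lines are already in memory as a padded char matrix, another option is to skip textscan's whitespace handling and slice the columns by index. The following is only a sketch under that assumption: the variable names and the two sample rows are made up for illustration, and each row is assumed to be exactly 18 characters wide (two 9-character fields). str2double on a cell array may be slow for very large inputs.

```matlab
% Hypothetical sample data: each row holds two 9-character fields.
lines = ['   9574865.0E+10  '; ...
         '  -9574865.0E+10  '];

% Slice each field by its column range; cellstr drops trailing blanks
% and str2double tolerates leading blanks.
first  = str2double(cellstr(lines(:, 1:9)));
second = str2double(cellstr(lines(:, 10:18)));

% Combine into an n-by-2 numeric array.
values = [first, second];
```

Because the field boundaries come from the column indices rather than from the parser, the number of digits in each field does not need to be known in advance.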
I had a similar problem and solved it by calling textscan twice, which proved to be much faster than cellfun or str2double and works with any input that MATLAB's '%f' can interpret.
In your case I would first call textscan with only string arguments and Whitespace = '' to correctly define the width of the fields.
data = '   9574865.0E+10  ';
tmp = textscan(data, '%9s %9s', 'Whitespace', '');
Now you need to interleave the fields with, and append, a delimiter that won't interfere with your data, for example a semicolon (;):
tmp = [char(join([tmp{:}],';',2)) ';'];
And now you can apply the right format to your data by calling textscan again with a delimiter like:
result = textscan(tmp, '%f %f', 'Delimiter', ';', 'CollectOutput', true);
format shortE
result{:}
ans =
9.5749e+05 5.0000e+10
Comparing the speed of this approach with str2double:
n = 50000;
data = repmat('   9574865.0E+10  ', n, 1);
% Approach 1 with str2double
tic
tmp = textscan(data', '%9s %9s', 'Whitespace', '');
result1 = str2double([tmp{:}]);
toc
Elapsed time is 2.435376 seconds.
% Approach 2 with double textscan
tic
tmp = textscan(data', '%9s %9s', 'Whitespace', '');
tmp = [char(join([tmp{:}],';',2)) char(59)*ones(n,1)]; % char(59) is just ';'
result2 = cell2mat(textscan(tmp', '%f %f', 'Delimiter', ';', 'CollectOutput', true));
toc
Elapsed time is 0.098833 seconds.
Why does the expression:
test = cast(strtrim('3'), 'uint8')
produce 51?
This is also true for:
test = cast(strtrim('3'), 'int8')
Thanks.
Because 51 is the ASCII code for the character '3'.
If you want to transform the string to numeric 3, you should use
uint8(str2double('3'))
Note that str2double will ignore trailing spaces, so that strtrim isn't necessary.
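A short sketch of the difference between a character's code and the number it spells (the variable names are illustrative only):

```matlab
code  = double('3');                 % character code of '3', i.e. 51
same  = cast(strtrim('3'), 'uint8'); % casting a char also yields its code
digit = '3' - '0';                   % numeric value of a single digit character
value = uint8(str2double(' 3 '));    % parse the text; surrounding blanks are ignored
back  = char(51);                    % convert the code back to the character '3'
```

Subtracting '0' only works for single digit characters; str2double is the general way to parse numeric text.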
EDIT
When a string is used in a numeric operation, MATLAB automatically converts each character to its ASCII value. For example
>> '1'+1
ans =
50
Because 51 is the ASCII value for the character '3'.
This is because MATLAB stores '3' as an ASCII character. By casting it to a signed or unsigned integer (8 bits in this case), you are asking MATLAB to return the character's code as a decimal number, which for '3' is 51. If you want to look at more conversions, here is a basic document.
I would like to concatenate strings. I tried using strcat:
x = 5;
m = strcat('is', num2str(x))
but this function removes trailing white-space characters from each string. Is there another MATLAB function to perform string concatenation which maintains trailing white-space?
You can use horzcat instead of strcat:
>> strcat('one ','two')
ans =
onetwo
>> horzcat('one ','two')
ans =
one two
Alternatively, if you're going to be substituting numbers into strings, it might be better to use sprintf:
>> x = 5;
>> sprintf('is %d',x)
ans =
is 5
How about
strcat({' is '},{num2str(5)})
that gives a 1-by-1 cell array containing
' is 5'
Have a look at the final example in the strcat documentation: try using horizontal array concatenation instead of strcat:
m = ['is ', num2str(x)]
Also, have a look at sprintf for more information on string formatting (leading/trailing spaces etc.).
How about using strjoin ?
x = 5;
m ={'is', num2str(x)};
strjoin(m, ' ')
What spaces does this not take into account? Only the spaces you haven't mentioned! Did you mean:
m = strcat( ' is ',num2str(x) )
perhaps?
MATLAB isn't going to guess (a) that you want spaces or (b) where to put the spaces it guesses you want.
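To compare the approaches from the answers above side by side, here is a small sketch (variable names are illustrative). It relies on the documented strcat behaviour that trailing whitespace is removed from character array inputs but kept for cell array inputs:

```matlab
x = 5;

a = strcat('is ', num2str(x));         % char inputs: trailing blank dropped -> 'is5'
b = ['is ', num2str(x)];               % bracket concatenation keeps it     -> 'is 5'
c = horzcat('is ', num2str(x));        % same as the brackets               -> 'is 5'
d = strcat({'is '}, {num2str(x)});     % cell inputs keep blanks, returns a cell
e = sprintf('is %d', x);               % formatted output                   -> 'is 5'
f = strjoin({'is', num2str(x)}, ' ');  % join with an explicit separator    -> 'is 5'
```

Note that the cell-array form returns a cell, so d{1} is needed to get the character vector back.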
In MATLAB, the following statement gives a numeric output:
>> 'abc' + 'def'
ans =
197 199 201
In C++, the output of the following
std::string("abc") + std::string("def")
...would give the arguably more useful...
abcdef
A little more exploration gives:
>> a = 'abc'
a =
abc
>> whos
Name Size Bytes Class Attributes
a 1x3 6 char
This suggests that my variable a is of type char. However, we know that this is not equivalent to a C char, as it is an object that knows its size, dimensions, etc.
Therefore, my questions are:
What use would this numeric output be?
...leading to
Why would they design it to behave like that?
Because a string in MATLAB is literally just an array of type char, the expression is equivalent to:
[97 98 99] + [100 101 102]
It is not set in stone that + means "concatenate". If you want string concatenation in MATLAB, you can always do:
['abc' 'def']
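As for what the numeric result is good for, here is a sketch of a few common idioms that rely on arithmetic over character codes (the variable names are illustrative only):

```matlab
s = 'abc';

codes   = double(s);              % character codes: [97 98 99]
sums    = 'abc' + 'def';          % element-wise sum of codes: [197 199 201]
offsets = s - 'a';                % position in the alphabet, zero-based: [0 1 2]
shifted = char(s + 1);            % shift every letter by one: 'bcd'
upperS  = char(s - ('a' - 'A'));  % lower-case to upper-case by code offset: 'ABC'
joined  = [s 'def'];              % concatenation uses brackets: 'abcdef'
```

Arithmetic on char arrays yields doubles, so char is needed to turn the result back into text.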
>> x = 14.021
>> num2str(x,'%4.5f')
I want to get this as a result:
0014.02100
But, MATLAB just answers me with:
14.02100
You should use sprintf. For example:
x = 14.021
sprintf('%010.5f', x)
Note that you don't need to use num2str.
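The first argument to sprintf is the format specifier, which describes how the resulting text should be displayed. The specifier begins with %. The leading 0 tells sprintf to pad the string with zeros, and the 10 is the minimum total field width, counting the decimal point and any sign. Loosely, the .5 tells it to print five digits to the right of the decimal point, and the f tells it to format the value as a floating-point number. The original '%4.5f' produced no padding because the number already needs more than the requested minimum of four characters.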
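A brief sketch of how the width and the zero flag interact (variable names are illustrative):

```matlab
x = 14.021;

padded   = sprintf('%010.5f', x);   % width 10, zero-padded    -> '0014.02100'
unpadded = sprintf('%4.5f', x);     % width 4 is only a minimum -> '14.02100'
negative = sprintf('%010.5f', -x);  % the sign counts toward the width -> '-014.02100'
```

The field width is a minimum, so values that need more characters are never truncated.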