How to replace a portion of a string in MATLAB

I have a string matrix filled with data like
matrix = ['1231231.jpeg','4343.jpeg',...]
and I want to remove its file extension and get
matrix = ['1231231', '4343']
How can I do it? Is there a function for this?

Use fileparts; it returns three values: the path, name, and extension of the file. So this should work for you:
[~, fName, ext] = fileparts(fileName)
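To apply this to every file name at once, one option is to map fileparts over the list with cellfun. This is only a sketch: it assumes the names are stored in a cell array rather than a char matrix, and the variable names fileNames and baseNames are illustrative.

```matlab
% Illustrative input: file names held in a cell array (an assumption;
% a padded char matrix could be converted first with cellstr).
fileNames = {'1231231.jpeg', '4343.jpeg'};

% fileparts returns [path, name, ext]; keep only the second output.
[~, baseNames] = cellfun(@fileparts, fileNames, 'UniformOutput', false);

disp(baseNames)
```

Because fileparts splits at the last dot, this does not depend on the extension being '.jpeg'.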

Assuming the matrix looks like
matrix = ['1231231.jpeg';
'4343.jpeg';
....];
(with ; instead of ,). If ',' is used, the strings are concatenated horizontally into a single row.
You can use arrayfun to perform an operation on each row of the matrix. The following command should work:
arrayfun(@(x) matrix(x,1:strfind(matrix(x,:),'.jpeg')-1), 1:size(matrix,1), 'UniformOutput', false)

There is a function for this in MATLAB:
http://www.mathworks.com/help/techdoc/ref/fileparts.html
file = 'H:\user4\matlab\classpath.txt';
[pathstr, name, ext] = fileparts(file)
pathstr =
H:\user4\matlab
name =
classpath
ext =
.txt

You could always loop through them and parse them like this:
r(i) = regexp(char(string), '(?<dec>\d*)\.(?<ext>\w*)', 'names');
Use r(i).dec for the number value.
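A runnable version of such a loop might look like the sketch below. It assumes the file names sit in a cell array; files, numbers, and tok are illustrative names. The dot is escaped so that it matches only a literal '.', and MATLAB indexing uses parentheses or braces rather than square brackets.

```matlab
% Illustrative input (an assumption): names in a cell array.
files = {'1231231.jpeg', '4343.jpeg'};
numbers = cell(size(files));

for i = 1:numel(files)
    % Named tokens: 'dec' is the digit run, 'ext' is the extension.
    tok = regexp(files{i}, '(?<dec>\d*)\.(?<ext>\w*)', 'names');
    numbers{i} = tok.dec;   % still a char vector; use str2double for a number
end

disp(numbers)
```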

Note: Vertically concatenating strings of different lengths into a char matrix does not work; it only succeeds in the special case where all strings have the same length. Each char is treated as a single matrix element by vertcat when calling [A;B].
Alternatively, use a cell array and cellfun (this is also independent of the file extension):
matrix = {'1231231.jpeg','4343.jpeg'};
matrix_name = cellfun(@(x) x(1:find(x == '.', 1, 'last')-1), matrix, 'UniformOutput', false);
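As a further alternative, not taken from the answers above, regexprep accepts a cell array directly, so the last extension can be stripped without cellfun. This is a sketch; the third file name is an added illustration of how only the final extension is removed.

```matlab
% Illustrative input; 'archive.tar.gz' is an extra example, not from the question.
names = {'1231231.jpeg', '4343.jpeg', 'archive.tar.gz'};

% Remove the last dot and everything after it (up to the end of the string).
stems = regexprep(names, '\.[^.]*$', '');

disp(stems)
```

Names without a dot are returned unchanged, since the pattern then matches nothing.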

Related

Idiomatic way to split string in Groovy

Is there a nicer/shorter/better way of performing the following:
filename = "AA_BB_CC_DD_EE_FF.xyz"
parts = filename.split("_")
packageName = "${parts[0]}_${parts[1]}_${parts[2]}_${parts[3]}"
//packageName == "AA_BB_CC_DD"
The format remains constant (6 parts, _ separator) but some of the values and lengths of AA,BB are variable.
You can do the same thing by writing the "joining" part differently.
The following results in the same value as packageName:
filename.split('_')[0..3].join('_')
It just uses a range to slice the array, and .join to concatenate with a delimiter.
As the separator char between the "segments" in the source filename and in the
result is the same (_), you don't need to split the filename and join the parts again.
Your task can be done with a single regex:
def result = filename.find(/([A-Z0-9]+_){3}[A-Z0-9]+/)

Removing first character from a string in octave

I want to know how to remove the first character of a string in Octave. I am manipulating the string in a loop, and after every iteration I want to remove the first character of the remaining string.
Thanks in advance.
If it's just a one-line string then:
short_string = long_string(2:end)
But if you have a cell array of strings then either do it as above if you have a loop already, otherwise you can use this shorthand to do it in one line:
short_strings = cellfun(@(x)(x(2:end)), long_strings, 'uni', false)
Or else if you have a matrix of strings (i.e. all the same length), then you can vectorize it as:
short_strings = long_strings(:, 2:end)
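For the loop described in the question, a minimal sketch could look like this. The names remaining and consumed are illustrative, and the "processing" step is just a placeholder that collects each leading character.

```matlab
remaining = 'abc';   % illustrative input
consumed = '';

while ~isempty(remaining)
    consumed(end+1) = remaining(1);   % placeholder: handle the leading character
    remaining = remaining(2:end);     % drop it; yields an empty string at the end
end

disp(consumed)
```

Indexing with 2:end on a one-character string returns an empty string rather than raising an error, which is what lets the loop terminate cleanly.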

Error reading a fixed-width string with textscan in MATLAB

I'm reading fixed-width (9 characters) data from a text file using textscan. Textscan fails at a certain line containing the string:
'   9574865.0E+10  '
I would like to read two numbers from this:
957486 5.0E+10
The problem can be replicated like this:
dat = textscan('   9574865.0E+10  ','%9f %9f','Delimiter','','CollectOutput',true,'ReturnOnError',false);
The following error is returned:
Error using textscan
Mismatch between file and format string.
Trouble reading floating point number from file (row 1u, field 2u) ==> E+10
Surprisingly, if we add a minus, we don't get an error, but a wrong result:
dat = textscan('  -9574865.0E+10  ','%9f %9f','Delimiter','','CollectOutput',true,'ReturnOnError',false);
Now dat{1} is:
-9574865 0
Obviously, I need both cases to work. My current workaround is to add commas between the fields and use commas as a delimiter in textscan, but that's slow and not a nice solution. Is there any way I can read this string correctly using textscan or another built-in (for performance reasons) MATLAB function?
I suspect textscan first trims leading whitespace and only then parses the format string. I think this because if you change your format string from
'%9f%9f'
to
'%6f%9f'
your one-liner suddenly works. Also, if you try
'%9s%9s'
you'll see that the first string has its leading whitespace removed (and therefore has 3 characters "too many"), but for some reason, the last string keeps its trailing whitespace.
Obviously, this means you'd have to know exactly how many digits there are in both numbers. I'm guessing this is not desirable.
A workaround could be something like the following:
% Split string on the "dot"
dat = textscan(<your data>,'%9s%9s',...
'Delimiter' , '.',...
'CollectOutput' , true,...
'ReturnOnError' , false);
% Correct the strings; move the last digit of the first string to the
% front of the second string, and put the dot back
dat = cellfun(@(x,y) str2double({y(1:end-1), [y(end) '.' x]}), dat{1}(:,2), dat{1}(:,1), 'UniformOutput', false);
% Cast to regular array
dat = cat(1, dat{:})
I had a similar problem and solved it by calling textscan twice, which proved to be much faster than cellfun or str2double and works with any input that can be interpreted by MATLAB's '%f'.
In your case I would first call textscan with only string arguments and Whitespace = '' to correctly define the width of the fields.
data = '   9574865.0E+10  ';
tmp = textscan(data, '%9s %9s', 'Whitespace', '');
Now you need to interweave the fields and append a delimiter that won't interfere with your data, for example ';':
tmp = [char(join([tmp{:}],';',2)) ';'];
And now you can apply the right format to your data by calling textscan again with a delimiter like:
result = textscan(tmp, '%f %f', 'Delimiter', ';', 'CollectOutput', true);
format shortE
result{:}
ans =
9.5749e+05 5.0000e+10
Comparing the speed of this approach with str2double:
n = 50000;
data = repmat('   9574865.0E+10  ', n, 1);
% Approach 1 with str2double
tic
tmp = textscan(data', '%9s %9s', 'Whitespace', '');
result1 = str2double([tmp{:}]);
toc
Elapsed time is 2.435376 seconds.
% Approach 2 with double textscan
tic
tmp = textscan(data', '%9s %9s', 'Whitespace', '');
tmp = [char(join([tmp{:}],';',2)) char(59)*ones(n,1)]; % char(59) is just ';'
result2 = cell2mat(textscan(tmp', '%f %f', 'Delimiter', ';', 'CollectOutput', true));
toc
Elapsed time is 0.098833 seconds.
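For a single record, the fixed-width split can also be done by plain indexing instead of textscan. The sketch below is an alternative not given in the answers; it assumes every record is exactly 18 characters (two 9-character fields), and rec, width, fields, and values are illustrative names. Since it relies on str2double, the timing above suggests it would be the slower choice for large inputs.

```matlab
rec = '   9574865.0E+10  ';   % 18 characters = 2 fields of width 9
width = 9;

% Reshape into one column per field, then transpose to one row per field.
fields = cellstr(reshape(rec, width, []).');

% str2double tolerates the leading blanks left in the first field.
values = str2double(fields).';

disp(values)
```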

MATLAB search cell array for string subset

I'm trying to find the locations where a substring occurs in a cell array in MATLAB. The code below works, but is rather ugly. It seems to me there should be an easier solution.
cellArray = [{'these'} 'are' 'some' 'nicewords' 'and' 'some' 'morewords'];
wordPlaces = cellfun(@length,strfind(cellArray,'words'));
wordPlaces = find(wordPlaces); % Word places is the locations.
cellArray(wordPlaces);
This is similar to, but not the same as this and this.
The thing to do is to encapsulate this idea as a function. Either inline:
substrmatch = @(x,y) ~cellfun(@isempty,strfind(y,x))
findmatching = @(x,y) y(substrmatch(x,y))
Or contained in two m-files:
function idx = substrmatch(word,cellarray)
idx = ~cellfun(@isempty,strfind(cellarray,word))
and
function newcell = findmatching(word,oldcell)
newcell = oldcell(substrmatch(word,oldcell))
So now you can just type
>> findmatching('words',cellArray)
ans =
'nicewords' 'morewords'
I don't know if you would consider it a simpler solution than yours, but regular expressions are a very good general-purpose utility I often use for searching strings. One way to extract the cells from cellArray that contains words with 'words' in them is as follows:
>> matches = regexp(cellArray,'^.*words.*$','match'); %# Extract the matches
>> matches = [matches{:}] %# Remove empty cells
matches =
'nicewords' 'morewords'
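Since the question asks for the locations as well as the matching strings, a compact sketch that returns both from one logical mask might look like this. The name isMatch is illustrative; the other names come from the question.

```matlab
cellArray = {'these', 'are', 'some', 'nicewords', 'and', 'some', 'morewords'};

% Logical mask: true where the cell contains the substring 'words'.
% (On MATLAB R2016b or newer, contains(cellArray, 'words') should give the same mask.)
isMatch = ~cellfun(@isempty, strfind(cellArray, 'words'));

wordPlaces = find(isMatch);      % positions of the matching cells
matches = cellArray(isMatch);    % the matching strings themselves

disp(wordPlaces)
```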

How do I put variable values into a text string in MATLAB?

I'm trying to write a simple function that takes two inputs, x and y, and passes these to three other simple functions that add, multiply, and divide them. The main function should then display the results as a string containing x, y, and the totals.
I think there's something I'm not understanding about output arguments. Anyway, here's my (pitiful) code:
function a=addxy(x,y)
a=x+y;
function b=mxy(x,y)
b=x*y;
function c=dxy(x,y)
c=x/y;
The main function is:
function [d e f]=answer(x,y)
d=addxy(x,y);
e=mxy(x,y);
f=dxy(x,y);
z=[d e f]
How do I get the values for x, y, d, e, and f into a string? I tried different matrices and stuff like:
['the sum of' x 'and' y 'is' d]
but none of the variables are showing up.
Two additional issues:
Why is the function returning "ans 3" even though I didn't ask for the length of z?
If anyone could recommend a good book for beginners to MATLAB scripting I'd really appreciate it.
Here's how you convert numbers to strings, and join strings to other things (it's weird):
>> ['the number is ' num2str(15) '.']
ans =
the number is 15.
You can use fprintf/sprintf with familiar C syntax. Maybe something like:
fprintf('x = %d, y = %d \n x+y=%d \n x*y=%d \n x/y=%f\n', x,y,d,e,f)
Reading your comment, this is how you use your functions from the main program:
x = 2;
y = 2;
[d e f] = answer(x,y);
fprintf('%d + %d = %d\n', x,y,d)
fprintf('%d * %d = %d\n', x,y,e)
fprintf('%d / %d = %f\n', x,y,f)
Also for the answer() function, you can assign the output values to a vector instead of three distinct variables:
function result=answer(x,y)
result(1)=addxy(x,y);
result(2)=mxy(x,y);
result(3)=dxy(x,y);
and call it simply as:
out = answer(x,y);
As Peter and Amro illustrate, you have to convert numeric values to formatted strings first in order to display them or concatenate them with other character strings. You can do this using the functions FPRINTF, SPRINTF, NUM2STR, and INT2STR.
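To get the text into a variable rather than printing it, sprintf can be used in the same way as fprintf. The following is a small sketch with illustrative values and an illustrative message; the second line builds the same string with num2str and concatenation.

```matlab
x = 6;
y = 3;
d = x + y;   % stands in for the result of addxy(x,y)

% Format the numbers into a string and keep it in a variable.
msg = sprintf('The sum of %d and %d is %d', x, y, d);

% Equivalent construction using num2str and square-bracket concatenation.
msg2 = ['The sum of ' num2str(x) ' and ' num2str(y) ' is ' num2str(d)];

disp(msg)
```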
With respect to getting ans = 3 as an output, it is probably because you are not assigning the output from answer to a variable. If you want to get all of the output values, you will have to call answer in the following way:
[out1,out2,out3] = answer(1,2);
This will place the value d in out1, the value e in out2, and the value f in out3. When you do the following:
answer(1,2)
MATLAB will automatically assign the first output d (which has the value 3 in this case) to the default workspace variable ans.
With respect to suggesting a good resource for learning MATLAB, you shouldn't underestimate the value of the MATLAB documentation. I've learned most of what I know on my own using it. You can access it online, or within your copy of MATLAB using the functions DOC, HELP, or HELPWIN.
I just realized why I was having so much trouble - in MATLAB you can't store strings of different lengths as an array using square brackets. Using square brackets concatenates strings of varying lengths into a single character array.
>> a=['matlab','is','fun']
a =
matlabisfun
>> size(a)
ans =
1 11
In a character array, each character in a string counts as one element, which explains why the size of a is 1x11.
To store strings of varying lengths as elements of an array, you need to use curly braces to save as a cell array. In cell arrays, each string is treated as a separate element, regardless of length.
>> a={'matlab','is','fun'}
a =
'matlab' 'is' 'fun'
>> size(a)
ans =
1 3
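If a character matrix is still needed, the char function pads shorter strings with trailing blanks so that every row has the same length, and cellstr converts back while stripping that padding. A short sketch, with illustrative variable names:

```matlab
% char pads each row with trailing spaces to the longest length (6 here).
padded = char('matlab', 'is', 'fun');
sz = size(padded);

% cellstr returns one cell per row with the trailing blanks removed.
back = cellstr(padded);

disp(sz)
```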
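I was looking for something similar to what you wanted, but wanted to store the result in a variable.
This is what I did (it works as written when x, y, and d are already character arrays; numeric values need num2str first):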
variable = ['hello this is x' x ', this is now y' y ', finally this is d:' d]
basically
variable = [str1 str2 str3 str4 str5 str6]
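The reason numeric variables seem to vanish from such a concatenation is that MATLAB converts each number to the character with that code instead of to its digits. A small sketch with an illustrative value:

```matlab
x = 65;

% The number is treated as a character code: 65 becomes 'A'.
raw = ['x is ' x];

% Converting to text first gives the intended result.
fixed = ['x is ' num2str(x)];

disp(raw)
disp(fixed)
```

Small values such as 2 or 3 map to non-printing control characters, which is why nothing visible shows up in the question's attempt.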
