How to find and replace a string in Matlab

So here is my problem:
I have a list of names in Matlab in a cell array.
I automatically create directories and .mat files for each name.
My problem is that some of these names contain '/', so everything goes wrong when I create the directory…
So I am trying to find an efficient way to find each '/' and replace it.
So far I've tried to find them using the findstr function. It gives me a cell array with the indexes where '/' appears: when the name doesn't contain any '/' it returns {[]}, and when the function finds one it returns {[i]}.
Now I'd like a logical condition that says: if the findstr result is not empty, then do something. I've tried the isempty function but it doesn't work (the result is never empty…).
So does anyone have a solution to this?
Thanks

Use regexprep to replace the character. It accepts the whole cell array at once, so there is no need to test each name for '/' first:
list = {'aaa', 'bb/cc', '/dd/'};
replace_from = '/'; %// character to be replaced
replace_to = '_'; %// replacing character
list_replaced = regexprep(list, replace_from, replace_to);
gives
list_replaced =
'aaa' 'bb_cc' '_dd_'

Related

Find and replace text and wrap in "href"

I am trying to find specific words in a div (id="test") that start with "a04" (case-insensitive). I can find and replace the words found, but I am unable to correctly use the found word in an "href" link.
The following code correctly identifies my search criteria and works as expected, but I would like help because I do not know how to use the found word as the url id.
var test = document.getElementById("test").innerHTML
function replacetxt(){
var str_rep = document.getElementById("test").innerHTML.replace(/a04(\w)+/g,'TEST');
var temp = str_rep;
//alert(temp);
document.getElementById("test").innerHTML = temp;
}
I would like to wrap the found word in an href, but I do not know how to use the found word as the url id (url.com?id=found word).
Can someone help point out how to reference the found word please?
Thanks

If you want to use your pattern with the capturing group, move the quantifier + inside the group; otherwise the group only keeps the value of the last iteration.
\ba04(\w+)
\b word boundary to prevent the match being part of a longer word
a04 Match literally
(\w+) Capture group 1, match 1+ times a word character
Then you could use the first capturing group in the replacement by referring to it with $1
If the string is a04word, you would capture word in group 1.
Your code might look like:
function replacetxt(){
    var elm = document.getElementById("test");
    if (elm) {
        elm.innerHTML = elm.innerHTML.replace(/\ba04(\w+)/g,'TEST');
    }
}
replacetxt();
<div id="test">This is text a04word more text here</div>
Note that you don't have to create extra variables like var temp = str_rep;
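The snippet above still replaces each match with the literal 'TEST'. As a rough sketch of how the found word could end up in the link instead, the replacement string can refer to the match itself: $& is the whole match and $1 is capture group 1. The helper name linkifyA04 and the baseUrl parameter are made up for illustration, the "i" flag is added because the question asks for a case-insensitive match, and baseUrl is assumed not to contain a "$" character.

```javascript
// Hypothetical helper: wraps every word starting with "a04" in a link.
// The "?id=" query parameter follows the question's example (url.com?id=...).
function linkifyA04(html, baseUrl) {
  // \b keeps the match at the start of a word; "i" ignores case.
  // In the replacement, $& is the whole match (e.g. "a04word");
  // $1 would be only the captured part after "a04" (e.g. "word").
  return html.replace(/\ba04(\w+)/gi, '<a href="' + baseUrl + '?id=$&">$&</a>');
}

const sample = "This is text a04word more text here";
console.log(linkifyA04(sample, "url.com"));
// This is text <a href="url.com?id=a04word">a04word</a> more text here
```

In the page itself this would be applied as elm.innerHTML = linkifyA04(elm.innerHTML, "url.com"). Swap $& for $1 if only the part after "a04" should go into the id.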

Find index of a specific character in a string then parse the string

I have strings which look like this: [NAME LASTNAME/NAME.LAST@emailaddress/123456678]. I want to parse strings in this format so that I only get NAME LASTNAME. My pseudo idea is to find the index of the first instance of /, then take the characters from index 1 up to that /. I want this as a VBScript.

Your way should work. You can also Split() your string on / and just grab the first element of the resulting array:
Const SOME_STRING = "John Doe/John.Doe@example.com/12345678"
WScript.Echo Split(SOME_STRING, "/")(0)
Output:
John Doe
Edit, with respect to comments.
If your string contains the [, you can still Split(). Just use Mid() to grab the first element starting at character position 2:
Const SOME_STRING = "[John Doe/John.Doe@example.com/12345678]"
WScript.Echo Mid(Split(SOME_STRING, "/")(0), 2)

Your idea is good. You should also grab the index of "[", which makes the script more robust and flexible. The code below always returns the text between the first occurrence of "[" and the first occurrence of "/".
var = "[John Doe/John.Doe@example.com/12345678]"
WScript.Echo Mid(var, (InStr(var,"[")+1),InStr(var,"/")-InStr(var,"[")-1)
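
For anyone who needs the same index-based idea outside VBScript, here is a rough JavaScript sketch of it. nameBetween is a made-up helper name; returning null when there is no "/" is my own assumption about what a caller would want, not something from the answers above.

```javascript
// Hypothetical helper: text between the first "[" and the next "/".
function nameBetween(text) {
  const start = text.indexOf("[");          // -1 when there is no "["
  const end = text.indexOf("/", start + 1); // first "/" after the bracket
  if (end === -1) {
    return null; // no "/" found, so there is nothing to cut at
  }
  // With no "[", start + 1 is 0 and the slice begins at the first character.
  return text.slice(start + 1, end);
}

console.log(nameBetween("[John Doe/John.Doe@example.com/12345678]")); // John Doe
console.log(nameBetween("John Doe/John.Doe@example.com/12345678"));   // John Doe
```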

splitting directory fileparts into sections using matlab / octave

I would like to split pathstr into separate parts. How can I do this? See the example below.
PS: I'm using octave 3.8.1
dpath='tmp/h1/cli/pls/03sox_a_Fs_1000/'
[pathstr,name,ext] = fileparts(dpath)
>>>pathstr = tmp/h1/cli/pls/03sox_a_Fs_1000
If all I want is 03sox_a_Fs_1000 or pls
How can I do this?
Please note the filenames will change and could be of different lengths.

You can use strsplit (here using Matlab) to split your string (believe it or not!) using the delimiter /:
pathstr = 'tmp/h1/cli/pls/03sox_a_Fs_1000'
[Name,~] = strsplit(pathstr,'/')
Now Name looks like this:
Name =
'tmp' 'h1' 'cli' 'pls' '03sox_a_Fs_1000'
So you can select the last element using the end keyword and curly braces since the output of strsplit is a cell array:
Name = Name{end}
or end-1 to retrieve pls.
This applies to names of any length or format, as long as they are separated by /.
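
The same split-and-index idea, sketched in JavaScript for readers working outside MATLAB/Octave. segmentFromEnd is a made-up helper name. Dropping empty pieces is an extra step I added so that a trailing slash, as in the question's dpath, does not shift the count.

```javascript
// Hypothetical helper: n-th path segment counted from the end (1 = last).
function segmentFromEnd(path, n) {
  // Splitting "a/b/" gives ["a", "b", ""]; filter(Boolean) drops the empty piece.
  const parts = path.split("/").filter(Boolean);
  return parts[parts.length - n];
}

const dpath = "tmp/h1/cli/pls/03sox_a_Fs_1000/";
console.log(segmentFromEnd(dpath, 1)); // 03sox_a_Fs_1000
console.log(segmentFromEnd(dpath, 2)); // pls
```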

MATLAB: Only pick filenames coinciding with some input string

Say I have a directory full of filenames such as:
1242349_blabla.wav
fdp23424_asdf.wav
o2349_0.wav
and I have an input text file listing unique IDs on each newline coinciding with numbers within these filenames (e.g. '23424' for the second filename above).
I'd like to construct a struct of filenames only containing those filenames in that directory that coincide with some ID in the input text file:
fid = fopen('input.txt');
input = textscan(fid, '%s', 'Delimiter', '\n');
filenames = dir(fullfile('/somedir/', '*.wav'));
for i = 1:length(filenames)
for j = 1:length(input)
if (strfind(input{1}(j), filenames(i).name)) ~= [])
% create new struct with chosen filenames
end
end
end
However, I get the error "undefined function 'ne' for input arguments of type 'cell'". I've tried loads of options to no avail. Also, the input evaluates to a 38x1 cell, but which has length 1, so the inner loop will only go once... Any ideas?

Regular expressions are definitely the most flexible and powerful solution. But if your needs are modest, you can get away with something simpler, like using wildcards in your dir command. Try something like this:
%get your file IDs from the input file
fid = fopen('input.txt');
input = textscan(fid, '%s', 'Delimiter', '\n');
fclose(fid);
IDs = input{1};
%loop over each ID string
myfilenames = {};
for idx = 1:length(IDs)
    %get all files whose names contain the given ID
    fnames = dir(['somedir/*' IDs{idx} '*.wav']); %wildcards!
    %gather the new filenames that match
    for Ifname = 1:length(fnames)
        myfilenames{end+1} = fnames(Ifname).name;
    end
end

I would use regular expressions to search for occurrences of the ID in your cell array. Regular expressions are designed to search for patterns in a string. Because you want to search for specific numbers in a set of strings, I would certainly recommend them. Specifically, use the regexp function, with the ID you are searching for as the pattern.
How regexp works is that you can provide a cell array of strings, and the output will be another cell array where each element is a numeric array that determines the starting index of where the particular pattern you're looking for starts for a particular string in the cell array. Should the array be empty, this means that we didn't find any pattern that matched what you're looking for. If it isn't empty, then it will contain the starting index of where the ID is located in the string. This doesn't really matter - you want to determine whether the ID exists in a particular string, and so checking to see whether each array is empty is what will be useful.
As such, given your filenames that you read through dir, we can create a cell array that stores just the file names themselves, run regexp, then filter out those file names that don't contain the ID you want. Something like this:
f = dir(fullfile('/somedir/', '*.wav'));
filenames = {f.name};
ID = 23424;
check = regexp(filenames, num2str(ID));
filtered_ind = cellfun(@isempty, check);
final_files = f(~filtered_ind);
The first line of code reads the files from your desired directory. The second line of code extracts the names from each name field of the structure as a cell array. The third line is the ID you want to check for. The fourth line does a regexp call on the file names and searches for those file names that contain your desired number. Note that we need to convert the number to a string, as the pattern is expected to be a string. The next line after that finds those filenames that do not have the ID you are looking for, and the last line simply finds those files that do have the ID you're looking for.
You can then go ahead and start your processing. Specifically, you can loop over this cell array and go ahead and create your structures per element in this cell:
for i = 1:length(final_files)
s = final_files(i); %// Get the dir structure for a file that passed the ID check
%// Create your structure now...
%// ...
end
However, you have a series of IDs that you want to check. We can simply take the code above and apply a loop to it. In other words, you'd do something like:
fid = fopen('input.txt');
input = textscan(fid, '%s', 'Delimiter', '\n');
fclose(fid);
IDs = input{1};
f = dir(fullfile('/somedir/', '*.wav'));
filenames = {f.name};
for idx = 1 : length(IDs)
    %// Get an ID
    ID = IDs{idx};
    %// Do our checking and filter out those files that don't contain our ID
    check = regexp(filenames, ID);
    filtered_ind = cellfun(@isempty, check);
    final_files = f(~filtered_ind);
    %// Do your final processing
    for i = 1:length(final_files)
        s = final_files(i); %// Get the dir structure for a file that passed the ID check
        %// Create your structure now...
        %// ...
    end
end
With the above code, we open the text file, then parse each string that's in the text file and place it into a cell array called IDs. Note here that the IDs are now all strings, so there's no need to do any conversions. After, for each ID we have, we search our filenames to see which files have this ID we're looking for. We filter out those filenames that don't have this ID, then we loop over each one of these files and create our structures. We do this for each ID that we have.
Just to demonstrate that this regexp stuff is working, as a small example, let's use the three filenames you have provided with your post. I've placed these names in a cell array, then I'll run lines 3 to 5 in the code I wrote, then I will filter out those filenames that don't contain the ID we're looking for:
filenames = {'1242349_blabla.wav'; 'fdp23424_asdf.wav'; 'o2349_0.wav'};
ID = 23424;
check = regexp(filenames, num2str(ID));
filtered_ind = cellfun(@isempty, check);
final_filenames = filenames(~filtered_ind);
final_filenames is a cell array of the filenames that contain our ID. We thus get:
final_filenames =
'fdp23424_asdf.wav'
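
For comparison, here is the same "keep a filename if it contains any of the IDs" filter as a rough JavaScript sketch. filterByIds is a made-up helper name, the IDs are assumed to be plain strings as read from the text file, and skipping empty ID lines is my own addition, since an empty string would match every filename.

```javascript
// Hypothetical helper: keep the filenames that contain at least one of the IDs.
function filterByIds(filenames, ids) {
  // Blank lines in the ID file would match everything, so drop them first.
  const usable = ids.filter((id) => id.length > 0);
  return filenames.filter((name) => usable.some((id) => name.includes(id)));
}

const filenames = ["1242349_blabla.wav", "fdp23424_asdf.wav", "o2349_0.wav"];
console.log(filterByIds(filenames, ["23424"])); // [ 'fdp23424_asdf.wav' ]
```

This uses a plain substring test rather than a regular expression, which is enough when the IDs contain no pattern characters.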
Good luck!

Split string by first delimiter

I have a column with a long list of folder and file names. The folders and file names vary. I want to extract the file name from the column into another column, but I am struggling to do this in Excel.
Example of column data (files and folders altered to hide details that should not be public):
c:\data\1\nc2\media\ss\system media\ne\d - wnd enging works v5.swf
c:\data\1\nc2\media\ss\special campaigns\samns dec 2012\trainerv5.swf
C:\Local\Messages\17362~000000001~20131231235910~4.MUF
c:\data\1\nc2\media\ss\system media\tl\nd - tfl statusv4.swf
c:\data\1\nc2\media\ss\system media\core\ss_bagage v2.swf
I know I should be able to search from the right to the first occurrence of "\" but I can't figure out the syntax.
Many thanks
UPDATE:
Formula =RIGHT(B2,LEN(B2)-SEARCH("\",B2,1)) should work, but it shows incorrect results. If I change it to search for "." it pulls out the file extension, so there is a key item I'm missing.

=RIGHT(A1,LEN(A1)-FIND("~",SUBSTITUTE(A1,"\","~",LEN(A1)-LEN(SUBSTITUTE(A1,"\","")))))
Copy it into any column, say B, and drag it down. You are done.

VBA is a more efficient option if you have many files to parse. Create a module and add the below:
Function GetFileName(file As String) As String
    Dim fso As Object
    Set fso = CreateObject("Scripting.FileSystemObject")
    GetFileName = fso.GetFileName(file)
End Function

There are several different ways to get the text following the last slash in a string, including the following formula. In this example, H15 is the cell containing the string to search. If it can't find a slash, it returns the "-" (dash) character.
=iferror(RIGHT(H15,LEN(H15)-SEARCH("|",SUBSTITUTE(H15,"/","|",LEN(H15)-LEN(SUBSTITUTE(H15,"/",""))))),"-")
The formula first finds the number of slashes in the string. LEN gives the total length of the string, and LEN applied to the string after SUBSTITUTE has removed the slashes gives the length without them; the difference is the number of slashes.
Then you substitute a marker character (I used "|") for the last slash. By searching for the marker, you find where the bit after the slash starts. The total length of the string minus the marker's position tells you how many characters to take from the right, which you then do.
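
If it helps to see those steps outside a worksheet, here is a rough JavaScript sketch that follows the formula step by step. textAfterLast is a made-up helper name, and the separator is assumed to be a single character, as it is in the formula.

```javascript
// Hypothetical helper mirroring the formula: text after the last separator.
function textAfterLast(text, sep) {
  // Step 1: count separators, like LEN(text) - LEN(SUBSTITUTE(text, sep, "")).
  const count = text.length - text.split(sep).join("").length;
  if (count === 0) {
    return "-"; // same fallback the IFERROR wrapper gives
  }
  // Step 2: walk to the count-th (last) separator, where the marker would go.
  let pos = -1;
  for (let i = 0; i < count; i++) {
    pos = text.indexOf(sep, pos + 1);
  }
  // Step 3: keep everything to the right of it, like RIGHT(text, LEN(text) - position).
  return text.slice(pos + 1);
}

console.log(textAfterLast("media/ss/core/ss_bagage v2.swf", "/")); // ss_bagage v2.swf
console.log(textAfterLast("no separators here", "/"));             // -
```

Passing "\\" as the separator handles Windows-style paths like the ones in the question.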

If you need more generic string parsing and are willing to use a little bit of VBA, you can use the split function as suggested by Jamie Bull in his answer to this question on SuperUser.
His function will use any character you choose to split the string into segments and return whichever segment you choose.
I've copied Jamie's function here for convenient reference:
Function STR_SPLIT(str, sep, n) As String
Dim V() As String
V = Split(str, sep)
STR_SPLIT = V(n - 1)
End Function
