splitting directory fileparts into sections using matlab / octave - string

I would like to split pathstr into separate parts how can I do this? See example below.
PS: I'm using octave 3.8.1
dpath='tmp/h1/cli/pls/03sox_a_Fs_1000/'
[pathstr,name,ext] = fileparts(dpath)
>>>pathstr = tmp/h1/cli/pls/03sox_a_Fs_1000
If all I want is 03sox_a_Fs_1000 or pls
How can I do this?
Please note the filenames will change and could be of different lengths.

You can use strsplit (here using Matlab) to split your string (believe it or not!) using the delimiter /:
pathstr = 'tmp/h1/cli/pls/03sox_a_Fs_1000'
[Name,~] = strsplit(pathstr,'/')
Now Name looks like this:
Name =
'tmp' 'h1' 'cli' 'pls' '03sox_a_Fs_1000'
So you can select the last element using the end keyword and curly braces since the output of strsplit is a cell array:
Name = Name{end}
or end-1 to retrieve pls.
This applies to names of any length or format, as long as they are separated by /.

Related

How to extract the characters from a string in Excel

Hi I would like to extract dynamically the numbers from string in Excel.
I have the following strings and I would like to have only the numbers before ". pdf". taken out of the string into the next column.
As you can see the number of characters varies from line to line.
I have invented something like this:
=MID(M20;SEARCH("_";M20);20)
But this takes out only the numbers after "_" and .pdf after this....
How to make it the way I like?
D:\Users\xxxx\Desktop\1610\ts25b_4462.pdf
D:\Users\xxx\Desktop\1610\ts02b_39522.pdf
D:\Users\xxxxx\Desktop\1610\ts02b_except_39511.pdf
D:\Users\xxxx\Desktop\1610\ts02b_except_39555.pdf
D:\Users\xxxx\Desktop\1610\ts22b_6118.pdf
So that I have just :
4462
39522
39511
39555
6118
and so on...
Thank you!!!
With VBA, try to do it like this:
Public Function splitThings(strInput As String) As String
splitThings = Split(Split(strInput, "_")(1), ".")(0)
End Function
Concerning your formula, try to use =LEFT(MID(M20;SEARCH("_";M20);20),K), where K is the difference of the length of ts22b_6118.pdf and 4 (.pdf). 4 is the length of .pdf.
Something like this should do the work:
=LEFT(MID(I3,SEARCH("_",I3)+1,LEN(I3)),LEN(MID(I3,SEARCH("_",I3),LEN(I3)))-5)
You should do it using Excel formula. For example:
=SUBSTITUTE(LEFT(A1,FIND(".pdf",A1)-1),LEFT(A1,FIND("_",A1)),"")
Using the first line as an example, with LEFT(A1,FIND(".pdf",A1)-1) you will have D:\Users\xxxx\Desktop\1610\ts25b_4462 and with the LEFT(A1,FIND("_",A1)) D:\Users\xxxx\Desktop\1610\ts25b_, if you SUBSTITUTE the first part by "" you will have 4462.
Hope this can help.
With this formula, you should be able to get the numbers you want:
=MID(A1,FIND("|",SUBSTITUTE(A1,"_","|",LEN(A1)-LEN(SUBSTITUTE(A1,"_",""))))+1,FIND(".",A1)-FIND("|",SUBSTITUTE(A1,"_","|",LEN(A1)-LEN(SUBSTITUTE(A1,"_",""))))-1)
Basically, this is the initial fomula:
=MID(A1,FIND("_",A1)+1,FIND(".",A1)-FIND("_",A1)-1)
But since there may be two _ in the string so this is the one to find the 2nd _:
=SUBSTITUTE(A1,"_","|",LEN(A1)-LEN(SUBSTITUTE(A1,"_","")))
Now just replace this SUBSTITUTE with A1 above and you get that long formula. Hope this helps.
This will return the number you want regardless of extension (could be .pdf, could be .xlsx, etc) and regardless of the number of underscores present in the filename and/or filepath:
=TRIM(LEFT(RIGHT(SUBSTITUTE(SUBSTITUTE(M20,".",REPT(" ",LEN(M20))),"_",REPT(" ",LEN(M20))),LEN(M20)*2),LEN(M20)))

How to find and replace a string in Matlab

So here is my problem:
I have a list of names in Matlab in a cell array.
I automatically create directories and .mat files for each name.
My problem is that some of these names contains '/' and therefore everything go wrong when I create the directory…
So I am trying to find an efficient way to find '/' and replace them.
So far I've tried to find them using the findstr function. It then gives me a cell array with the indexes where '/' appears. So when the name doesn't contain any '/' it returns {[]} and when the function find it, it returns {[i]}.
Now i'd like to have a logical condition that says if findstr is not empty then do something. I've tried with the isempty function but it doesn't work (it's never empty…)
So does anyone have a solution to this?
Thanks
Use regexprep to replace the character:
list = {'aaa', 'bb/cc', '/dd/'};
replace_from = '/'; %// character to be replaced
replace_to = '_'; %// replacing character
list_replaced = regexprep(list, replace_from, replace_to);
gives
list_replaced =
'aaa' 'bb_cc' '_dd_'

rstrip() has no effect on string

Trying to use rstrip() at its most basic level, but it does not seem to have any effect at all.
For example:
string1='text&moretext'
string2=string1.rstrip('&')
print(string2)
Desired Result:
text
Actual Result:
text&moretext
Using Python 3, PyScripter
What am I missing?
someString.rstrip(c) removes all occurences of c at the end of the string. Thus, for example
'text&&&&'.rstrip('&') = 'text'
Perhaps you want
'&'.join(string1.split('&')[:-1])
This splits the string on the delimiter "&" into a list of strings, removes the last one, and joins them again, using the delimiter "&". Thus, for example
'&'.join('Hello&World'.split('&')[:-1]) = 'Hello'
'&'.join('Hello&Python&World'.split('&')[:-1]) = 'Hello&Python'

Split string by first delimiter

I have a column with a long list of folder and file names. The folders and file names vary. I want to extract the file name from the column into another column but I struggling to do this in Excel.
Example of column data:(files and folder altered to hide details that should not be public)
c:\data\1\nc2\media\ss\system media\ne\d - wnd enging works v5.swf
c:\data\1\nc2\media\ss\special campaigns\samns dec 2012\trainerv5.swf
C:\Local\Messages\17362~000000001~20131231235910~4.MUF
c:\data\1\nc2\media\ss\system media\tl\nd - tfl statusv4.swf
c:\data\1\nc2\media\ss\system media\core\ss_bagage v2.swf
I know I should be able to search from the right to the first occurence of "\" but I can't figure out the syntax.
Many thanks
UPDATE:
Formula =RIGHT(B2,LEN(B2)-SEARCH("\",B2,1)) should work, but it shows incorrect results. But If I change it to search for "." it pulls out the file extension. So there is a key item I'm missing
=RIGHT(A1,LEN(A1)-FIND("~",SUBSTITUTE(A1,"\","~",LEN(A1)-LEN(SUBSTITUTE(A1,"\","")))))
copy it in any column say b drag down,you are done
VBA is a more efficient option if you have many files to parse. Create a module and add the below:
Function GetFileName(file As String) As String
Set fso = CreateObject("Scripting.FileSystemObject")
GetFileName = fso.GetFileName(file)
End Function
There are several different ways to get the text following the last slash in a string, including the following formula. In this example, H15 is the cell containing the string to search. If it can't find a slash, it returns the "-" (dash) character.
=iferror(RIGHT(H15,LEN(H15)-SEARCH("|",SUBSTITUTE(H15,"/","|",LEN(H15)-LEN(SUBSTITUTE(H15,"/",""))))),"-")
The formula first finds the number of slashes in the string. LEN gives the total length of the string, and LEN of the string without slashes after using SUBSTITUTE to eliminate the slashes in the original string - the difference is the number of slashes.
Then, you substitute in a marker character(I used "|") for the last slash. By searching for the marker, you find where the bit after the slash starts. The total length of the string minus where the marker starts tells you how many characters to take from the right, which you then do.
If you need more generic string parsing and are willing to use a little bit of VBA, you can use the split function as suggested by Jamie Bull in his answer to this question on SuperUser.
His function will use any character you choose to split the string into segments and return whichever segment you choose.
I've copied Jamie's function here for convenient reference:
Function STR_SPLIT(str, sep, n) As String
Dim V() As String
V = Split(str, sep)
STR_SPLIT = V(n - 1)
End Function

How to store string matrix and write to a file?

I don't know if Matlab can do this, but I want to store some strings in a 4×3 matrix, each element in the matrix is a string.
test_string_01 test_string_02 test_string_03
test_string_04 test_string_05 test_string_06
test_string_07 test_string_08 test_string_09
test_string_10 test_string_11 test_string_12
Then, I want to write this matrix into a plain text file, either comma or space delimited.
test_string_01,test_string_02,test_string_03
test_string_04,test_string_05,test_string_06
test_string_07,test_string_08,test_string_09
test_string_10,test_string_11,test_string_12
Seems like matrix data type is not capable of storing strings. I looked at cell. I tried to use dlmwrite() or csvwrite(), but both of them only accept matrices. I also tried cell2mat() first, but in that way all letters in the strings are comma seperated, like
t,e,s,t,_,s,t,r,i,n,g,_,0,1,t,e,s,t,_,s,t,r,i,n,g,_,0,2,t,e,s,t,_,s,t,r,i,n,g,_,0,3
So is there any way to achieve this?
It is possible to shorten yuk's solution a bit.
strings = {
'test_string_01','test_string_02','test_string_03'
'test_string_04','test_string_05','test_string_06'
'test_string_07','test_string_08','test_string_09'
'test_string_10','test_string_11','test_string_12'};
fid = fopen('output.txt','w');
fmtString = [repmat('%s\t',1,size(strings,2)-1),'%s\n'];
fprintf(fid,fmtString,strings{:});
fclose(fid);
Cell array is the way to store strings.
I agree it's a pain to save strings into a text file, but you can do it with this code:
strings = {
'test_string_01','test_string_02','test_string_03'
'test_string_04','test_string_05','test_string_06'
'test_string_07','test_string_08','test_string_09'
'test_string_10','test_string_11','test_string_12'};
fid = fopen('output.txt','w');
for row = 1:size(strings,1)
fprintf(fid, repmat('%s\t',1,size(strings,2)-1), strings{row,1:end-1});
fprintf(fid, '%s\n', strings{row,end});
end
fclose(fid);
Substitute \t with , to get csv file.
You can also store cell array of strings into Excel file with XLSWRITE (requires COM interface, so it's on Windows only):
xlswrite('output.xls',strings)
In most cases you can use the delimiter ' ' and get Matlab to save a string into file with dlmwrite.
For example,
output=('my_first_String');
dlmwrite('myfile.txt',output,'delimiter','')
will save a file named myfile.txt containing my_first_String.

Resources