About twelve years ago, I wrote a small VB.NET application that loads strings from files. These strings may contain one or more of the following characters: à, è, é, ì, ò, ù, ä, ö. The application uses a special custom font (JazzText Extended) that does not have those special characters. Yet, I somehow managed to make the application display words correctly in that font, and twelve years later, I have no idea how - thanks for not leaving a line of comment, past me!
The program has the following routine:
Private Sub SetWord(ByVal word() As String)
Dim nword(3) As String
nword(0) = word(0)
nword(1) = word(1)
nword(2) = word(2)
For i As Integer = 0 To 2
nword(i) = nword(i).Replace("à", "")
nword(i) = nword(i).Replace("é", "")
nword(i) = nword(i).Replace("è", "")
nword(i) = nword(i).Replace("ì", "ê")
nword(i) = nword(i).Replace("ò", "")
nword(i) = nword(i).Replace("ù", "")
nword(i) = nword(i).Replace("ä", "")
nword(i) = nword(i).Replace("ö", "")
Next
lblItaWord.Text = nword(0).ToUpper
lblEngWord.Text = nword(1).ToUpper
lblFinWord.Text = nword(2).ToUpper
End Sub
It takes an array of three words and, in each of them, replaces any of the special characters with... something. It then converts the words to upper case and assigns each of them to one of three labels.
In Visual Studio, the replacement characters look like empty strings. I had to put the cursor in between the quotation marks to realise that it was in fact not an empty string and there was an invisible character there. Here on SO... I'm not sure what you'll see. You might see just a square, or some other weird character. (The ê character is an exception, it seems to display in the same way everywhere.)
If you copy and paste any of the invisible/square characters into Google and search for it, you'll get a different representation that uses two characters; for example, the first one translates to ‡. Using this pair in place of the invisible/square character in the Replace method does not produce the correct result. FYI, the encoding I use to read the files (the default one used by IO.StreamReader if you don't specify any encoding) works fine: if I use a more standard font, all special characters display correctly without using the SetWord sub at all.
Now, I have absolutely no idea how those characters, whatever they may be, manage to make the app display the words correctly when the font I use does not have those characters. I have no idea how I found out about this trick, either. Right now, my problem is that I would like to replace those squares/invisible characters with something intelligible, and I have no idea how. Any ideas?
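One way to make such invisible characters intelligible is to print their code points instead of the characters themselves. Below is a minimal sketch in Python (chosen only for brevity). The character U+0087 is purely an assumption for illustration: it is a control character that most editors render as nothing or as a box, and reading its UTF-8 bytes as Windows-1252 produces a two-character pair ending in ‡. Whether that is what the original source contains is not confirmed.

```python
def code_points(text):
    """Return the Unicode code point of every character, e.g. 'U+0087'."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Hypothetical suspect: a C1 control character (an assumption, not taken from the source).
suspect = "\u0087"
print(code_points("caf" + suspect))  # ['U+0063', 'U+0061', 'U+0066', 'U+0087']

# The same character seen through the wrong encoding:
# its UTF-8 bytes (C2 87) interpreted as Windows-1252 text.
print(suspect.encode("utf-8").decode("cp1252"))  # Â‡
```

In VB.NET the same idea would be to print AscW(c) for each character of the replacement string, and then write the replacement as ChrW(...) with that number rather than as an invisible literal.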
I was looking for a solution and I found it here:
replacing many words every one with alternative word
But now I'm using an alternative code that I got from the link below that post, which is case sensitive.
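Function SubstituteMultipleCS(text As String, old_text As Range, new_text As Range) As String
    Dim i As Long
    Dim result As String
    result = text
    For i = 1 To old_text.Cells.Count
        ' Replace is case sensitive by default (binary comparison, unless Option Compare Text is set)
        result = Replace(result, old_text.Cells(i).Value, new_text.Cells(i).Value)
    Next i
    SubstituteMultipleCS = result
End Function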
I'm using it to make German Anki cards so I need to replace some words with ___. It's working with one single word or a bunch of words if they are together, but...
The problem is the following:
Some verb conjugations require a sentence structure where the main verb comes after the noun and the particle, which belongs to the verb, goes at the end. Something like this:
As you can see in the picture, the verb "schaute an" is not replaced by the new word because "schaute" is separated from "an" in the original sentence.
Is there any way to fix this?
Thank you.
Here is a formula you may use (which works for your current sample data):
Formula in C2:
=IFERROR(TRIM(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(" "&SUBSTITUTE(B2,"."," ")&" "," "&FILTERXML("<t><s>"&SUBSTITUTE(A2," ","</s><s>")&"</s></t>","//s[position() = 1]")&" ",D2,1),IFERROR(" "&FILTERXML("<t><s>"&SUBSTITUTE(A2," ","</s><s>")&"</s></t>","//s[position() = 2]")&" ",""),D2,1),IFERROR(" "&FILTERXML("<t><s>"&SUBSTITUTE(A2," ","</s><s>")&"</s></t>","//s[position() = 3]")&" ",""),D2,1))&".","")
The advantage of nested SUBSTITUTE calls is that each one can be told to replace only the first occurrence, in case a sentence contains the same word more than once. Not sure if it's watertight.
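The idea behind the formula (split the verb into its words and replace each word on its own, first occurrence only) can be sketched outside Excel as well. This is only an illustration in Python: the function name, the sample sentence and the whole-word matching rule are assumptions, not part of the formula above.

```python
import re


def blank_phrase_words(sentence, phrase, blank="___"):
    """Replace the first whole-word occurrence of each word in `phrase`."""
    result = sentence
    for word in phrase.split():
        # Match the word only when it is not glued to other word characters.
        pattern = r"(?<!\w)" + re.escape(word) + r"(?!\w)"
        # count=1 mirrors the "first occurrence only" argument of SUBSTITUTE.
        result = re.sub(pattern, lambda m: blank, result, count=1)
    return result


# Assumed example sentence with a separable verb.
print(blank_phrase_words("Er schaute mich an.", "schaute an"))  # Er ___ mich ___.
```

Because each word is handled separately, it does not matter how many other words sit between the verb and its particle.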
Hi, I would like to dynamically extract numbers from strings in Excel.
I have the following strings, and I would like to pull only the number before ".pdf" out of each string into the next column.
As you can see, the number of characters varies from line to line.
I came up with something like this:
=MID(M20;SEARCH("_";M20);20)
But this returns everything from the "_" onwards, including the .pdf at the end.
How can I get just the number?
D:\Users\xxxx\Desktop\1610\ts25b_4462.pdf
D:\Users\xxx\Desktop\1610\ts02b_39522.pdf
D:\Users\xxxxx\Desktop\1610\ts02b_except_39511.pdf
D:\Users\xxxx\Desktop\1610\ts02b_except_39555.pdf
D:\Users\xxxx\Desktop\1610\ts22b_6118.pdf
So that I have just:
4462
39522
39511
39555
6118
and so on...
Thank you!!!
With VBA, try to do it like this:
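Public Function splitThings(strInput As String) As String
    Dim parts() As String
    parts = Split(strInput, "_")
    ' Use the segment after the LAST underscore (some names contain two), then drop the extension
    splitThings = Split(parts(UBound(parts)), ".")(0)
End Function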
Concerning your formula, try =LEFT(MID(M20;SEARCH("_";M20)+1;20);K), where K is the length of the text that MID returns minus 4 (the length of .pdf). For ts22b_6118.pdf, MID returns 6118.pdf, so K is 8 - 4 = 4.
Something like this should do the work:
=LEFT(MID(I3,SEARCH("_",I3)+1,LEN(I3)),LEN(MID(I3,SEARCH("_",I3),LEN(I3)))-5)
You can do it with an Excel formula. For example:
=SUBSTITUTE(LEFT(A1,FIND(".pdf",A1)-1),LEFT(A1,FIND("_",A1)),"")
Using the first line as an example, LEFT(A1,FIND(".pdf",A1)-1) gives D:\Users\xxxx\Desktop\1610\ts25b_4462 and LEFT(A1,FIND("_",A1)) gives D:\Users\xxxx\Desktop\1610\ts25b_. SUBSTITUTE then replaces the second string with "" inside the first, which leaves 4462.
Hope this can help.
With this formula, you should be able to get the numbers you want:
=MID(A1,FIND("|",SUBSTITUTE(A1,"_","|",LEN(A1)-LEN(SUBSTITUTE(A1,"_",""))))+1,FIND(".",A1)-FIND("|",SUBSTITUTE(A1,"_","|",LEN(A1)-LEN(SUBSTITUTE(A1,"_",""))))-1)
Basically, this is the initial formula:
=MID(A1,FIND("_",A1)+1,FIND(".",A1)-FIND("_",A1)-1)
But since there may be more than one _ in the string, this replaces the last _ with a | so that it can be found:
=SUBSTITUTE(A1,"_","|",LEN(A1)-LEN(SUBSTITUTE(A1,"_","")))
Now put this SUBSTITUTE in place of A1 inside the FIND("_",A1) calls above, search for "|" instead of "_", and you get that long formula. Hope this helps.
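To make the two steps easier to follow, here is a rough Python sketch of the same logic. The helper substitute_nth is a hypothetical stand-in for Excel's SUBSTITUTE with an instance number; it is not a built-in, and the function names are made up for this example.

```python
def substitute_nth(text, old, new, instance):
    """Mimic SUBSTITUTE(text, old, new, instance): replace only the n-th match."""
    if instance < 1:
        return text
    start = 0
    for _ in range(instance):
        pos = text.find(old, start)
        if pos == -1:
            return text  # fewer matches than `instance`: leave unchanged
        start = pos + len(old)
    return text[:pos] + new + text[pos + len(old):]


def number_before_extension(path):
    underscores = path.count("_")  # LEN(A1)-LEN(SUBSTITUTE(A1,"_",""))
    marked = substitute_nth(path, "_", "|", underscores)  # last "_" becomes "|"
    start = marked.find("|") + 1  # first character after the marker
    end = path.find(".")  # like FIND(".",A1): position of the first dot
    return path[start:end]


print(number_before_extension(r"D:\Users\xxxxx\Desktop\1610\ts02b_except_39511.pdf"))  # 39511
```

Like the formula, this assumes that the path contains no "|" and that the first dot in the path is the one before the extension.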
This will return the number you want regardless of extension (could be .pdf, could be .xlsx, etc) and regardless of the number of underscores present in the filename and/or filepath:
=TRIM(LEFT(RIGHT(SUBSTITUTE(SUBSTITUTE(M20,".",REPT(" ",LEN(M20))),"_",REPT(" ",LEN(M20))),LEN(M20)*2),LEN(M20)))
I have an Excel spreadsheet containing a list of strings. Each string is made up of several words, but the number of words in each string is different.
Using built-in Excel functions (no VBA), is there a way to isolate the last word in each string?
Examples:
Are you classified as human? -> human?
Negative, I am a meat popsicle -> popsicle
Aziz! Light! -> Light!
This one is tested and does work (based on Brad's original post):
=RIGHT(A1,LEN(A1)-FIND("|",SUBSTITUTE(A1," ","|",
LEN(A1)-LEN(SUBSTITUTE(A1," ","")))))
If your original strings could contain a pipe "|" character, then replace both in the above with some other character that won't appear in your source. (I suspect Brad's original was broken because an unprintable character was removed in the translation).
Bonus: How it works (from right to left):
LEN(A1)-LEN(SUBSTITUTE(A1," ","")) – Count of spaces in the original string
SUBSTITUTE(A1," ","|", ... ) – Replaces just the final space with a |
FIND("|", ... ) – Finds the absolute position of that replaced | (that was the final space)
RIGHT(A1,LEN(A1) - ... ) – Returns all characters after that |
EDIT: to account for the case where the source text contains no spaces, add the following to the beginning of the formula:
=IF(ISERROR(FIND(" ",A1)),A1, ... )
making the entire formula now:
=IF(ISERROR(FIND(" ",A1)),A1, RIGHT(A1,LEN(A1) - FIND("|",
SUBSTITUTE(A1," ","|",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))))
Or you can use the =IF(COUNTIF(A1,"* *") syntax of the other version.
If the original string might end with a space, apply TRIM when counting the spaces, which makes the formula:
=IF(ISERROR(FIND(" ",B2)),B2, RIGHT(B2,LEN(B2) - FIND("|",
SUBSTITUTE(B2," ","|",LEN(TRIM(B2))-LEN(SUBSTITUTE(B2," ",""))))))
This is the technique I've used with great success:
=TRIM(RIGHT(SUBSTITUTE(A1, " ", REPT(" ", 100)), 100))
To get the first word in a string, just change RIGHT to LEFT:
=TRIM(LEFT(SUBSTITUTE(A1, " ", REPT(" ", 100)), 100))
Also, replace A1 by the cell holding the text.
A more robust version of Jerry's answer:
=TRIM(RIGHT(SUBSTITUTE(TRIM(A1), " ", REPT(" ", LEN(TRIM(A1)))), LEN(TRIM(A1))))
That works regardless of the length of the string, leading or trailing spaces, or whatever else and it's still pretty short and simple.
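For anyone wondering why the padding trick works, here is a small Python sketch of the same steps. It is only an illustration: the function name is made up, and the trimming line is an approximation of what Excel's TRIM does with ordinary spaces.

```python
def last_word(text):
    # Like TRIM: strip the ends and collapse runs of spaces into one.
    trimmed = " ".join(part for part in text.split(" ") if part)
    width = len(trimmed)
    if width == 0:
        return ""
    # SUBSTITUTE(..., " ", REPT(" ", width)): every gap becomes `width` spaces wide.
    padded = trimmed.replace(" ", " " * width)
    # RIGHT(..., width): the last `width` characters can only hold the last
    # word plus padding, because each gap is at least as wide as the window.
    return padded[-width:].strip(" ")


print(last_word("Negative, I am a meat popsicle"))  # popsicle
```

Using the trimmed length instead of a fixed 100 is what removes the limit on how long the string can be.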
I found this on Google, tested it in Excel 2003, and it works for me:
=IF(COUNTIF(A1,"* *"),RIGHT(A1,LEN(A1)-LOOKUP(LEN(A1),FIND(" ",A1,ROW(INDEX($A:$A,1,1):INDEX($A:$A,LEN(A1),1))))),A1)
[edit] I don't have enough rep to comment, so this seems the best place...BradC's answer also doesn't work with trailing spaces or empty cells...
[2nd edit] actually, it doesn't work for single words either...
=RIGHT(TRIM(A1),LEN(TRIM(A1))-FIND(CHAR(7),SUBSTITUTE(" "&TRIM(A1)," ",CHAR(7),
LEN(TRIM(A1))-LEN(SUBSTITUTE(" "&TRIM(A1)," ",""))+1))+1)
This is very robust: it works for sentences with no spaces, leading/trailing spaces, multiple spaces, multiple leading/trailing spaces... and I used CHAR(7) for the delimiter rather than the vertical bar "|" just in case the bar appears in the text itself.
This is very clean and compact, and works well.
{=RIGHT(A1,LEN(A1)-MAX(IF(MID(A1,ROW(1:999),1)=" ",ROW(1:999),0)))}
It does not error trap for no spaces or one word, but that's easy to add.
Edit:
This handles trailing spaces, single word, and empty cell scenarios. I have not found a way to break it.
{=RIGHT(TRIM(A1),LEN(TRIM(A1))-MAX(IF(MID(TRIM(A1),ROW($1:$999),1)=" ",ROW($1:$999),0)))}
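The array formula checks every character position, keeps the largest position that holds a space, and returns everything to the right of it. A minimal Python sketch of that scan follows; the function name is hypothetical and only the ends of the string are trimmed here.

```python
def last_word_by_scan(text):
    trimmed = text.strip(" ")
    # MAX(IF(MID(s, pos, 1) = " ", pos, 0)) over 1-based positions;
    # 0 is the result when the string contains no space at all.
    last_space = max(
        (pos for pos, ch in enumerate(trimmed, start=1) if ch == " "),
        default=0,
    )
    # RIGHT(s, LEN(s) - last_space) is everything after that position.
    return trimmed[last_space:]


print(last_word_by_scan("My little cat "))  # cat
```

Unlike ROW($1:$999) in the formula, the sketch is not limited to 999 characters.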
=RIGHT(A1,LEN(A1)-FIND("`*`",SUBSTITUTE(A1," ","`*`",LEN(A1)-LEN(SUBSTITUTE(A1," ","")))))
New answer 9/28/2022
Considering the new Excel function TEXTAFTER (check availability), you can achieve it with a simple formula:
=TEXTAFTER(A1," ", -1)
To add to Jerry and Joe's answers, if you're wanting to find the text BEFORE the last word you can use:
=TRIM(LEFT(SUBSTITUTE(TRIM(A1), " ", REPT(" ", LEN(TRIM(A1)))), LEN(SUBSTITUTE(TRIM(A1), " ", REPT(" ", LEN(TRIM(A1)))))-LEN(TRIM(A1))))
With 'My little cat' in A1, this would result in 'My little' (where Joe and Jerry's would give 'cat').
In the same way that Jerry and Joe isolate the last word, this just gets everything to the left of it (then trims it back).
Copy into a column, select that column and go to HOME > Editing > Find & Select > Replace. In Find what, enter an asterisk followed by a space (* ), leave Replace with empty, and click Replace All.
Imagine the string could be reversed. Then it is really easy. Instead of working on the string:
"My little cat" (1)
you work with
"tac elttil yM" (2)
With =LEFT(A1;FIND(" ";A1)-1) in A2 you get "My" with (1) and "tac" with (2), which is reversed "cat", the last word in (1).
There are a few VBAs around to reverse a string. I prefer the public VBA function ReverseString.
Install the above as described. Then with your string in A1, e.g., "My little cat" and this function in A2:
=ReverseString(LEFT(ReverseString(A1);IF(ISERROR(FIND(" ";A1));
LEN(A1);(FIND(" ";ReverseString(A1))-1))))
you'll see "cat" in A2.
The method above assumes that words are separated by blanks. The IF clause is for cells containing a single word, i.e. no blanks in the cell. Note: applying TRIM and CLEAN to the original string is useful as well. In principle it reverses the whole string from A1 and simply finds the first blank in the reversed string, which follows the last (reversed) word (i.e., "tac "). LEFT picks this word and another string reversal restores the original order of the word ("cat"). The -1 at the end of the FIND statement removes the blank.
The idea is that it is easy to extract the first(!) word in a string with LEFT by FINDing the first blank. For the last(!) word, however, RIGHT alone does not help, because unfortunately FIND does not have a flag for the direction in which you want to search your string.
Therefore the whole string is simply reversed. LEFT and FIND work as normal, but the extracted string is reversed. That is no big deal once you know how to reverse a string: the first ReverseString call in the formula does this job.
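The reverse, take the first word, reverse again sequence can be sketched in a few lines of Python. This is an illustration of the idea only: the function name is made up and slicing with [::-1] plays the role of ReverseString.

```python
def last_word_via_reverse(text):
    reversed_text = text[::-1]  # "My little cat" -> "tac elttil yM"
    space = reversed_text.find(" ")  # first blank in the reversed string
    # No blank means the cell holds a single word: keep all of it.
    first_of_reversed = reversed_text if space == -1 else reversed_text[:space]
    return first_of_reversed[::-1]  # "tac" -> "cat"


print(last_word_via_reverse("My little cat"))  # cat
```

As with the formula, a trailing blank in the input would give an empty result, which is why trimming the string first is a good idea.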
=LEFT(A1,FIND(IF(
ISERROR(
FIND("_",A1)
),A1,RIGHT(A1,
LEN(A1)-FIND("~",
SUBSTITUTE(A1,"_","~",
LEN(A1)-LEN(SUBSTITUTE(A1,"_",""))
)
)
)
),A1,1)-2)
I translated to PT-BR, as I needed this as well.
(Please note that I've changed the space to \ because I needed the filename only of path strings.)
=SE(ÉERRO(PROCURAR("\",A1)),A1,DIREITA(A1,NÚM.CARACT(A1)-PROCURAR("|", SUBSTITUIR(A1,"\","|",NÚM.CARACT(A1)-NÚM.CARACT(SUBSTITUIR(A1,"\",""))))))
Another way to achieve this is as below
=IF(ISERROR(TRIM(MID(TRIM(D14),SEARCH("|",SUBSTITUTE(TRIM(D14)," ","|",LEN(TRIM(D14))-LEN(SUBSTITUTE(TRIM(D14)," ","")))),LEN(TRIM(D14))))),TRIM(D14),TRIM(MID(TRIM(D14),SEARCH("|",SUBSTITUTE(TRIM(D14)," ","|",LEN(TRIM(D14))-LEN(SUBSTITUTE(TRIM(D14)," ","")))),LEN(TRIM(D14)))))
You can achieve this also by reversing the string and finding the first space
=MID(C3,2+LEN(C3)-SEARCH(" ",CONCAT(MID(C3,SEQUENCE(LEN(C3),,LEN(C3),-1),1))),LEN(C3))
Reverse the string
CONCAT(MID(C3,SEQUENCE(LEN(C3),,LEN(C3),-1),1))
Find the first space in the reversed string
SEARCH(" ",...
Subtract the position of the space found in the reversed string from the length of the string and return the portion from there to the end
=MID(C3,2+LEN(C3)-SEARCH...
I also had a task like this and when I was done, using the above method, a new method occurred to me. Why don't you do this:
Reverse the string ("string one" becomes "eno gnirts").
Use the good old Find (which is hardcoded for left-to-right).
Reverse it into readable string again.
How does this sound?