Is it possible to compare two strings in a SharePoint 2013 workflow?
e.g. in an If statement, If "Apple" less than "Banana" ...
Expected result is true: "Apple" is less than "Banana" (because "Apple" is alphabetically before "Banana")
Create a String variable named alphabet and set its value to aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ. Please double-check that it matches the English alphabet, because I learned it 27 years ago.
Copy the first character from both of the Strings Apple and Banana into substrings.
Use the action find Substring in String against both Substrings. Save the result to Integer variables String1index and String2index.
Add usual condition If String1index is less than or equal to String2index
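The steps above can be checked outside the workflow designer. The following is only a sketch in Python, not SharePoint workflow code; the function names are hypothetical, and the alphabet string assumes the standard 26-letter English ordering. Note that, like the workflow steps, it compares only the first character of each string.

```python
# Ordering string: a character's position in it defines the sort order.
ALPHABET = "aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ"


def first_char_index(s: str) -> int:
    """Return the position of the first character of s in ALPHABET, or -1."""
    if not s:
        return -1
    return ALPHABET.find(s[0])


def comes_before_or_equal(s1: str, s2: str) -> bool:
    """Mirror the workflow condition: String1index <= String2index."""
    return first_char_index(s1) <= first_char_index(s2)


print(comes_before_or_equal("Apple", "Banana"))
```

Two strings that start with the same letter compare as equal here, so a full alphabetical comparison would need to repeat the step for later characters.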
I have a field that has first and last names. Some names include a middle initial, some names include a suffix.
I am trying to find a formula that only pulls the last name regardless of which format it is in.
Example format
Donald P Bellisario --> Bellisario
Dale Earnhardt Jr --> Earnhardt
Jimmy M Butler III --> Butler
Kanye E West --> West
Joseph Biden --> Biden
Formula 1: =TRIM(RIGHT(SUBSTITUTE(AS9," ",REPT(" ",LEN(AS9))),LEN(AS9)))
Formula 2: =RIGHT(AS9,LEN(AS9)-FIND("*",SUBSTITUTE(AS9," ","*",LEN(AS9)-LEN(SUBSTITUTE(AS9," ","")))))
Formulas 1 and 2 do not ignore suffixes and return the suffix when one is present, e.g. Jack Smith Jr --> Jr
Formula 3: =SUBSTITUTE(TRIM(RIGHT(SUBSTITUTE(TRIM(SUBSTITUTE($AS9,IFERROR(RIGHT($AS9,LEN(AS9)-FIND(" ",$AS9)-10),""),""))," ",REPT(" ","99")),99)),",","")
Formula 3 skips the middle initial but returns only the first 10 characters after the end of the first name, so longer last names are truncated (e.g. Heisenberger --> Heisenberg).
Truth is, working with names can be subject to various edge-cases that will prove a working solution wrong at some point. But for those samples shown I'd use FILTERXML() to "split" these input strings on the spaces and use xpath expressions to filter out those substrings:
Formula in B1:
=FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ.', '')!=''][translate(., 'aeiouAEIOU', '')!=.][last()]")
The trick here is that there are three coherent xpath expressions working together:
[translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ.', '')!=''] - Assert that node is not nothing when all uppercase and dots have been substituted with nothing.
[translate(., 'aeiouAEIOU', '')!=.] - Assert that node is not equal to its original node when all vowels (upper- and lowercase) have been substituted with nothing.
[last()] - The last() function returns an integer equal to the context size from the expression evaluation context, so this predicate selects the last node that passed the previous two tests.
I'd guess that depending on possible edge-cases you could add more rules to the equation. For a more comprehensive insight on these expressions you could have a look here.
Good luck.
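For readers who want to test the three rules outside Excel, here is a rough Python equivalent of the same filtering logic. It is a sketch only: the function name is hypothetical, and it shares the limits of the XPath version (a last name with no vowel, for example, would be skipped).

```python
UPPER_AND_DOT = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ.")
VOWELS = set("aeiouAEIOU")


def last_name(full_name: str) -> str:
    """Return the last space-separated token that passes both filters."""
    candidates = [
        tok
        for tok in full_name.split()
        # Rule 1: something must remain after removing uppercase letters and
        # dots, which drops initials such as "P" and suffixes such as "III".
        if any(ch not in UPPER_AND_DOT for ch in tok)
        # Rule 2: the token must contain a vowel, which drops "Jr".
        and any(ch in VOWELS for ch in tok)
    ]
    # Rule 3: keep the last surviving token, like [last()].
    return candidates[-1] if candidates else ""


for name in ["Donald P Bellisario", "Dale Earnhardt Jr", "Jimmy M Butler III"]:
    print(name, "-->", last_name(name))
```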
Question relates to Excel (Office365):
I am seeking a solution that will extract a number with a length of 4 digits from a string.
A couple of examples of the type of strings I am referring to are:
"16016KT 9999 SCT030"
"PROB30 0500 FG BKN001"
"MOD TURB BLW 5000FT TILL302300"
"INTER 6000 SHRA SCT015"
Each of the above strings contains a combination of letters and numbers of varying lengths, with no set pattern.
The sequences of characters I am interested in are the standalone 4-digit numbers (in bold), not the 5000 in 5000FT.
The sequence of 4 digits is unique to all the strings I will be evaluating.
Thanks!
You may use:
=IFERROR(TEXT(FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[.*0=0][string-length()=4]"),"0000"),"Not found")
On more recent versions of Excel, you may try:
=RegexpFind(A1, "\b[0-9]{4}\b", 0)
See here for how to activate regex support in Excel.
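To see what the `\b[0-9]{4}\b` pattern does and does not match, it can be tried in Python's `re` module. This is an illustrative sketch, not Excel code, and the function name is hypothetical.

```python
import re

# Same pattern as in the formula above: exactly four digits with a
# word boundary on each side.
FOUR_DIGITS = re.compile(r"\b[0-9]{4}\b")


def find_four_digits(text: str):
    """Return the first standalone 4-digit token in text, or None."""
    match = FOUR_DIGITS.search(text)
    return match.group(0) if match else None


samples = [
    "16016KT 9999 SCT030",
    "PROB30 0500 FG BKN001",
    "MOD TURB BLW 5000FT TILL302300",
    "INTER 6000 SHRA SCT015",
]
for sample in samples:
    print(sample, "-->", find_four_digits(sample))
```

The third sample yields no match because 5000 is followed directly by a letter, so there is no word boundary after the digits.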
Another solution:
=IFERROR(TEXT(UNIQUE(SEQUENCE(9999)/(FIND(" " & TEXT(SEQUENCE(9999),"0000") &" ",A2)>0),,1),"0000"),"")
Another option
In B1, with the formula copied down:
=IFERROR(TEXT(0+MID(A1,SEARCH(" ???? ",A1)+1,4),"0000"),"not found")
I have a data frame with a column of text data. I want to remove words that mean nothing and convert negations like "isn't" to "is not". The problem is that when I remove punctuation, "isn't" becomes "isn t", and when I then remove words shorter than 2 letters, the "t" is deleted completely. So I want to do the following 3 tasks:
1) convert negations like "isn't" to "is not"
2) remove words that mean nothing
3) remove words shorter than 2 letters
For example, the df column looks similar to this:
user_id text data column
1 it's the coldest day
2 they aren't going
3 aa
4 how are you jkhf
5 v
6 ps
7 jkhf
The output should be-
user_id text data column
1 it is the coldest day
2 they are not going
3
4 how are you
5
6
7
How to implement this?
def is_repetitive(w):
    """Predicate, true for words like jj or aaaaa."""
    w = str(w)  # caller should have provided a single word as input
    return len(w) > 1 and all(c == w[0] for c in w[1:])
Feed all words in the corpus to that function,
to accumulate a list of repetitive words.
Then add such words to your list of stop words.
1) Use SpaCy or NLTK's lemmatization tools to convert strings (though they do other things like convert plural to singular as well - so you may end up needing to write your own code to do this).
2) Use stopwords from NLTK or spacy to remove the obvious stop words. Alternatively, feed them your own list of stop words (their default stop words are things like is, a, the).
3) Use a basic filter: if len < 2, remove the row.
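As a rough illustration of how the three tasks fit together without NLTK or spaCy, here is a minimal sketch in plain Python. The `CONTRACTIONS` and `NOISE_WORDS` tables are hypothetical and deliberately tiny; a real corpus would need much fuller lists. The key point is the order: contractions are expanded before punctuation is stripped.

```python
import re

# Hypothetical lookup tables, assumed for illustration only.
CONTRACTIONS = {"it's": "it is", "isn't": "is not", "aren't": "are not"}
NOISE_WORDS = {"jkhf", "ps"}


def is_repetitive(w: str) -> bool:
    """True for words like jj or aaaaa."""
    return len(w) > 1 and all(c == w[0] for c in w[1:])


def clean_text(text: str, min_len: int = 2) -> str:
    """Expand contractions, then drop noise, repetitive and short words."""
    kept = []
    for token in text.lower().split():
        # Task 1: expand the contraction while the apostrophe is still there.
        expanded = CONTRACTIONS.get(token, token)
        for word in expanded.split():
            word = re.sub(r"[^\w]", "", word)  # strip leftover punctuation
            # Tasks 2 and 3: filter meaningless and too-short words.
            if (
                len(word) >= min_len
                and word not in NOISE_WORDS
                and not is_repetitive(word)
            ):
                kept.append(word)
    return " ".join(kept)


print(clean_text("they aren't going"))
```

With pandas, something like `df["text"].apply(clean_text)` should apply it to a column, where `"text"` stands in for the actual column name.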
I have a simplified table for this problem example
Column A | Column B | Column C
war | 1 | war
War | 2
warred | 3
war and peace | 4
awful war | 5
dead war horse | 6
Now I need to find all rows containing the word "war" that is not case sensitive, but must be a separate word, not a part of another word.
For example
=SUMIF(A1:A6;C1;B1:B6)
right now finds only values "war" and "War" and SUM is 3.
I want it to find also values "war and peace", "awful war" and "dead war horse" since they all contain the word "war" and the SUM value should be 18.
I can't use search term
"*war*"
since this also includes the value "warred" and this is a separate word and shouldn't match.
One possibility is to create 4 different SUMIF-s with terms
war
war_*
*_war
*_war_*
_ is space
and then sum those four, but this is not that elegant.
I thought SUMPRODUCT with EXACT would work, but this seems to work over columns, not rows, and EXACT isn't suitable, I think.
Is there a way to match row based on word that is not case sensitive and then sum all the values in Column B that have a matching row?
You could use:
=SUMPRODUCT((ISNUMBER(SEARCH(" "&C1&" "," "&A1:A6&" ")))*B1:B6)
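The formula works by padding both the search term and each cell with a space on either side, so that "war" only matches as a whole word. The same idea can be sketched in Python to verify the expected total; the function name and the in-memory table are assumptions for illustration.

```python
# Column A and Column B from the example table.
ROWS = [
    ("war", 1),
    ("War", 2),
    ("warred", 3),
    ("war and peace", 4),
    ("awful war", 5),
    ("dead war horse", 6),
]


def sum_where_word(rows, word: str) -> int:
    """Sum values whose text contains word as a separate word, ignoring case."""
    needle = f" {word.lower()} "
    # Padding the text with spaces lets a match at the start or end count too.
    return sum(value for text, value in rows if needle in f" {text.lower()} ")


print(sum_where_word(ROWS, "war"))  # 18
```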
I have a string variable with short text strings. I want to replace all the text strings with numbers based on key words contained inside the individual cells.
Example: Some cells state "I like cats", while others state "I don't like the smell of wet dog".
I want to assign the value 1 to all cells containing the word cat, and the number 2 to all cells containing the word dog.
How do I do this?
This will put 1 in NewVar when "cat" appears in OldVar, 2 for "dog", 3 for "mouse":
do repeat wrd="cat" "dog" "mouse"/val= 1 2 3.
if index(OldVar, wrd)>0 NewVar=val.
end repeat.
This is only good if there will never be a cat AND a dog in the same sentence. If you do have such cases you should go this way:
do repeat wrd="cat" "dog" "mouse"/NewVar=cat dog mouse.
compute NewVar=char.index(OldVar, wrd)>0.
end repeat.
This will create a new variable for each of the possible words, putting 1 in cases where the word appears in OldVar, 0 when it doesn't.
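The difference between the two approaches can be sketched in Python (not SPSS syntax). The function names are hypothetical, and unlike the SPSS snippets above this sketch lowercases the text before searching, which is an added assumption.

```python
# Keyword-to-code mapping, in the same order as the DO REPEAT lists.
KEYWORDS = {"cat": 1, "dog": 2, "mouse": 3}


def keyword_code(text: str):
    """Single code per text; a later keyword overwrites an earlier one."""
    code = None  # stays None when no keyword is found
    for word, value in KEYWORDS.items():
        if word in text.lower():
            code = value
    return code


def keyword_flags(text: str) -> dict:
    """One 0/1 flag per keyword, so several keywords can be recorded at once."""
    return {word: int(word in text.lower()) for word in KEYWORDS}


print(keyword_code("I like cats"))
print(keyword_flags("The cat chased the dog"))
```

With a sentence that mentions both a cat and a dog, `keyword_code` keeps only the code of the last matching keyword, which is why the one-flag-per-word layout is safer for such data.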
Apparently you have to open a syntax window and enter this command:
COMPUTE newvar=CHAR.INDEX(UPCASE(VAR1),"ABCD")>0.
newvar is the name of the new variable.
VAR1 is the name of the variable to be searched.
ABCD is the text to be searched for. NOTE: This must be in CAPITAL letters.
newvar will receive a value of 1 if the text is found.