I have a column with a lot of data, and I was wondering how I can find entries using keywords (listed below).
*correction, *change, *legal, *property, terms, vesting, payment, date, interest, corr, amended, chg, address, exhiibit, add, notary, complete,amnt,
I need to find these in no particular form, and I was wondering whether I can get just a "Yes" or "No" type of answer.
I prefer formula-only answers when they aren't unwieldy. All the formulas in this answer are array formulas and thus need CTRL + SHIFT + ENTER when exiting the cell. Assuming keywords are in a range named KywrdTbl and the cell you want to search inside for keywords is A1, either of the following will determine whether A1 contains any keywords (1 for Yes; 0 for No):
=MAX(IFERROR(SIGN(FIND(KywrdTbl,$A1)),0))
=MAX(IF(ISERROR(FIND(KywrdTbl,$A1)),0,1))
To convert the output to Yes and No:
=IF(MAX(IFERROR(SIGN(FIND(KywrdTbl,$A1)),0))=1,"Yes","No")
And this will return the number of keywords found:
=SUM(IFERROR(SIGN(FIND(KywrdTbl,$A1)),0))
So let's break down these formulas. The array aspect of the formula allows us to check each keyword independently (and then SUM or MAX their results). FIND returns an array of the character indices of each keyword in A1. SIGN then reduces each of those indices to 1 (since indices can't be negative). Then IFERROR swaps in a 0 for any cases where the keyword was not found. Finally we SUM up the number of 1's or take the MAX to find out if there are any 1's. The alternative in the first code block just replaces IFERROR(SIGN(...),0) with IF(ISERROR(...),0,1) thus explicitly assigning 1 to all found indices as opposed to using the SIGN trick.
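For anyone who wants to sanity-check the logic outside Excel, here is a minimal Python sketch of the same idea: one 1/0 flag per keyword, then MAX or SUM over the flags. The function names and the sample text are made up for illustration and are not part of the formulas above. Like FIND, the comparison is case-sensitive.

```python
def keyword_hits(text, keywords):
    """Return one 1/0 flag per keyword (1 if the keyword occurs in text)."""
    # str.find returns -1 when the keyword is absent, where Excel's FIND
    # returns an error; either way the "not found" case becomes 0.
    return [1 if text.find(kw) >= 0 else 0 for kw in keywords]


def contains_any(text, keywords):
    """Counterpart of MAX(...): 1 if at least one keyword is found, else 0."""
    return max(keyword_hits(text, keywords), default=0)


def count_found(text, keywords):
    """Counterpart of SUM(...): how many of the keywords are found."""
    return sum(keyword_hits(text, keywords))


# Hypothetical sample data, not taken from the question
keywords = ["legal", "vesting", "notary"]
print("Yes" if contains_any("see legal notary page", keywords) else "No")
print(count_found("see legal notary page", keywords))
```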
Short of an ugly formula, you could use a VBA UDF for this.
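Function containsWord(testCell As Range, wordList As String) As Boolean
    Dim testWord As Variant
    Dim word As String
    'loop through incoming list of words
    For Each testWord In Split(wordList, ",")
        'trim spaces left over from "a, b, c" style lists
        word = Trim(testWord)
        'skip empty entries (e.g. a trailing comma): InStr finds "" at position 1
        If Len(word) > 0 Then
            'test to see if word is in the testCell (case-insensitive)
            If InStr(1, LCase(testCell.Value), LCase(word)) > 0 Then
                'if it is set the return to true and exit
                containsWord = True
                Exit Function
            End If
        End If
    Next testWord
End Function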
To use this just create a new module and drop this code in, then you can use this function in a cell. If the cell you are testing for a word is A1 and the list of words is in B1 (comma separated) then:
=containsWord(A1,B1)
I have an Excel sheet with values shown as below:
I'm trying to separate the numeric values given in the first column to columns a,b and c so that the final output should look like
Now, I can get the value in the a column using the formula
=LEFT(A1,FIND("x",A1)-1)
But I'm struggling to get the other values (column b and c)
Any help will be appreciated :)
Another formula option
Assume data housed in A2:A5
To find "a", in B2 formula copied down :
=LEFT(A2,FIND("x",A2)-1)
To find "b", in C2 formula copied down :
=LOOKUP(9^9,0+MID($A2,FIND("(",$A2)+1,ROW($1:$9)))
To find "c", in D2 formula copied down :
=LOOKUP(9^9,0+MID($A2,FIND("x",$A2,FIND("(",A2))+1,ROW($1:$9)))
You could probably cook up something to do this using Excel's built-in functionalities, but it'd be long, difficult to grasp and prone to errors if the input format ever changes. Instead, I would probably use VBA to create a simple custom formula that can extract the numbers using a regular expression.
Press Alt + F11 to open the Visual Basic Editor
Go to Tools > References and check Microsoft VBScript Regular Expressions 5.5
Choose Insert > Module
Enter the following code:
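Function FINDNUMBERS(sInput As String, Optional iIndex As Integer = 0) As String
    Dim regEx As New RegExp
    Dim Matches As MatchCollection
    With regEx
        .Global = True          'find every run of digits, not just the first
        .Pattern = "[0-9]+"
    End With
    Set Matches = regEx.Execute(sInput)
    'return an empty string when there is no match at the requested index
    If iIndex >= 0 And iIndex < Matches.Count Then
        FINDNUMBERS = Matches(iIndex).Value
    End If
End Function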
You can then call the function in your worksheet like this:
=FINDNUMBERS(A1;0)
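Where the first parameter is the cell you'd like to get the numbers from, and the second parameter is the position of the number you would like. Enter 0 for the first number found, 1 for the second number, etc. The ; argument separator depends on your regional settings; if your Excel uses commas, write =FINDNUMBERS(A1,0) instead.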
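If you want to try the regular-expression idea before setting up the VBA module, the same logic can be sketched in Python. The function name find_numbers and the sample string are assumptions for illustration (the question's actual data was shown in an image). Returning an empty string for a missing index is my choice here, not something taken from the VBA above.

```python
import re


def find_numbers(text, index=0):
    """Return the index-th run of digits in text as a string, or "" if absent."""
    matches = re.findall(r"[0-9]+", text)  # same pattern as the VBA function
    return matches[index] if 0 <= index < len(matches) else ""


# Hypothetical input in an assumed "a x (b x c unit)" layout
sample = "12x(34x56 mm)"
print(find_numbers(sample, 0), find_numbers(sample, 1), find_numbers(sample, 2))
```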
To find b, try
=--MID(A2,FIND("(",A2)+1,FIND("x",SUBSTITUTE(A2,"x(","[["))-FIND("(",A2)-1)
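The logic is to use the FIND function to locate the second x. SUBSTITUTE first replaces the first x together with the parenthesis that follows it ("x(") with "[[", which keeps the string the same length, so FIND skips it and returns the position of the second x. That position is the ending point of value b.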
Use the FIND function again to find the position of ( then you will know the starting point of value b, the difference between the two will be the length of value b,
then you can use MID function to return value b from the string. The starting point of the MID function is determined by the position of (.
Double minus signs -- in front of the formula is used to turn the value into a numeric value. It is working in the same way as NUMBERVALUE function. This is optional if you do not need to show the result as number.
To find c, try
=--MID(A2,FIND("x",SUBSTITUTE(A2,"x(","[["))+1,FIND(" ",A2)-FIND("x",SUBSTITUTE(A2,"x(","[[")))
The logic is similar to the previous one. Use the FIND function to find the position of the space " " and the position of the second x. The difference between the two positions is the length of value c plus the trailing space, which the -- conversion ignores,
then use MID function to return value c from the string. The starting point of the MID function is determined by the position of the second x.
Replace A2 to suit your case.
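The position arithmetic above can be checked with a short Python sketch. It assumes the strings look like "12x(34x56 mm)", which is only my guess at the layout from the formulas, since the sample data itself is not visible. The name split_dimensions is made up. One small difference: the sketch looks for the space after the second x, while the formula takes the first space anywhere in the cell.

```python
def split_dimensions(s):
    """Split a string in the assumed layout 'a x ( b x c <space> ...' into ints."""
    first_x = s.index("x")             # a ends just before the first x
    paren = s.index("(")               # b starts right after "("
    second_x = s.index("x", paren)     # the x after "(" separates b and c
    space = s.index(" ", second_x)     # c ends just before the space
    a = int(s[:first_x])
    b = int(s[paren + 1:second_x])
    c = int(s[second_x + 1:space])
    return a, b, c


# Hypothetical input, not from the question
print(split_dimensions("12x(34x56 mm)"))
```

A string that lacks the "(", the second x or the space raises ValueError here, much as the formulas would return #VALUE!.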
I have this function:
MATCH(1,(PositionParameter[[#All],[Position Revised]]=$C94)asterisk(PositionParameter[[#All],[Campus Type Short]]=G$3)asterisk(PositionParameter[[#All],[Campus Num Arbitrary]]=G$1),0))
and I can't figure out what it does. I don't know what the asterisks are for. PositionParameter is the name of the worksheet, Position Revised is the name of a column, Campus Type Short is the name of a column, and Campus Num Arbitrary is the name of a column. There is supposed to be an asterisk between the first PositionParameter() and the second PositionParameter(), and another asterisk between the second PositionParameter() and the third PositionParameter(), but the text between them gets rendered as italic, so I took the asterisks out and spelled them out. The tooltip tells me this is supposed to return some sort of array, but I can't figure out its components. Can someone explain the asterisks to me? I would appreciate it.
Thanks,
Howard Hong
Your formula returns a single value - the relative position of the first row in the data where all three conditions are met.
It works like this:
Each of these three conditional statements:
PositionParameter[[#All],[Position Revised]]=$C94
PositionParameter[[#All],[Campus Type Short]]=G$3
PositionParameter[[#All],[Campus Num Arbitrary]]=G$1
.....returns an array of TRUE/FALSE values. Multiplying these three arrays together produces a single array of 1/0 values, 1 when all conditions are met in a row, 0 otherwise. This array forms the "lookup array" of the MATCH function
The "lookup value" is 1 so that value is looked up in the lookup array and the result of the MATCH function is the position of the first 1, which corresponds to the first row where all conditions are satisfied.
If there are no rows which meet all three conditions then the result is #N/A
Note that the zero at the end is the third parameter of the MATCH function - zero means that an exact match must be found.
This is an "array formula" which needs to be confirmed with CTRL+SHIFT+ENTER
Often you would use this in conjunction with INDEX function to return a value from another column in the first row where conditions are satisfied, e.g. using normal cell references
=INDEX(A:A,MATCH(1,(B:B="x")*(C:C="y"),0))
That formula will return the value from column A in the first row where the two specified conditions are met (col B = "x" and col C = "y")
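To make the multiplication step concrete, here is a small Python sketch of the same mechanism: each condition gives a TRUE/FALSE per row, the product of those is 1 only when every condition holds, and the result is the 1-based position of the first 1. The function name and the sample columns are invented for illustration. Note that Python's == is case-sensitive for text, whereas Excel's = is not.

```python
def match_all(lookup_columns, wanted):
    """1-based position of the first row where every column equals its
    wanted value, or None when no row qualifies (Excel would show #N/A)."""
    for position, row in enumerate(zip(*lookup_columns), start=1):
        product = 1
        for cell, target in zip(row, wanted):
            product *= int(cell == target)   # TRUE -> 1, FALSE -> 0
        if product == 1:
            return position
    return None


# Hypothetical data standing in for columns A, B and C
col_a = ["r1", "r2", "r3", "r4"]
col_b = ["x", "x", "q", "x"]
col_c = ["n", "y", "y", "y"]

position = match_all([col_b, col_c], ["x", "y"])
print(position, col_a[position - 1])   # INDEX(A:A, MATCH(...)) counterpart
```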
Well, asterisk could be a multiplication symbol or it could be a wildcard in Match. By the looks of the placement, I'd say it's multiplying data from an array or table.
And, um... I don't know what the asterisks are for but I took the asterisk out and spelled it out? Why would you do that? Was it working before you changed it? Where did you find this formula?
Please read how to create a Minimal, Complete, and Verifiable Example (MCVE). Without sample data or other information about the purpose of the formula, I will take a wild guess:
Paste this into the cell:
=MATCH(1,(PositionParameter[[#All],[Position Revised]]=$C94)*(PositionParameter[[#All],[Campus Type Short]]=G$3)*(PositionParameter[[#All],[Campus Num Arbitrary]]=G$1),0)
. . . and assuming it's supposed to be an array, instead of hitting Enter on that cell:
hit: Ctrl+Shift+Enter to create an array formula.
Besides the link above, here is some other reading & practice for you:
Create an array formula
MATCH function
I think certain applications replace certain symbols (those that aren't allowed in the application) with words when copying and pasting from Excel into them, but without more information, I can't say for sure what happened.
Assuming that the * are real and that the formula is entered as an array formula, the multiplied conditions produce an array of 0s and 1s.
The formula is looking for Position Revised = C94 AND Campus Type Short = G3 AND Campus Num Arbitrary = G1
That array holds a 1 for each row that matches all these conditions and a 0 for each row that does not; MATCH then returns the position of the first 1.
If no rows match the conditions it will return #N/A
The strings are mixed alphas and digits, but there is always a set of digits at the end of the string.
Any leading digits or digits in the middle of the string should be ignored. I came up with:
Public Function trailing(S As String) As Long
Dim r As String
Dim i As Long
For i = Len(S) To 1 Step -1
If IsNumeric(Mid(S, i, 1)) Then
r = Mid(S, i, 1) & r
Else
Exit For
End If
Next i
trailing = CLng(r)
End Function
It seems to work:
However, the user is working in an .xlsx workbook and can't use UDFs. Is there a formula that gets the same results? Thanks in advance.
In cell B1 and copied down:
=--RIGHT(A1,LOOKUP(2,1/(ISNUMBER(--RIGHT(A1,ROW($1:$15)))),ROW($1:$15)))
Simply make sure that the 15 in $1:$15 is going to be larger than the maximum possible number of ending digits for any given string. No array formula entry necessary, and no helper columns necessary.
Alternate version so you don't have to repeat ROW($1:$15):
=--RIGHT(A1,MATCH(TRUE,INDEX(ISERROR(--RIGHT(A1,ROW($1:$15))),),0)-1)
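If you want expected values to compare the formula results against, the "walk back from the end while the characters are digits" idea from the question's UDF can be sketched in a few lines of Python. The function name and the test strings are made up. Unlike the formulas, this has no 15-digit window and only treats 0-9 as digits, so strings ending in something like "1E5" or "12.1" are read as 5 and 1 respectively.

```python
def trailing_number(s):
    """Return the run of ASCII digits at the end of s as an int,
    or None if s does not end in a digit."""
    i = len(s)
    # move left while the character just before position i is a digit
    while i > 0 and s[i - 1] in "0123456789":
        i -= 1
    digits = s[i:]
    return int(digits) if digits else None


# Hypothetical inputs: leading and middle digits are ignored
print(trailing_number("ab12cd345"))
print(trailing_number("7x0042"))
```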
Alright, so bear with me... this is ugly, and there may be room to clean it up (or a better approach altogether using the same concepts). You already know this is a much better candidate for RegEx/VBA, and we agree that would be the better approach, but under the restrictions of your question, here we go...
Since you said the length varies between 2 and 12, we can make a set of array formulas in 12 columns, B-M which extracts the RIGHT() of 1, 2, ..., 12 characters accordingly. We wrap this with a VALUE to return an error #VALUE! for non-numerics -- NOTE: This formula is entered by first selecting cells B2:M2, typing the formula, =VALUE(RIGHT(A2, {1,2,3,4,5,6,7,8,9,10,11,12})) and then the obligatory CTRL + SHIFT + ENTER
We then find the position of the first error in COLUMN O with a MATCH() formula looking for the first TRUE of another array formula utilizing ISERROR().
We get the solution by doing an index of the position 1-12 columns returning the column next to the first error it finds. This is because you know that column A will always end in a number.
N.B. Since you specified that it is only a 4-digit numeric string at the end, you can probably get away with reducing this method to the first 4 columns; that is, rather than utilizing 12 columns, just use columns B:E and continue as above.
We end by begging the client/customer to consider a .xlsm workbook/VBA/RegEx or other solution...
Another possible solution:
=AGGREGATE(14,6,--RIGHT(A1,ROW(INDIRECT("1:"&LEN(A1)))),1)
In some cases it could give wrong results:
aaa12.1
bc1E+13
As a workaround you can use SUBSTITUTE to replace E, e and . (maybe also , and space) in input string with another letter.
Say I have 2 words, A1: ddC and A2: DDC.
I want to convert these 2 words into unique codes so that I can do a case-sensitive VLOOKUP.
So I tried =CODE(A1) and it returned 100, but =CODE("dady") also returns 100, because CODE() only looks at the first character of the word.
I want to convert a word to a unique code (could be ASCII code or any form of unique code).
So how do I do that without using VBA?
As this is a hash, it would be possible for some strings to end up with the same value, but it would be unlikely.
Note that the row function uses 1:255 to generate a list of numbers from 1 to 255 - change this number if your strings end up longer.
=A1&SUMPRODUCT(IF(IFERROR(CODE(MID(A1,ROW($1:$255),1)),0)>96,1,0),POWER(2,ROW($1:$255)))
This has to be entered as an array formula using CTRL+SHIFT+ENTER - you will see {} around the formula if you have successfully done that.
This will produce a decimal representation of the upper and lower case letters, and this is then appended to the word itself - this will guarantee uniqueness, as the only way to have a word and number match is to have the same word and case, which means it was a duplicate in the first place.
With this, ddC = ddC & 1*2 + 1*4 + 0*8 = ddC6
DDC = DDC & 0*2 + 0*4 + 0*8 = DDC0
"ddC " (ddC with a space after it) = "ddC " & 1*2 + 1*4 + 0*8 + 0*16 = "ddC 6"
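The arithmetic in these examples can be reproduced with a short Python sketch of the same scheme: position n contributes 2^n when the character's code is above 96, and the total is appended to the word. The name case_code is made up, and ord() is used as a stand-in for CODE(), which I am assuming agrees for plain ASCII text. The formula's 255-character limit does not apply here.

```python
def case_code(word):
    """Append a bitmask of 'code > 96' positions (mostly lower-case letters)
    to the word, mirroring the SUMPRODUCT formula's arithmetic."""
    mask = sum(2 ** pos for pos, ch in enumerate(word, start=1) if ord(ch) > 96)
    return f"{word}{mask}"


for sample in ["ddC", "DDC", "ddC "]:
    print(repr(case_code(sample)))
```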
WARNING: This is not a solution to the titled question
"How to convert a word to a Unique Code in Excel using Formula without using VBA?" Instead, it is a solution to what I believe is the underlying problem, since the original question states "so that i can do the Case Sensitive Vlookup." It accomplishes a case-sensitive VLOOKUP without the need to convert the values first.
As an alternative to converting all the values and then doing a lookup on the converted values, you could use the INDEX and MATCH functions in an array-entered formula and look up the values directly:
=INDEX(A1:A14,MATCH(TRUE,EXACT(A1:A14,"ddC"),0))
This will return the value in A1:A14 at the index of an exact (case-sensitive) match to ddC in A1:A14. You can very easily modify this into a lookup of other columns.
Explanation:
Start with getting an array of all exact matches in your look up list to your look up value:
So if I enter this formula:
=EXACT(A1:A14,"ddC")
Then go into the formula bar and press F9 it will show me an array of true false values, relating to each cell in the range A1:A14 that are an Exact match to my expression "ddC":
Now we take this Boolean array and use the MATCH function to return the relative position of TRUE in the array:
=MATCH(TRUE,EXACT(A1:A14,"ddC"),0)
But remember we need to enter this by pressing Ctrl + Shift + Enter because we need the EXACT(A1:A14,"ddC") portion of the formula to be returned as an array.
Now that we have the position of the TRUE in the array (in this case 6), we can use that to retrieve the corresponding value in any column, as long as it is related and the same size. So if we want to return the value of the exact match in the original lookup column (relatively useless in this situation, but I will continue for demonstration), we just wrap the last formula in an INDEX function:
=INDEX(A1:A14,MATCH(TRUE,EXACT(A1:A14,"ddC"),0))
But remember we need to enter this by pressing Ctrl + Shift + Enter because we need the EXACT(A1:A14,"ddC") portion of the formula to be returned as an array.
Now we can apply that same concept to a larger range for more useful look up function:
But remember we need to enter this by pressing Ctrl + Shift + Enter because we need the EXACT(A1:A14,"ddC") portion of the formula to be returned as an array.
Now notice in this last step I offered 2 formulas:
=INDEX(A1:B14,MATCH(TRUE,EXACT(A1:A14,D2),0),2)
And
=INDEX(B1:B14,MATCH(TRUE,EXACT(A1:A14,D2),0))
The first returns the value in the range A1:B14 in the Second column at the position of the exact match in A1:A14 to the value in D2 (in this case "dady")
The second returns the value in the range B1:B14 at the position of the exact match in A1:A14 to the value in D2 (in this case "dady")
Hopefully someone else can add more input, but as far as I know the second might perform better, as it has a smaller index range and doesn't require going to the specified column; it is also shorter.
The first is, to me, much easier to read (more of a preference, I think), because you know that you're looking at a lookup table that spans 2 columns and that you are returning the value in the second column.
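For comparison, the whole INDEX/MATCH/EXACT lookup boils down to "find the first key that matches exactly, then return the value at the same position". A minimal Python sketch of that follows. The function name and the two sample columns are invented, since the original screenshot is not available. Python's == on strings is case-sensitive, which plays the role of EXACT here.

```python
def exact_lookup(keys, values, wanted):
    """Return the value paired with the first key that matches wanted
    exactly (case-sensitive), or None when nothing matches."""
    for key, value in zip(keys, values):
        if key == wanted:          # same role as EXACT(key, wanted)
            return value
    return None


# Hypothetical stand-ins for columns A and B
keys = ["DDC", "ddC", "Dady", "dady"]
values = ["first", "second", "third", "fourth"]
print(exact_lookup(keys, values, "dady"))
```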
Notes: I am not sure if this solution will be better in practice than converting the original values in the first place. Converting all the values once and then hard-coding the converted values requires no additional formula or calculation once finished (if the formulas are afterwards replaced with values), while this method will recalculate and is also array-entered. But I feel that if the asker is doing a single lookup against a changing lookup list (one that would constantly require all values to be converted using an array formula), this option lets you replace the formula per word with one single formula.
all in all I hope this solves your original problem,
Cheers!!
if all your strings like the one you pointed above try something like this:
= CONCATENATE(Code(A1) , Code(Mid(A1,2,1)) , Code(Mid(A1,3,1)))
In order to account for capital letters you're going to end up with a VERY long formula, especially if you have long word entries. Without VBA I would approach it this way and set up the formula once to allow for the biggest word you anticipate, and then copy it around as needed.
Formula (to expand):
=CONCATENATE(IF(EXACT(MID(A1,1,1),UPPER(MID(A1,1,1)))=TRUE,"b","s")&CODE(MID(A1,1,1)),IF(EXACT(MID(A1,2,1),UPPER(MID(A1,2,1)))=TRUE,"b","s")&CODE(MID(A1,2,1)),IF(EXACT(MID(A1,3,1),UPPER(MID(A1,3,1)))=TRUE,"b","s")&CODE(MID(A1,3,1)), . . . )
You can substitute the "b" and "s" with whatever you like. I was just using those for a case check for capital versus lowercase letters (b=big, s=small) and building that into your unique code. Note that the check has to compare each character with its own upper-case version (MID(A1,n,1)), not the whole word, or every position gets the same letter.
In order to expand this, add additional cases to account for the length of the words you are using by adding this snippet JUST inside the last parenthesis and changing every "3" in the MID() calls to "4", "5", "6", etc.:
IF(EXACT(MID(A1,3,1),UPPER(MID(A1,3,1)))=TRUE,"b","s")&CODE(MID(A1,3,1))
Painful, yes, but it should work.
Cheers.
I have a column B whose length I need to test to see if it is longer than 200 characters. If it is, I need to go from right to left, find the last occurrence of the semicolon ";" and split everything to the right of that semicolon into column C. Can this be done? Before, I had to do this with 4 columns and have now reduced it to one column. Please advise on the best formula to do this.
=IF(LEN(B1)>200,MID(B1,SEARCH("#",SUBSTITUTE(B1,";","#",LEN(B1)-LEN(SUBSTITUTE(B1,";",""))))+1,LEN(B1)),"")
Explanation:
1. Remove all instances of the delimiter: SUBSTITUTE(B1,";","")
2. Subtract the length of [1] from the length of the entire string to get the number of occurrences of the delimiter: LEN(B1)-LEN([1])
3. Substitute the last occurrence of the delimiter with an #: SUBSTITUTE(B1,";","#",[2])
4. Find the location of the #: SEARCH("#",[3])
5. Get the substring of everything to the right of the # location: MID(B1, [4] +1,LEN(B1))
6. Add if condition to only process strings of length > 200: =IF(LEN(B1)>200,[5],"")
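The steps above can be traced in a small Python sketch. It counts the delimiters with the same length-difference trick, then walks to the last one instead of swapping in a # marker. The function name, the min_length parameter and the sample strings are assumptions for illustration. Returning "" when there is no delimiter is my choice; I would expect the Excel formula to return an error in that case, because SUBSTITUTE would be asked for occurrence 0.

```python
def after_last_delimiter(text, delimiter=";", min_length=200):
    """Return everything after the last delimiter, but only for texts
    longer than min_length; otherwise return ""."""
    if len(text) <= min_length:                                # step 6
        return ""
    # Steps 1-2: count occurrences (assumes a single-character delimiter)
    count = len(text) - len(text.replace(delimiter, ""))
    if count == 0:
        return ""                                              # nothing to split on
    # Steps 3-4: locate the count-th, i.e. last, occurrence
    position = -1
    for _ in range(count):
        position = text.index(delimiter, position + 1)
    # Step 5: everything to the right of it
    return text[position + 1:]


# Hypothetical data: a long field ending in ";tail"
print(after_last_delimiter("x" * 201 + ";keep;tail"))
```

One thing this sidesteps: with the # marker approach, a # that already appears earlier in the cell would be found first by SEARCH, so a marker character that cannot occur in the data is the safer pick.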
I searched the web for formulas for this and concluded that you either need a lot of nesting and a difficult to follow formula or a VBA function. I would suggest using a VBA function such as the one I have written below (FindLast) within a simple formula. Let me know if you need instructions on how to create this VBA formula:
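Function FindLast(find_text As String, within_text As Range) As Long
    Dim i As Long
    ' start at last character and work back
    For i = Len(within_text.Value) To 1 Step -1
        If Mid(within_text.Value, i, 1) = find_text Then Exit For
    Next i
    ' i is 0 here when find_text never occurs (the original loop would
    ' have raised a run-time error once i reached 0)
    FindLast = i
End Function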
You will then be able to use FindLast within a formula such as the below in C1:
=IF(LEN(B1)>200,MID(B1,FindLast(";",B1)+1,500),"")
UPDATE
The 500 in the above is just a large number I picked to mean the rest of the cell. If there may be more than 500 characters after the final ; then use a larger number, or pass LEN(B1) as the length, since MID simply stops at the end of the text. MID has no argument that means "the rest of the cell" directly. I have set it to return nothing if B1 is not >200 characters; let me know if this is not the requirement.