Microsoft Excel Conditional String Output Formula - excel

I'm trying to create a formula that takes in data formatted thusly:
Ex.
A1 Excel Data --> (Useless Data - 'example')
Then, Grab the data after the 'hyphen' -> MID(A1,FIND("-",A1)+1,1)
Finally, look at the first letter of the selected data and sort it into four Categories -> IF(MID(A1,FIND("-",A1)+1,1)="e""E","Example"
Notes:
Needs to search for both upper and lower-case lettering.
If none of the statements are 'true' it should default to category 'other'
Ideally a cool way to remove formatting such as '(', ' ', ', ' would be cool.
Function so Far:
=IF(MID(A1,FIND("-",A1)+1,1)="a""A","Amex",IF(MID(A1,FIND("-",A1)+1,1)="C""c","Citi Bank",IF(MID(A1,FIND("-",A1)+1,1)="W""w","Wells Fargo", "Other")))

This function will search for both upper and lower case, and will default to other.
=IF(ISNUMBER(SEARCH("a",MID(A1,FIND("-",A1)+1,1))),"Amex",IF(ISNUMBER(SEARCH("c",MID(A1,FIND("-",A1)+1,1))),"Citi Bank",IF(ISNUMBER(SEARCH("w",MID(A1,FIND("-",A1)+1,1))),"Wells Fargo", "Other")))
However, please strongly consider putting a list together that you can reference using an INDEX and MATCH. Consider the following in columns D and E:
D E
-------
a Amex
c Citi Bank
w Wells Fargo
Then just use this formula:
=IFERROR(INDEX(E:E,MATCH(LOWER(MID(A1,FIND("-",A1)+1,1)),D:D,0)),"Other")
If you do need to remove formatting you can look into the SUBSTITUTE formula

Related

Index and Match multiple matches

I need help with the following query. There are 2 excel sheet and I need to find out in Sheet 1 in Column A what are the different accounts matching, the refernce is Sheet 2.
I am looking for a formula, which can give me all the account in sheet 1 in corressponds to the position nr. The anwser is in sheet 2. Can someone please help?
eg. 5001 = should give me 41150100, 41150101, 41200000
Position
Account
5001
5031
5051
Account
Position
41150100
5001
41150101
5001
78589545
5051
I am looking for a formula, which can give me all the account in sheet 1 in corressponds to the position nr. The anwser is in sheet 2. Can someone please help?
eg. 5001 = should give me 41150100, 41150101, 41200000
Assuming no Excel version constraints as per the tags listed in the question, you can try the following (formula 1):
=LET(pos, A2:A4, accnt, B2:B4, REDUCE({"Account","Position"}, pos, LAMBDA(ac,p,
VSTACK(ac,LET(f,TEXTSPLIT(#FILTER(accnt,pos=p),,","), HSTACK(f, IF(f=f, p)))))))
Here is the output:
Notes:
You would need to clean up your input because in some cases the delimiter is just a comma and in other cases, a space is added.
If the question refers to doing it backward, as #ScottCraner suggested in the comments, then assuming the output from the previous screenshot is now the input, then we have (formula 2):
=LET(acc, D2:D8, pos, E2:E8, ux, UNIQUE(pos), out, MAP(ux, LAMBDA(p,
TEXTJOIN(",",,FILTER(acc, pos=p)))), HSTACK(ux, out))
formula 1: Uses the REDUCE/VSTACK pattern, check my answer to the question: how to transform a table in Excel from vertical to horizontal but with different length for more details on how to use it. In this case, we use the header to initiate the accumulator.
TEXTSPLIT is used to split the account information by , into rows. We use implicit intersection (#) to convert the FILTER output (array of one element only) into a single string to be able to use TEXTSPLIT, otherwise, it returns the first element only.
We use the condition IF(f=f, p) to generate a constant array with the position value (p). HSTACK is used to generate the output on each iteration in the format we want (first account, then position).
A more verbose formula, but maybe easier to understand, since it doesn't use the VSTACK/REDUCE pattern, could be the following:
=LET(pos, A2:A4, accnt, B2:B4, split, TEXTSPLIT(TEXTJOIN(";",,accnt), ",",";"),
mult, MMULT(1-ISNA(split), SEQUENCE(COLUMNS(split),,1,0)),
outP, TOCOL(TEXTSPLIT(TEXTJOIN(";",,REPT(pos&",",mult)),",",";",1),2),
HSTACK(TOCOL(split,2), outP))
The main idea is to use TOCOL. The name split, generates the array with the account information. The name mult, calculates the number of columns with values. Now we know how many times we need to repeat position values. We use REPT to repeat the value and generate an array via TEXTSPLIT.

Formula to look for a word within a sentence

Here is the Sample Google sheet file
https://docs.google.com/spreadsheets/d/1B0CQyFeqxg2wgYHJpFxLIzw_8Pv067p0cwacWk0Nc4o/edit?usp=sharing
I have an Excel Sheet where I need to find Arabic Words and separate them.
For example, I have data like this:
//olyservice/GIS-TANSIQ01/Storage/46-أمانة منطقة عسير -بلدية بللحمر/حدود القري المطلوب اعتمادهاالمعتمد مسمايتها بالوزارة.rar
I'm looking for:
1st column: أمانة منطقة عسير
2nd column: بلدية بللحمر
3rd column: RAR
If there is no أمانة and بلدية words, the columns should be blank.
I tried these methods, without success:
=RIGHT(MID(A2,FIND("-",A2,20)+1,255),25)
and
=TRIM(MID(SUBSTITUTE(A2,"",REPT(" ",99)),MAX(1,FIND("-",SUBSTITUTE(A2,"",REPT(" ",99)))+21),99))
Since you specify certain key words to be found, we can look for those key words and then the relevant delimiter, based on your example.
In your example, أمانة is followed by the dash, and بلدية by the slash. (followed by is in terms of the right-to-left orientation of Arabic words).
Try this:
Col1: =MID(A1,FIND("أمانة",A1),FIND(CHAR(1),SUBSTITUTE(A1,"-",CHAR(1),LEN(A1) - LEN(SUBSTITUTE(A1,"-",""))))-FIND("أمانة",A1))
Col2: =MID(A1,FIND("بلدية",A1),FIND(CHAR(1),SUBSTITUTE(A1,"/",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))))-FIND("بلدية",A1))
Col3: =TRIM(RIGHT(SUBSTITUTE(A1,".",REPT(" ",99)),99))
If the keywords are not found, the formula will return an Error. So you can just "wrap" the formula in IFERROR to have it return a blank if the key words are not present.
Edit:
The actual workbook does not have the same pattern as the sample you posted. In particular. Try this for column 2 data:
=MID(A2,FIND("بلدية",A2),99)
or with error suppression:
Col1: =IFERROR(MID(A2,FIND("أمانة",A2),FIND("-",A2,FIND("أمانة",A2))-FIND("أمانة",A2)),"")
Col2: =IFERROR(MID(A2,FIND("بلدية",A2),99),"")
And, the cells that are still returning the #VALUE! error do not have that keyword in the line.
For example:
A6: //olyservice/GIS-TANSIQ01/Storage/103-أمانة منطقة عسير -أحد رفيدة
does not contain بلدية
BTW, those formulas seem to both work on Sheets also.
Edit2:
Since you also posted an example in Sheets, if you can implement this in Sheets, you can use Regular Expressions to account for multiple terminations.
In that case, you would use:
=iferror(REGEXEXTRACT(A2,"(أمانة.*?)\s*(?:[-/\\.]|$)"),"")
or
iferror(REGEXEXTRACT(A2,"(بلدية.*?)\s*(?:[-/\\.\w]|$)"),"")
for the columns.
The regex extracts the pattern that begins with the key phrase, up to the terminator which can be any character in the set of -/\.A-Za-z0-9 or the end of the line. That seems to cover the examples in your sample worksheet, but if there are other terminators, you can add them to the sequence.
In Excel, this would require a VBA UDF to implement the Regex engine.

Using tbl.Lookup to match just part of a column value

This question relates to the Schematiq add-in for Microsoft Excel.
Using =tbl.Lookup(table, columnsToSearch, valuesToFind, resultColumn, [defaultValue]) the values in the valuesToFind column have a consistent 3 characters to the left and then varying characters after (e.g. 908-123456 or 908-321654 - i.e. 908 is always consistent)
How can I tell the function to lookup the value based on the first 3 characters only? The expected answer should be the sum of the results of the above, i.e. 500 + 300 = 800
tbl.Lookup() works by looking for an exact match - this helps ensure it's fast but in this case it means you need an extra step to calculate a column of lookup values, something like this:
A2: =tbl.CalculateColumn(A1, "code", "x => LEFT(x, 3)", "startOfCode")
This will give you a new column that you can use for the columnsToSearch argument, however tbl.Lookup() also looks for just one match - it doesn't know how to combine values together if there is more than one matching row in the table, so I think you also need one more step to group your table by the first 3 chars of the code, like this:
A3: =tbl.Group(A2, "startOfCode", "amount")
Because tbl.Group() adds values together by default, this will give you a table with a row for each distinct value of startOfCode and the subtotal of amount for each of those values. Finally, you can do the lookup exactly as you requested, which for your input table will return 800:
A4: =tbl.Lookup(A3, "startOfCode", "908", "amount")

Copying all #mentions and #hashtags from column A to Columns B and C in Excel

I have a really large database of tweets. Most of the tweets have multiple #hashtags and #mentions. I want all the #hashtags separated with a space in one column and all the #mentions in another column. I already know how to extract the first occurrence of a #hashtag and a #mention. But I don't know to get them all? Some of the tweets have as much as 8 #hashtags. Manually going through the tweets and copy/pasting the #hashtags and #mentions seem an impossible task for over 5,000 tweets.
Here is an example of what I want. I have Column A and I want a macro that would populate columns B and C. (I'm on Windows &, Excel 2010)
Column A
-----------
Dear #DavidStern, #spurs put a quality team on the floor and should have beat the #heat. Leave #Pop alone. #Spurs a classy organization.
Live broadcast from #Nacho_xtreme: "Papelucho Radio"http://mixlr.com nachoxtreme-radio … #mixlr #pop #dance
"Since You Left" by #EmilNow now playing on KGUP 106.5FM. Listen now on http://www.kgup1065.com  #Pop #Rock
Family Night #battleofthegenerations Dad has the #Monkeys Mom has #DonnieOsman #michaelbuble for me #Dubstep for the boys#Pop for sissy
#McKinzeepowell #m0ore21 I love that the PNW and the Midwest are on the same page!! #Pop
I want Column B to look like This:
Column B
--------
#DavidStern #Pop #Spurs
#mixlr #pop #dance
#Pop #Rock
#battleofthegenerations #Monkeys #DonnieOsman #Dubstep #Pop
#pop
And Column C to look like this:
Column C:
----------
#spurs #heat
#Nacho_xtreme
#EmilNow
#michaelbuble
#McKinzeepowell #m0ore21
Consider using regular expressions.
You can use regular expressions within VBA by adding a reference to Microsoft VBScript Regular Expressions 5.5 from Tools -> References.
Here is a good starting point, with a number of useful links.
Updated
After adding a reference to the Regular Expressions library, put the following function in a VBA module:
Public Function JoinMatches(text As String, start As String)
Dim re As New RegExp, matches As MatchCollection, match As match
re.pattern = start & "\w*"
re.Global = True
Set matches = re.Execute(text)
For Each match In matches
JoinMatches = JoinMatches & " " & match.Value
Next
JoinMatches = Mid(JoinMatches, 2)
End Function
Then, in cell B1 put the following formula (for the hashtags):
=JoinMatches(A1,"#")
In column C1 put the following formula:
=JoinMatches(A1,"#")
Now you can copy just the formulas all the way down.
you could convert text to columns using the other character #, then against for #s and then concatenate the rest of the text back together for column A, if you are not familiar with regular expressions see (#Zev-Spitz)

Return a row number that matches multiple criteria in vbs excel

I need to be able to search my whole table for a row that matches multiple criteria. We use a program that outputs data in the form of a .csv file. It has rows that separate sets of data, each of these headers don't have any columns that are unique in of them self but if i searched the table for multiple values i should be able to pinpoint each header row. I know i can use Application.WorksheetFunction.Match to return a row on a single criteria but i need to search on two three or four criteria.
In pseudo-code it would be something like this:
Return row number were column A = bill & column B = Woods & column C = some other data
We need to work with arrays:
There are 2 kinds of arrays:
numeric {1,0,1,1,1,0,0,1}
boolean {TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE}
to convert between them we can use:
MATCH function
MATCH(1,{1,0,1,1,1,0,0,1},0) -> will result {TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE}
simple multiplication
{TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE}*{TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE} -> will result {1,0,1,1,1,0,0,1}
you can can check an array in the match function, entering it like in the picture below, be warned that MATCH function WILL TREAT AN ARRAY AS AN "OR" FUNCTION (one match will result in true
ie:
MATCH(1,{1,0,1,1,1,0,0,1},0)=TRUE
, YOU MUST CTR+SHIFT+ENTER !!! FOR IT TO GIVE AN ARRAY BACK!!!
in the example below i show that i want to sum the hours of all the employees except the admin per case
we have 2 options, the long simple way, the complicated fast way:
long simple way
D2=SUMPRODUCT(C2:C9,(A2=A2:A9)*("admin"<>B2:B9)) <<- SUMPRODUCT makes a multiplication
basically A1={2,3,11,3,2,4,5,6}*{0,1,1,0,0,0,0,0} (IT MUST BE A NUMERIC ARRAY TO THE RIGHT IN SUMPRODUCT!!!)
ie: A1=2*0+3*1+11*1+3*0+2*0+4*0+5*0+6*0
this causes a problem because if you drag the cell to autocomplete the rest of the cells, it will edit the lower and higher values of
ie: D9=SUMPRODUCT(C9:C16,(A9=A9:A16)*("admin"<>B9:B16)), which is out of bounds
same as the above if you have a table and want to view the results in a diferent order
the fast complicated way
D3=SUMPRODUCT(INDIRECT("c2:c9"),(A3=INDIRECT("a2:a9"))*("admin"<>INDIRECT("b2:b9")))
it's the same, except that INDIRECT was used on the cells that we want not be modified when autocompleting or table reorderings
be warned that INDIRECT sometimes give VOLATILE ERROR,i recommend not using it on a single cell or using it only once in an array
f* c* i cant post pictures :(
table is:
case emplyee hours totalHoursPerCaseWithoutAdmin
1 admin 2 14
1 him 3 14
1 her 11 14
2 him 3 5
2 her 2 5
3 you 4 10
3 admin 5 10
3 her 6 10
and for the functions to check the arrays, open the insert function button (it looks like and fx) then doubleclick MATCH and then if you enter inside the Lookup_array a value like
A2=A2:A9 for our example it will give {TRUE,TRUE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE} that is because only the first 3 lines are from case=1
Something like this?
Assuming that you data in in A1:C20
I am looking for "Bill" in A, "Woods" in B and "some other data" in C
Change as applicable
=IF(INDEX(A1:A20,MATCH("Bill",A1:A20,0),1)="Bill",IF(INDEX(B1:B20,MATCH("Woods",B1:B20,0),1)="Woods",IF(INDEX(C1:C20,MATCH("some other data",C1:C20,0),1)="some other data",MATCH("Bill",A1:A20,0),"Not Found")))
SNAPSHOT
I would use this array* formula (for three criteria):
=MATCH(1,((Range1=Criterion1)*(Range2=Criterion2)*(Range3=Criterion3)),0)
*commit with Ctrl+Shift+Enter

Resources