Querying a table that CONTAINS wildcards - excel

Short version:
Basically I want to do this, but in Excel. Instead of querying a table USING wildcards, I want to query a table that CONTAINS wildcards.
Long version:
I'm creating a spreadsheet to summarize my bank account transactions at the end of each month. I want to organise the transactions in my bank statement into categories like "groceries", "entertainment", "fuel" etc and then sum up the total amount spent on each category.
I have a sheet that acts as a database, with a list of known account names for each category (e.g. under "clothing" I have the names of the accounts of all the clothing stores I go to). I then have another sheet with the first two columns containing transactions (account name, and amount), and then a column for each category. I copy each amount in column 2 into the correct category column using the following formula:
=IF(ISNA(MATCH($B2,database!B:B,0)),"",$C2)
Where column B is the "account name" column from my bank statement, and column C contains the amounts.
This works fine as long as the data in the database worksheet is an exact match. But a lot of the account names are similar e.g. "7elevenl12345", "7eleven836549" etc. How can I add strings with wildcards like "7eleven*" to my database?
Thanks in advance.

You can use SEARCH to look for every value from the database's column B within B2. It is better to restrict the range than to use the whole column, so I'll use rows 2 to 100:
=IF(ISNUMBER(LOOKUP(2^15,SEARCH(Database!B$2:B$100,$B2))),$C2,"")
SEARCH automatically searches for a value within other text, so no wildcards are required (you should remove wildcards from the database; you only need "7ELEVEN" etc.). If one or more of the searches is a match then SEARCH returns a number and so does LOOKUP, so you can test whether it does or not.
The SEARCH function is not case-sensitive; change it to FIND if you want the match to be case-sensitive.
Explanation:
When you use
=SEARCH(Database!B$2:B$100,$B2)
That returns an "array" the same size as Database!B$2:B$100. For each value in Database!B$2:B$100 you either get a number (if that specific value is found within B2 it's the position of the start of that value) or you get #VALUE! error.
Then when you lookup a "bignum" like 2^15 in that array, i.e.
=LOOKUP(2^15,SEARCH(Database!B$2:B$100,$B2))
That returns the last number found in the array....or #N/A if there are no matches, so using ISNUMBER identifies whether there is at least one match or not.
If you want to see the whole array returned by
=SEARCH(Database!B$2:B$100,$B2)
then put that in a cell and then select that cell, press F2 to select the formula and F9 to see the whole array.
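If the array behaviour is easier to follow in code, here is a small Python sketch of the same logic. It is only an illustration: the function names search_positions and last_number and the sample keywords are made up for this example, None stands in for Excel's #VALUE! and #N/A errors, and wildcards inside the keywords are not modelled.

```python
def search_positions(keywords, text):
    """Mimic SEARCH(range, text): 1-based position of each keyword in text,
    or None where Excel would give #VALUE!. Case-insensitive, like SEARCH."""
    positions = []
    for keyword in keywords:
        index = text.lower().find(keyword.lower())  # -1 when not found
        positions.append(index + 1 if index >= 0 else None)
    return positions


def last_number(values):
    """Mimic LOOKUP(2^15, array): the last numeric entry, or None for #N/A."""
    numbers = [v for v in values if v is not None]
    return numbers[-1] if numbers else None


# Hypothetical database entries and account names
keywords = ["7eleven", "shell", "tesco"]
hit = search_positions(keywords, "7ELEVEN836549")
miss = search_positions(keywords, "cinema city")

print(hit, last_number(hit))    # [1, None, None] 1
print(miss, last_number(miss))  # [None, None, None] None
```

Testing whether last_number returns a number is the role ISNUMBER plays in the formula.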
If you have blanks in Database!B$2:B$100 then that's a problem, because a blank is always "found" in any value (at position 1). You can edit the formula to prevent that by dividing by the non-blank test, so that a blank row produces a #DIV/0! error rather than a number (multiplying would turn it into 0, which still counts as a number), i.e.
=IF(ISNUMBER(LOOKUP(2^15,SEARCH(Database!B$2:B$100,$B2)/(Database!B$2:B$100<>""))),$C2,"")
Both versions of the formula can be shortened by using COUNT in place of LOOKUP and ISNUMBER, i.e. for the latter version you can use
=IF(COUNT(SEARCH(Database!B$2:B$100,$B2)/(Database!B$2:B$100<>"")),$C2,"")
but that version needs "array entry", i.e. you need to confirm the formula with the key combination CTRL+SHIFT+ENTER so that Excel encloses the formula in curly braces, { and }.
Note: 2^15 is used here because it is guaranteed to be larger than any number that the SEARCH function can return. 2^15 = 32768, and the maximum number of characters in a cell is one fewer than that, 32767.
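The blank-cell problem is easy to reproduce outside Excel. The Python sketch below is an illustration only (the helper name has_match and the sample data are hypothetical): an empty string is "found" at the start of any text, so blank entries have to be excluded before testing whether anything matched.

```python
def has_match(keywords, text, skip_blanks=True):
    """True if any keyword occurs in text (case-insensitive).

    With skip_blanks=False an empty keyword counts as a hit, which is the
    behaviour that causes trouble when the database range contains blanks.
    """
    for keyword in keywords:
        if skip_blanks and keyword == "":
            continue
        if keyword.lower() in text.lower():
            return True
    return False


database = ["7eleven", "", ""]  # two unused rows at the bottom of the range

print("anything".find(""))                                    # 0, i.e. position 1 in Excel terms
print(has_match(database, "cinema city", skip_blanks=False))  # True (false positive)
print(has_match(database, "cinema city"))                     # False
print(has_match(database, "7eleven12345"))                    # True
```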

You would need to change your formula to: =IF(ISNA(MATCH($B2&"*",database!B:B,0)),"",$C2)
I think this is what you're looking for.

Related

use search, find and replace text from one work sheet to other sheet in excel

I am not an expert in Excel but I have been using the format.
I am trying to make a roster of my staff in an Excel sheet using formulas.
The 1st sheet uses the TIME column as the key, and the other columns are filled with staff names.
Now in the 2nd sheet, I want a formula in each staff ROW so that each column is automatically filled with the TIME listed against that name.
Please help me if possible.
Thank you.
Use INDEX / MATCH combo. Something like this:
=IFERROR(INDEX(Sheet1!$B$3:$B$9,MATCH($A2,Sheet1!C$3:C$9,0)),"OFF DAY")
Syntax:
=INDEX(rng,index_number) — returns the value based on the index number
=MATCH(lookup_value,rng) — returns the relative position of the lookup value
=INDEX(rng,MATCH(lookup_value,rng)) — we know all the arguments except for one — index_number. Since the output of MATCH function is the same as the index_number, we're going to substitute the index_number with MATCH function.
You can almost think of the INDEX/MATCH combo as:
=INDEX/MATCH(rng,search_value)
=IFERROR(value_or_expression_to_evaluate,value_to_return_if_error) — and finally, we're going to wrap our formula with this one, which will return a value if there is no match, which means, in your example, employee is on-leave, on a rest-day, or whatever. Can be =IFNA, too.
Now, to return all values when a lookup_value matches more than once, as in your example where "Tam" has two shifts on Monday, you may append a number to the employee names, i.e. "Tam1" and "Tam2", to indicate which shift of the day it is. Or you may use a combination of the COUNTIF function (to count the number of shifts "Tam" has) and CONCATENATE to join the results together, so that the output is: 630-1430 / 1230-2200. There are many ways to do it depending on your requirement.
Hope this helps..
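As a rough illustration of what the INDEX / MATCH combo does, here is a Python sketch. The function names and the sample roster are invented for the example (only "Tam" and the two shift times come from the question), and the second helper shows one possible way to join several shifts, not necessarily how you would do it in Excel.

```python
def index_match(return_values, lookup_values, target, default="OFF DAY"):
    """Mimic IFERROR(INDEX(return_values, MATCH(target, lookup_values, 0)), default)."""
    try:
        position = lookup_values.index(target)  # MATCH: first exact match
    except ValueError:
        return default                          # IFERROR branch
    return return_values[position]              # INDEX


def all_matches(return_values, lookup_values, target, default="OFF DAY"):
    """Join every matching value, for staff with more than one shift."""
    hits = [r for r, name in zip(return_values, lookup_values) if name == target]
    return " / ".join(hits) if hits else default


# Hypothetical roster for one day: shift times and who works them
times = ["630-1430", "900-1700", "1230-2200"]
monday = ["Tam", "Lee", "Tam"]

print(index_match(times, monday, "Lee"))  # 900-1700
print(index_match(times, monday, "Kim"))  # OFF DAY
print(all_matches(times, monday, "Tam"))  # 630-1430 / 1230-2200
```

Note that index_match, like MATCH with an exact match, only ever returns the first hit, which is why the two-shift case needs separate handling.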

Match lookup value lengths for match with beginning of lookup value

In Excel 2013 I have two tables.
The first contains alpha numeric codes that vary in length.
Some examples from first table:
12345.12345
12346-12345
12AB1234
123.123
23456.123
A1234567.012
01234.12345
The second table contains alpha numeric codes I need to match with the beginning of the codes in the first table. Any numeric codes are currently stored as text.
Some examples from second table:
12345
12346
123
23456
A1234567
01234
How do I return a value from a different column in the second table containing any value? For some context, the return column from the second table contains a description for the codes.
I have not yet managed to find a solution using VLOOKUP or MATCH.
Also looked at using wildcards, but this only works one way, the wrong way.
The quickest solution, assuming you don't care about letters, is to use LEFT(FIND(...)) with a substitution. If letters need to be excluded, you will need to explain how the format should be presented.
Solution: =IFERROR(LEFT(A2,FIND(".",SUBSTITUTE(A2,"-","."))-1),A2)
This formula finds the first "." or "-" and returns all the characters before it. If neither is found, it returns the full ID.
Note, however, that if letters need to be removed as well, some serious SUBSTITUTE nesting or a VBA script will be required.
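The same idea can be sketched in Python to check it against the sample codes from the question. This is only an illustration: code_prefix is a made-up name, and the descriptions in the second table are placeholders, since the question does not show them.

```python
def code_prefix(code):
    """Mimic IFERROR(LEFT(A2, FIND(".", SUBSTITUTE(A2, "-", ".")) - 1), A2)."""
    normalised = code.replace("-", ".")  # SUBSTITUTE
    position = normalised.find(".")      # FIND (0-based here, -1 if absent)
    return code if position < 0 else code[:position]


# Codes from the first table in the question
codes = ["12345.12345", "12346-12345", "12AB1234", "123.123",
         "23456.123", "A1234567.012", "01234.12345"]

# Second table; the descriptions are placeholders invented for the example
descriptions = {"12345": "desc A", "12346": "desc B", "123": "desc C",
                "23456": "desc D", "A1234567": "desc E", "01234": "desc F"}

for code in codes:
    prefix = code_prefix(code)
    # An empty description means the prefix is not in the second table
    print(code, "->", prefix, "->", descriptions.get(prefix, ""))
```

Because the prefix stays a text string, a code such as 01234 keeps its leading zero and still matches a second table that stores its codes as text.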
A1 is the first cell in your column. In B1 write the following:
=LEFT(A1,MATCH(TRUE,ISERROR(VALUE(MID(A1,ROW(INDIRECT("1:"&LEN(A1))),1))),0)-1)
Press Ctrl+Shift+Enter at the same time (array formula).
It will return the first numeric part of your data.
You can then copy and paste the values into column C and compare them with the second table.
To get the result in Table1 directly in B1, use:
=IFERROR(INDEX(Sheet2!$A$1:$A$4,MATCH(VALUE(LEFT(A1,MATCH(TRUE,ISERROR(VALUE(MID(A1,ROW(INDIRECT("1:"&LEN(A1))),1))),0)-1)),Sheet2!$A$1:$A$4,0)),"")
Press Ctrl+Shift+Enter at the same time (array formula).
It will return the corresponding number from Table2 (Sheet2) if matched, or "" (empty) if there is no match.
Change A1:A4 to cover all your numbers in Table2, and keep the $ signs to fix the references when you drag the formula down.
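To see what "the first numeric part" means for the sample codes, here is a short Python sketch. It is an approximation rather than a line-by-line port: leading_digits is a made-up name, it only treats the characters 0 to 9 as numeric, and it returns the whole string when every character is a digit, a case in which the MATCH in the formula finds no TRUE and returns an error.

```python
from itertools import takewhile


def leading_digits(code):
    """Return the run of digits at the start of code (possibly empty)."""
    return "".join(takewhile(lambda ch: ch in "0123456789", code))


for code in ["12345.12345", "12346-12345", "12AB1234", "A1234567.012", "01234.12345"]:
    print(repr(code), "->", repr(leading_digits(code)))

# '12AB1234'     -> '12'     (stops at the first letter)
# 'A1234567.012' -> ''       (no digits at the start)
# '01234.12345'  -> '01234'  (kept as text, so the leading zero survives)
```

The sketch keeps the result as text. The formula instead wraps the prefix in VALUE, which turns 01234 into the number 1234, so check how the codes are stored in the second table before relying on either version.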

I want to use sumproduct with two different tables based on selection

I am working on a statistical model where we use sumproduct to generate forecast values by multiplying coefficients in one table with variables in another. Right now it is being done manually and that is taking time. I would like to automate it but I'm not able to figure this out.
We are using concatenate to identify different rows to use for vlookup. The variable columns are the same in number for both tables. I need to multiply each variable cell respectively in both tables and sum them, hence sumproduct.
This is what I am trying to do:
Forecast model 1 sales for product A in phones in USA = sumproduct([variables by year from table 1 for USA for phones], [Variables for USA phone product A model 1 from table 2] )
I hope someone can help me.
Proof of Concept
You will need to update the references to suit your spreadsheet table locations.
In cell E21 use the following and copy right and down as required:
=SUMPRODUCT(INDEX($G$3:$I$12,MATCH($B21&$A21&$C21,$A$3:$A$12,0),0),INDEX($F$15:$H$18,MATCH($A21&$C21&$D21&MID(E$20,16,1),$A$15:$A$18,0),0))
This process was simplified because you had a unique ID tag on each of the previous two tables that could be built from the information in the third table. If you ever get into double digit forecast models the MID() function part of the formula will need to be modified. The 16 in the mid function refers to the character location of the number in the forecast model sales header name in Table 3. As such you either need to keep that header format exactly the same or modify the position of the number in the MID() function.
UPDATE 1
Explanation of Formulas
The following formulas were used in this solution:
SUMPRODUCT
INDEX
MATCH
MID
Concatenate
I will start with the assumption that you already understand SUMPRODUCT(), as you were already using it before you ran into your problem. One thing to note about SUMPRODUCT is that it causes array-like calculation to occur on the portion within its brackets. In this case we fed it two ranges of equal size. The difficult part was more an issue of determining those ranges.
Using your ID columns as a lookup row we used the match() function to determine which row to use. For the first set of variables we used the following to determine which row to look in:
=MATCH($B21&$A21&$C21,$A$3:$A$12,0)
Match is made up of three arguments inside the brackets:
MATCH(what to look for, where to look, type of match)
What we need to look for in table is a concatenation of various cells in Table 3 to build the ID in Table 1. It could have been written using the full formula:
=CONCATENATE($B21,$A21,$C21)
but the short form using & was used instead:
=$B21&$A21&$C21
Once we had what to look for we needed the range of where to look and supplied the ID column from table 1:
$A$3:$A$12
This now leaves the third and final argument of what type of search to perform. An exact match seemed to be the most appropriate match to perform so the value of 0 was supplied. What match returns is the row within the supplied range. It is relative to the range supplied and not the actual row in the spreadsheet. If it cannot make a match it will return an error instead of a row number.
Now that we know what row we want, we can use this information with the INDEX() function. The INDEX() function is made up of 3 arguments as well with the third argument being optional depending on if a 1D or 2D range is being indexed:
INDEX(Range to work with, 2D Row or 1D Position reference, 2D Column reference)
In the case we are dealing with for the first table, the range to work with was your list of variables:
$G$3:$I$12
This is a 2D range. As such we need to tell INDEX() both what Row to look in as well as which Columns to look in. For the row to look in, we used the previously discussed MATCH() function. Since we want all columns and not just a specific column we use the value of 0. If Match returns an error, or if a number greater than the number of rows or columns selected is supplied, INDEX() will return an error. Based on the information discussed, the index function would look like:
=INDEX($G$3:$I$12,MATCH($B21&$A21&$C21,$A$3:$A$12,0),0)
You can try entering the above in a cell but it will give you an error. If you select three adjacent cells in the same row and use CONTROL+SHIFT+ENTER when entering the formula, Excel will add {} around the formula, making it an array formula, and it should show you the three variables being used.
The same process as described above can be used for determining the second range of variables from Table 2. The only difference here is that the forecast model number was not in a column of its own but instead in the header row, surrounded by text. As such, the MID() function was needed to go into the header row, bypass the surrounding text and pull the model number out so it could be used as part of the concatenation that forms the "what to look for" argument of MATCH():
=MID(E$20,16,1)
The MID() function again works with three arguments:
MID(Text to look in, which character to start at, how many characters to pull)
So in this case we are looking at the header in E20. Note the lock $ on the row number so the formula is always looking in row 20 no matter how far down it gets copied. It is then going to the 16th character. In this case the character "1" and pulling 1 character. If the header had just been 1 and 2, there would be no need for the MID function and the cell (with proper lock) could have been used.
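Putting the pieces together, the whole formula can be sketched in Python as follows. Everything concrete here is an assumption made for illustration: the ID layouts, the table contents, the header text "Forecast model 1 sales" and the function names are all invented, since the actual tables are not shown.

```python
def mid(text, start, length):
    """Excel MID: start is a 1-based character position."""
    return text[start - 1:start - 1 + length]


def sumproduct(xs, ys):
    """Multiply two equal-length rows element by element and sum the results."""
    if len(xs) != len(ys):
        raise ValueError("both rows must have the same number of columns")
    return sum(x * y for x, y in zip(xs, ys))


# Hypothetical Table 1 (variables) and Table 2 (coefficients), keyed by ID column
table1 = {"2016USAPhones": [100.0, 2.0, 0.5]}
table2 = {"USAPhonesA1": [0.5, 10.0, 4.0],
          "USAPhonesA2": [1.0, 0.0, 2.0]}


def forecast(year, country, category, product, header):
    # A dictionary lookup plays the role of INDEX(range, MATCH(id, id_column, 0), 0);
    # a missing ID raises KeyError where MATCH would return an error.
    variables = table1[year + country + category]
    coefficients = table2[country + category + product + mid(header, 16, 1)]
    return sumproduct(variables, coefficients)


print(forecast("2016", "USA", "Phones", "A", "Forecast model 1 sales"))  # 72.0
print(forecast("2016", "USA", "Phones", "A", "Forecast model 2 sales"))  # 101.0
```

As with the MID() call in the formula, the fixed position 16 only works while the model number is a single character at that spot in the header.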

Excel - How to count unique days in a list of duplicated days

Having a list of days such as:
01-giu-16
01-giu-16
01-giu-16
31-mag-16
31-mag-16
31-mag-16
31-mag-16
30-mag-16
I was looking for an excel formula that helps me count the number of unique days in the list (in this example 3)
Moreover I need the count only for the dates which have a specific ID in the next column (for example 1565)
Without any additional criteria, you can achieve the uniqueness count by using
=SUMPRODUCT(1/COUNTIF(A1:A8,A1:A8)), assuming your data are in the range A1:A8.
To evaluate subject to additional criteria (suppose they are in column B), use
{=SUM(--(FREQUENCY(IF(B1:B8=1565,MATCH(A1:A8,A1:A8,0)),ROW(A1:A8)-ROW(A1)+1)>0))}
This is an array formula: use Ctrl + Shift + Return once you're done editing (and don't type the curly braces yourself). Personally though I think this exceeds the reasonable threshold for complexity: I'd be inclined to adopt the first approach on a column that represents an intermediate transformation of your input data.
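For anyone who wants to check the two counts outside Excel, here is a small Python sketch of the same two ideas. The function names and the ID column are invented for the example (the question only mentions the ID 1565), and the dates are kept as the text shown in the question.

```python
from collections import Counter


def unique_count(values):
    """Mimic SUMPRODUCT(1/COUNTIF(range, range)): each value adds 1/its frequency."""
    counts = Counter(values)
    # round() guards against floating-point error in the sum of fractions
    return round(sum(1 / counts[v] for v in values))


def unique_count_where(values, ids, wanted_id):
    """Count distinct values only on rows whose ID equals wanted_id."""
    return len({v for v, row_id in zip(values, ids) if row_id == wanted_id})


days = ["01-giu-16", "01-giu-16", "01-giu-16", "31-mag-16",
        "31-mag-16", "31-mag-16", "31-mag-16", "30-mag-16"]
ids = [1565, 1565, 2000, 1565, 2000, 2000, 2000, 2000]  # hypothetical ID column

print(unique_count(days))                   # 3
print(unique_count_where(days, ids, 1565))  # 2
```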
Let's assume your data is in column A and it has a header row, so the first data value will actually be in A2. Place this formula in B2 and copy it down beside your list. It will generate a list of the unique values from column A. Once you have the list you simply need a function to count its size.
=IFERROR(INDEX($A$2:$A$9,MATCH(0,INDEX(COUNTIF($B$1:B1,$A$2:$A$9),0,0),0)),"")
In C2 you can use the following formula to get the number of unique values:
=COUNTA(B2:B9)-COUNTIF(B2:B9,"")
In D2 you can use the following formula to get the count of each unique value in your original list. Copy it down as far as you need to go.
=IF(B2="","",COUNTIF($A$2:$A$9,B2))

Two Column Lookup

I have a data set that I want to return an indexed column using two values: a year and a name. Both these values are formatted to general (I also tried text) in my spreadsheet.
In one worksheet I have a list of people.
On the other, I have a table of years, names, and a number.
I am trying to do a lookup on the joined year and name and return the given number in the second table. For instance 2013Andrew McCutchen would return 8.2, and 2014Andrew McCutchen would return 6.8.
Currently, I only get the #N/A value with the following:
=INDEX('2006 Results'!C2:C556,MATCH($J$1&C3,'2006 Results'!$A$2&$B$556,0))
But I know the value is in the table, because I have tested with an IF statement to make sure my spelling is correct. Any guidance would be much appreciated.
I would add a column to the left of the year column as column A to contain the following formula in cell A2 which refers to the year and name column:
=$B2&$C2
You can then use this column in a VLOOKUP formula on the people sheet. Cell J3 would read as follows. Copy this to all cells in the table body.
=VLOOKUP(J$1&$C3,'TheYearSheet'!$A$1:$D$556,4,false)
Job done.
In your match expression, you are comparing one concatenated value $J$1&C3 to another single concatenated value '2006 Results'!$A$2&$B$556. Match expects that second parameter to be a range rather than a single value.
In cases like this, where multiple criteria are required, I prefer to use SUMIFS rather than INDEX / MATCH, even though the intention is to return a single value. I think =SUMIFS('2006 Results'!$C:$C,'2006 Results'!$A:$A,J$1,'2006 Results'!$B:$B,$C3) will give what you need and should copy correctly to the other cells in that table.
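The two approaches (a joined helper key, and SUMIFS on both columns) can be compared with a short Python sketch. It is an illustration under assumptions: the function names and the "Other Player" row are invented, and only the two Andrew McCutchen values come from the question.

```python
# (year, name, number) rows; the last row is a made-up filler entry
results = [
    (2013, "Andrew McCutchen", 8.2),
    (2014, "Andrew McCutchen", 6.8),
    (2013, "Other Player", 1.0),
]

# Helper-column approach: build the joined key once, then look it up
by_key = {str(year) + name: number for year, name, number in results}


def lookup_joined(year, name):
    """Like VLOOKUP on a year&name helper column; None stands in for #N/A."""
    return by_key.get(str(year) + name)


def sumifs(year, name):
    """Like SUMIFS over both criteria: adds up every row that matches."""
    return sum(number for y, n, number in results if y == year and n == name)


print(lookup_joined(2013, "Andrew McCutchen"))  # 8.2
print(lookup_joined(2014, "Andrew McCutchen"))  # 6.8
print(sumifs(2014, "Andrew McCutchen"))         # 6.8
print(lookup_joined(2015, "Andrew McCutchen"))  # None
print(sumifs(2015, "Andrew McCutchen"))         # 0
```

The last two lines show the practical difference: a failed lookup is reported as missing, while SUMIFS returns 0 when nothing matches and would add the values together if the same year and name appeared twice.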
