How to regexmatch a range? - excel

I am trying to take data from a table and work out how many points each class gets. I used VLOOKUP to do this, but the problem is that I have to tell the sheet which class gets how much.
The data:

Your data seems to be setup in a way that unnecessarily complicates things.
The kelas column isn't showing the class, but name and class. For easy use in calculations this would be better divided into two columns: name | class
The poins column seems to hold numbers formatted as text (judging by the leading +). If it showed the number only and the class column showed the actual class, a simple SUMIF would solve your problem.
Now it's still doable using SUMPRODUCT:
=SUMPRODUCT(--(A17=RIGHT($B$2:$B$11,2)),--($D$2:$D$11))
The first part checks if the search value A17 equals the last 2 characters of each cell in range B2:B11 (the $'s in the formula are to lock the range when dragging the formula down or aside).
This results in an array of TRUE's and FALSE's which is converted to 1's and 0's by the leading --.
The second part simply converts the text values to numbers using the same logic as with the TRUE's and FALSE's, using the --.
SUMPRODUCT multiplies the first array with the second array and adds it all up.
If a condition is true it multiplies the value of the points column by 1 (equals the points), if false it multiplies by 0 (equals 0).
In the end it sums all values meeting given condition.
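The same logic can be sketched outside Excel. This is a minimal Python sketch, assuming hypothetical rows where the kelas text ends in a two-character class code and the points are stored as text such as "+5"; the names and values are made up for illustration.

```python
# Hypothetical sample rows: (kelas text, points stored as text)
rows = [("Andi 7A", "+5"), ("Budi 7B", "+3"), ("Citra 7A", "+2")]

def class_total(rows, class_code):
    # 1 where the last two characters equal the class code, else 0
    flags = [1 if kelas[-2:] == class_code else 0 for kelas, _ in rows]
    # Convert the text points to numbers, as the second -- does
    points = [float(p) for _, p in rows]
    # Multiply pairwise and add everything up, as SUMPRODUCT does
    return sum(f * p for f, p in zip(flags, points))

print(class_total(rows, "7A"))
```

Rows whose class does not match contribute 0 to the total, which is exactly what the multiplication by 0 achieves in the formula.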

Related

SUMIF (or something) over a wide range of columns

I've got a massive parts spreadsheet that I'm trying to simplify. Various parts could be included in number of locations, which I would like to add up to a single list. The attached file is just an example using reindeer.
This is doable with using a bunch of SUMIF statements added together, but not practical due to the range of columns I need to include. There's gotta be a better way!?
=SUMPRODUCT(--($D$4:$J$11=$A4),$E$4:$K$11)
SUMPRODUCT can do that. Make sure the second range shifts one column, but has equal count of columns (and rows).
($D$4:$J$11=$A4) results in an array of TRUE's or FALSE's for the value in range $D$4:$J$11 being equal to the value in $A4 (having no $ before its row number means the referenced row increases when the formula is dragged down).
Adding -- in front of the array converts the TRUE's and FALSE's to 1's and 0's respectively.
Multiplying that with the range to the right of it will result in 1* the value in $E$4:$K$11 for all TRUE's, which results in its value, or 0* the value in $E$4:$K$11 for all FALSE's, which results in 0.
Summing the array of values results in the sum of all values where the condition is met in the column left from it.
SUMPRODUCT combines the multiplication of the array and summing the array results to 1 total sum.
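The shifted-range idea can be sketched in Python. This is only an illustration, assuming a hypothetical grid in which each name cell has its quantity in the cell immediately to its right; the names and numbers are invented.

```python
# Hypothetical grid: columns alternate name, quantity, name, quantity
grid = [
    ["Dasher", 2, "Comet", 1],
    ["Comet", 6, "Dasher", 3],
]

def total_for(grid, name):
    total = 0
    for row in grid:
        # Pair each cell with its right-hand neighbour,
        # mirroring D:J compared against E:K shifted one column
        for left, right in zip(row[:-1], row[1:]):
            if left == name:
                total += right
    return total

print(total_for(grid, "Dasher"))
```

A quantity cell never equals the search name, so pairing every cell with its neighbour is harmless, just as the full-range comparison is in the formula.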
You can use simply the SUM:
=SUM(($D$4:$D$11=A4)*$E$4:$E$11,($F$4:$F$11=A4)*$G$4:$G$11, etc.)
where etc. stands for any further ranges you want to add. If you are not using Excel 2021 or Microsoft 365, you must confirm the formula with CTRL+SHIFT+ENTER.

4 variables index function, with greater than and less than for 2 variables

I am trying to use index match functions to determine the appropriate rate for the below table.
So for example a consumer loan that is for a person that owns property, the car is 2 years or less in age and the total loan to value ratio is less than 140% should return a value of 5.15%
I believe this is what you wanted...
I would use a series of nested if functions to evaluate which column of LTV I would want the value to come from.
"That is what is done in the AND( ) part. If the value is greater than the 110% and smaller than 140% let's do the Index Match on the 110% Column, Otherwise do it on the 140% Column."
You could extend this for more columns with more IFs in the false condition.
Then it is a simple INDEX match with concatenation. It searches for the three parameters all concatenated in a single range of concatenations.
Hope it helped.
Proof of Concept
In order to achieve the above I had to make a minor edit to your header to be able to distinguish between the two 140% columns.
The functions used in this answer are:
AGGREGATE function
MATCH function
INDEX function
ROW function
IFERROR function
I placed the main part of the formula inside the IFERROR function as a way of dealing with things that may be out of range or when not all the input have been provided. I then assumed that what you were basing your search on would be provided in a series of cells. In my example I assumed the questions would be asked in the range H3 to K3 and I place the results in L3.
The main concept is centered around the INDEX function. I specified the index range as being the height of your table and the width of the percentage rates. Or for this example D2:F9.
=IFERROR(INDEX($D$2:$F$9,row number, column number),"Not Found")
That is the easy part. The more challenging part is determining the row and column number to look in. Let's start with the column number, as it is the slightly easier of the two. I assumed the ratio to look for, or rather the header of the column to look in, would be supplied. I basically used this equation to determine the column number:
=MATCH(K3,$D$1:$F$1,0)
which in layman's terms is which column between D and F, counting column D as 1, has the value equal to the contents of K3. So now that there is a formula to determine the column, we can drop that into our original formula and wind up with:
=IFERROR(INDEX($D$2:$F$9,row number,MATCH(K3,$D$1:$F$1,0)),"Not Found")
Now we just need to determine the row number. This is the most complex operation. We are going to make a series of logical checks and take the first row that passes all of them. The premise here is that a logical check is either TRUE or FALSE. In Excel, 0 is FALSE and every other number is TRUE. So if we multiply a series of logical checks together, only a row where every check is TRUE will give 1. The first logical check is the loan type. It will be followed by the living status and then the vehicle age.
=(H3=$A$2:$A$9)*(I3=$B$2:$B$9)*(J3=$C$2:$C$9)
Now if you put that into an array formula you will get a series of TRUE/FALSE or 1/0 values. We are going to use it inside an AGGREGATE function, which performs array-like calculations for some of its functions. We are going to use function 15, which is one of them. We are also going to tell AGGREGATE to ignore all errors, which is what the 6 does. So what we wind up doing is dividing each row number by the logical check. If the logical check is FALSE or 0, it will generate a #DIV/0! error, which AGGREGATE will ignore. In the end we wind up with a list of the rows which match our logical check. We then tell AGGREGATE that we want the first (smallest) result with the ,1. So we wind up with a formula that looks like:
=AGGREGATE(15,6,ROW($A$2:$A$9)/((H3=$A$2:$A$9)*(I3=$B$2:$B$9)*(J3=$C$2:$C$9)),1)
While this does provide us with the row number we want, we need to adjust it to make it an index number. In order to do this you need to subtract the number of header rows. In this case 1. So the index row number is given by this formula:
=AGGREGATE(15,6,ROW($A$2:$A$9)/((H3=$A$2:$A$9)*(I3=$B$2:$B$9)*(J3=$C$2:$C$9)),1)-1
And when we substitute that back into the earlier equation for the row number, we wind up with the final equation of:
=IFERROR(INDEX($D$2:$F$9,AGGREGATE(15,6,ROW($A$2:$A$9)/((H3=$A$2:$A$9)*(I3=$B$2:$B$9)*(J3=$C$2:$C$9)),1)-1,MATCH(K3,$D$1:$F$1,0)),"Not Found")
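The way the final formula finds its row and column can be sketched in Python. This is a rough illustration, not the workbook itself: the table contents, the header labels and the function name find_rate are all hypothetical, apart from the 5.15 value quoted in the question.

```python
# Hypothetical rate table: loan type, living status, vehicle age, rates per header
headers = ["110%", "140%", "140%+"]
table = [
    ("Consumer", "Owner", "<=2", [4.90, 5.15, 5.65]),
    ("Consumer", "Renter", "<=2", [5.40, 5.65, 6.15]),
]

def find_rate(loan, living, age, ltv_header):
    # Keep the row numbers where the product of the three checks is 1
    matches = [i for i, (a, b, c, _) in enumerate(table)
               if (a == loan) * (b == living) * (c == age)]
    if not matches or ltv_header not in headers:
        return "Not Found"  # plays the role of IFERROR
    row = min(matches)  # like AGGREGATE(15,6,...,1): smallest matching row
    col = headers.index(ltv_header)  # like MATCH(...,0): exact header match
    return table[row][3][col]

print(find_rate("Consumer", "Owner", "<=2", "140%"))
```

Rows that fail any check simply never enter the list of candidates, which is the job the ignored #DIV/0! errors do in the formula.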

Sort Order formula to alphabetise in Excel

I am currently drawing up a spreadsheet that will automatically remove duplicates and alphabetize a list:
I am using the COUNTIF() function in column G to create a sort order and then VLOOKUP() to find the sort in column J.
The problem I am having is that I can't seem to get my SortOrder column to function properly. At the moment it creates an index for two number 1's meaning the cell highlighted in yellow is missed out and the last entry in the sorted list is null:
If anyone can find and rectify this mistake for me I'll be very grateful as it has been driving me insane all day! Many thanks.
I'll provide my usual method for doing an automatic pulling-in of raw data into a sorted, duplicate-removed list:
Assume raw data is in column A. In column B, use this formula to increase the counter each time the row shows a non-duplicate item in column A. Hard-code B2 to be "1", and use this formula in B3 and drag down.
=if(iserror(match(A3,$A$2:A2,0)),B2+1,B2)
This takes advantage of the fact that when we refer to this row counter in our revised list, we will use the match function, which only checks for the first matching number. Then say you want your new list of data in column D (usually I do this for display purposes, so either 'group-out' [hide] the columns that hold the helper formulas, or do this on another tab). You can avoid this step, but if you are already using helper columns I usually do each step in a different column, as it is easier to document. In column C, starting in C3 [C2 hardcoded to 1] and dragged down, just have a simple counter, which stops at the end of your list:
=if(C2<max(B:B),C2+1," ")
Then in column D, starting at D2 and dragged down:
=iferror(index(A:A,match(C2,B:B,0)),"")
The index function is like half of the vlookup function - it pulls the result out of a given array, when you provide it with a row number. The match function is like the other half of the vlookup function - it provides you with the row number where an item appears in a given array.
Hope this helps you in the future as well.
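The helper-column method can be sketched in Python to show why taking the first match works. This is a minimal sketch covering only the duplicate-removal step; the function name unique_in_order is hypothetical.

```python
def unique_in_order(raw):
    counters = []  # column B: running count of distinct items seen so far
    seen = []
    for item in raw:
        if item not in seen:  # the MATCH lookup fails, so the item is new
            seen.append(item)
        counters.append(len(seen))
    # Columns C and D: for k = 1..max(B), take the first row whose counter is k
    return [raw[counters.index(k)] for k in range(1, len(seen) + 1)]

print(unique_in_order(["b", "a", "b", "c", "a"]))
```

Because the counter only increases on a new item, the first row carrying a given counter value is always the row where that item first appeared.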
The actual reason that this is going wrong as implied by Jeeped's comment is that you can't meaningfully compare a string to a number unless you do a conversion because they are stored differently. So COUNTIF counts numbers and text separately.
20212 will give a count of 1 because it is the only (or lowest) number.
CS10Z002 will give a count of 1 because it is the first text string in alphabetical order.
Another approach is to add the count of numbers to the count if the current cell contains text:-
=COUNTIF(INDIRECT("$D$2:$D$"&$F$3),"<="&D2)+ISTEXT(D2)*COUNT(INDIRECT("$D$2:$D$"&$F$3))
It's easier to show the result of three different conversions with some test data:-
(0) No conversion - just use COUNTIF
=COUNTIF(D$2:D$7,"<="&D2)
"999"<"abc"<"def", 999<1000
(1) Count everything as text
=SUMPRODUCT(--(D$2:D$7&""<=D2&""))
"1000"<"999"
(2) Count numbers before text
=COUNTIF(D$2:D$7,"<="&D2)+ISTEXT(D2)*COUNT(D$2:D$7)
999<1000<"999"
(3) Count everything as text but convert numbers with leading zeroes
=SUMPRODUCT(--(TEXT(D$2:D$7,"000000")<=TEXT(D2,"000000")))
"000999" = "000999", "000999"<"001000"

Tolerant average (ignore #NA, etc.)

I want to calculate the average over a range (B1:B12 or C1:C12 in the figure), excluding:
Cells not being numeric, including Empty strings, Blank cells with no contents, #NA, text, etc. (B1+B8:B12 or C1+C8:C12 here).
Cells for which corresponding cells in a range (A1:A12 here) have values outside an interval ([7,35] here). This would further exclude B2:B3 or C2:C3.
At this point, cells in column A may contain numbers or have no contents.
I think it is not possible to use any built-in AVERAGE-like function. Then, I tried calculating the sum, the count, and divide. I can calculate the count (F2 and F7), but not the sum (F3), when I have #N/A in the range, e.g.
How can I do this?
Notes:
Column G shows the formulas in column F.
I cannot filter and use SUBTOTAL.
B8:C8 contain Blank cells with no contents, B9:C9 contain Empty strings.
I am looking for (non-user defined) formulas, i.e., non-VBA.
From https://stackoverflow.com/a/30242599/2103990:
Providing you are using Excel 2010 and above, the AGGREGATE function can be optioned to ignore all errors.
=AGGREGATE(1, 6, A1:A5)
        
You can accomplish this by using array formulas based upon nested IFs to provide at least part of the criteria. When an IF resolves to FALSE it no longer processes the TRUE portion of the statement.
   
The array formulas in F2:F3 are,
=SUM(IF(NOT(ISNA(B2:B13)), (A2:A13>=7)*(A2:A13<=35)*(B2:B13<>"")))
=SUM(IF(NOT(ISNA(B2:B13)), IF(B2:B13<>"", (A2:A13>=7)*(A2:A13<=35)*B2:B13)))
The array formulas in F7:F8 are,
=SUM(IF(NOT(ISNA(C2:C13)), (A2:A13>=7)*(A2:A13<=35)*(C2:C13<>"")))
=SUM(IF(NOT(ISNA(C2:C13)), IF(C2:C13<>"", (A2:A13>=7)*(A2:A13<=35)*C2:C13)))
Array formulas need to be finalized with Ctrl+Shift+Enter↵. Once entered correctly, they can be filled down like any other formula if necessary.
Array formulas increase calculation load quickly as the range(s) they refer to expand. Try to keep excess blank rows to a minimum and avoid full column references.
You can get the average of your "NA" column values in one fairly simple formula like this:
=AVERAGE(IF(
(
($A$2:$A$13>=$F$2)*
($A$2:$A$13<=$F$3)*
ISNUMBER(B2:B13)
)>0,
B2:B13))
entered as an array formula using Ctrl+Shift+Enter↵.
I find this to be a very clear way of writing it, because all your conditions are lined up next to each other. They're "and'ed" using the mathematical operator *; this of course converts TRUE and FALSE values to 1's and 0's, respectively, so when the and'ing is done, I convert them back to TRUE/FALSE using >0. Note that instead of hard-coding your thresholds 7 and 35 (hard-coding literals is usually considered bad practice), I put them in cells.
Same logic for your sum and your count; just replace AVERAGE with SUM and COUNT, respectively:
=SUM(IF((($A$2:$A$13>=$F$2)*($A$2:$A$13<=$F$3)*ISNUMBER(B2:B13))>0,B2:B13))
=COUNT(IF((($A$2:$A$13>=$F$2)*($A$2:$A$13<=$F$3)*ISNUMBER(B2:B13))>0,B2:B13))
though a more succinct formula can also be used for the count:
=SUM(($A$2:$A$13>=$F$2)*($A$2:$A$13<=$F$3)*ISNUMBER(B2:B13))
The same formulas can be used to average/sum/count your "blank" column. Here I just drag-copied them one column to the right (column G), which means that all instances of B2:B13 became C2:C13.
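The combined conditions can be sketched in Python as a check on the logic. This is an illustration only, assuming cells are represented as Python values, with None for a blank cell and strings for text or error markers; the function name tolerant_average is hypothetical.

```python
def tolerant_average(keys, values, low, high):
    def is_number(v):
        # bool is excluded because Python treats True/False as integers
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    # Keep a value only if its key is a number inside [low, high]
    # and the value itself is a number (the ISNUMBER condition)
    kept = [v for k, v in zip(keys, values)
            if is_number(k) and low <= k <= high and is_number(v)]
    return sum(kept) / len(kept) if kept else None

keys = [5, 10, 20, 30, 40, None]
values = [1, "#N/A", 4, 6, 9, 3]
print(tolerant_average(keys, values, 7, 35))
```

Returning None when nothing qualifies avoids a division by zero, which is the case where AVERAGE would return #DIV/0! in the sheet.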

Excel - Return the first negative number of a column

I'm using Excel 2010 and I'm looking for a way to return the first negative number of a column. For instance, I have the following numbers distributed in a column:
1
4
6
-3
4
-1
-10
8
Which function could I use to return -3?
Thanks!
This could be interpreted two ways... If all the numbers are in a single cell (one column) as a string, the MID function can be used. If the numbers are in A1, a formula that could work is this:
=VALUE(MID(A1,SEARCH("-",A1),SEARCH(" ",A1,SEARCH("-",A1))-SEARCH("-",A1)))
If the numbers are each in their own columns (in my example, A3:H3), a different technique must be used:
{=INDEX(A3:H3,1,MATCH(TRUE,A3:H3<0,0))}
Don't type the { } - enter the equation using CTRL+SHIFT+ENTER.
In each case, the formula will return the number -3, which is the first negative number in the series.
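For comparison, the "first value that passes a test" idea looks like this in Python. It is a minimal sketch using the numbers from the question; the function name first_negative is hypothetical.

```python
def first_negative(numbers):
    # next() stops at the first value that passes the test,
    # much like MATCH(TRUE, range<0, 0) stops at the first TRUE
    return next((n for n in numbers if n < 0), None)

print(first_negative([1, 4, 6, -3, 4, -1, -10, 8]))  # -3
```

If no negative number exists the sketch returns None, whereas the Excel formulas would return an error.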
Another possibility, avoiding array formulas (which are a big source of performance issues):
=LOOKUP(1;1/(M2:M15<0);M2:M15)
(I assume your numbers are in the M2:M15 range).
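This is intended to return the first number matching the "<0" condition. Because LOOKUP assumes sorted data and uses an approximate search, check the result against your data: this pattern is commonly described as returning the last match rather than the first. You may use any other condition, including text comparisons.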
You may also extract the value of another array corresponding to the matching cell:
=LOOKUP(1;1/(M2:M15<>"OK");T2:T15)
In this example, the first cell containing a string other than "OK" will be searched for in the M2:M15 array and the corresponding value in array T2:T15 will be returned.
Please note that the usage of the LOOKUP function should be avoided whenever possible (but in this case, it's very handy!)
(I got the original inspiration for this answer from this post)

Resources