Alternative to Max If with formulaArray - excel

I have a worksheet with the following info:
| A | B | C |
| 10 | cat | |
| 15 | cat | |
| 5 | dog | |
| 4 | dog | |
| 11 | dog | |
| 6 | fish | |
| 10 | fish | |
I want to find the maximum value in column A for each group in column B, that is, the max value for cat, for dog and for fish.
I was thinking about using FormulaArray with the MAX and IF functions:
mysheet.range("C1:C7").FormulaArray="=Max(If(R1C2:R7C2=RC[-1],R1C1:R7C1))"
I tested it, but it doesn't work: the formula only compares the first element (B1) with the whole range (B1:B7).
Is there a better approach?

You could use a combination of MAX and INDEX like this:

Array formulas can return single results or multiple results. You want yours to return a single result for each cell in your range, but because you apply the array formula to the entire range at once, it is interpreted as one formula returning multiple results over the whole range instead of a separate formula for each cell.
You can test this by changing something in your array range. Say, change the ",0)" to ",1)" in C3. Excel will tell you that you can't change part of an array. The cells are all connected.
What you need to do is loop through the cells in your range and apply the formula to each of them individually.
Dim r As Range
Dim rFormulas As Range
Set rFormulas = ActiveSheet.Range("C2:C8")
' Enter a separate single-cell array formula in each cell of the range
For Each r In rFormulas.Cells
    r.FormulaArray = "=MAX(IF(R2C2:R8C2=RC2,R2C1:R8C1,0))"
Next r
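To check the expected results outside Excel, here is a minimal Python sketch of what MAX(IF(...)) computes for each row. The data is the sample from the question; the function name max_by_group is hypothetical and only used for illustration.

```python
# Sample data from the question: (value in column A, group in column B).
rows = [(10, "cat"), (15, "cat"), (5, "dog"), (4, "dog"),
        (11, "dog"), (6, "fish"), (10, "fish")]


def max_by_group(rows):
    """Return the largest value seen for each group label."""
    best = {}
    for value, group in rows:
        if group not in best or value > best[group]:
            best[group] = value
    return best


group_max = max_by_group(rows)
# Column C: every row receives the maximum of its own group.
column_c = [group_max[group] for _, group in rows]
print(column_c)
```

Each entry of column_c corresponds to one cell that the VBA loop fills with its own array formula.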

Related

Is For loop the best way to check if Cell IsNumeric?

I have to write VBA code that checks whether the cells in column "A" are numeric. I work with sheets that have 2k ~ 3k rows per table.
I want to know whether a For loop through the cells in the range is the most efficient way to do this.
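Dim tmpCell As Range
' .Cells makes the loop visit each cell; without it the loop runs once over the whole column
For Each tmpCell In Range("A1:B5").Columns(1).Cells
    If IsNumeric(tmpCell.Value) Then
        tmpCell.Value = tmpCell.Value * 1
    End If
Next tmpCell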
I need to verify that the cell is a number because, when I paste the data from a Pivot Table, one column is copied as strings, not numbers.
Thank you for your time.
Edit:
My table looks something like this.
The numbers in column "A" are stored as strings, not numbers.
| A1 | B1 | C1 |
1| 1234 | 1 | 2 |
2| 5678 | 2 | 4 |
3| 9012 | 3 | 5 |
4|Total | 5 | 11 |

Match column value with substring from reference table in Excel

I have a reference table on sheet1
| A | B |
|---------------|----------|
| dog | 10 |
|---------------|----------|
| cat | 20 |
|---------------|----------|
I then have a list with values on sheet 2
| D | E |
|-------------------|----------|
| wild dog 2 | |
|-------------------|----------|
| strange cat Willy | |
|-------------------|----------|
I would like E to contain the value of B from the reference table, using the first substring match.
I tried with VLOOKUP and INDEX(MATCH ...) but this is not getting me anywhere. Help or pointers appreciated.
With your current sample data, the following formula will work, but I don't know what your actual data looks like.
=INDEX($B$1:$B$10,MATCH(TRIM(MID(SUBSTITUTE(D1," ", REPT(" ",100)),100,100)),$A$1:$A$10,0))
I ended up using the formula from Harun24HR and simplifying it:
=INDEX($B$1:$B$10;MATCH(1;COUNTIF(D1;"*" & $A$1:$A$10 & "*");0))
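As a rough cross-check of the "first substring match" logic, here is a small Python sketch using the sample data from the question. The function name lookup_by_substring is hypothetical, and the lower-casing is there because Excel wildcard matching ignores case.

```python
# Reference table from sheet1 (key, value) and the texts from sheet 2.
reference = [("dog", 10), ("cat", 20)]
texts = ["wild dog 2", "strange cat Willy"]


def lookup_by_substring(text, reference):
    """Return the value of the first reference key contained in text, else None."""
    lowered = text.lower()  # mimic case-insensitive wildcard matching
    for key, value in reference:
        if key.lower() in lowered:
            return value
    return None


column_e = [lookup_by_substring(t, reference) for t in texts]
print(column_e)
```

A text that contains none of the keys yields None here, where the Excel formula would show #N/A.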

Excel - Return most occurred value based on multiple condition

I have three main columns: Name, Size, and Diameter. I want to filter by name and return the most frequent value in Diameter for a particular value in Size. For example, I have a table like the one below:
| Name | Size | Diameter |
------------------------------
| A | 30 | 2232.23 |
| A | 30 | 2232.23 |
| A | 30 | 5382.98 |
| A | 29 | 1123.44 |
| A | 29 | 9323.42 |
| A | 29 | 1123.44 |
| B | 31 | 1232.11 |
| B | 31 | 1232.11 |
| B | 10 | 1111.00 |
------------------------------
The value I should receive from Diameter for A with a Size of 30 is 2232.23, while for B with a Size of 31 I should receive 1232.11.
This is just a sample. The actual data has more than 9000 rows.
Thanks.
Assuming your data is in columns A, B, and C, you can put this array formula in cell D1:
=INDEX(C$1:C$10,MODE(IF(A$1:A$10=A1,MATCH(B$1:B$10,B$1:B$10,{0,0}))))
Don't forget to press Ctrl+Shift+Enter.
Try pasting this in cell D2 and dragging it down to the last row:
=COUNTIFS(A:A,A2,B:B,B2,C:C,C2)
It returns the number of occurrences of each row.
Use this formula. It first creates an array of the values that pass the two conditions, then the IF removes the 0 values from that array. Lastly, MODE evaluates the remaining values and returns the one with the most occurrences; if MODE returns an error, IFERROR falls back to the MAX of the matching values.
=SUMPRODUCT(IFERROR(MODE(IF(--($A$3:$A$11000=G2)*($B$3:$B$11000=H2)*$C$3:$C$11000<>0,--($A$3:$A$11000=G2)*($B$3:$B$11000=H2)*$C$3:$C$11000,"")),MAX(--($A$3:$A$11000=G2)*($B$3:$B$11000=H2)*$C$3:$C$11000)))
Enter it using Ctrl+Shift+Enter, since it is an array formula.
If you want to show the most frequent value in column D, use this formula in cell D3 and drag it to the bottom.
=SUMPRODUCT(IFERROR(MODE(IF(--($A$3:$A$11000=A3)*($B$3:$B$11000=B3)*$C$3:$C$11000<>0,--($A$3:$A$11000=A3)*($B$3:$B$11000=B3)*$C$3:$C$11000,"")),MAX(--($A$3:$A$11000=A3)*($B$3:$B$11000=B3)*$C$3:$C$11000)))
Here is an array formula (press Ctrl + Shift + Enter together) you can try:
=INDEX($C$2:$C$20,MATCH(MODE(IF(($A$2:$A$20=E2)*($B$2:$B$20=F2)*($C$2:$C$20),($A$2:$A$20=E2)*($B$2:$B$20=F2)*($C$2:$C$20),"")),$C$2:$C$20,0),1)
Basically, it uses the MODE function to find the most frequent occurrence and then INDEX/MATCH to return the value.
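To verify what these formulas should return on the sample table, here is a minimal Python sketch of "most frequent Diameter for a given Name and Size". The function name most_common_diameter is hypothetical. One difference worth noting: Excel's MODE gives #N/A when no value repeats, whereas this sketch simply returns the first value it saw.

```python
from collections import Counter

# Sample rows from the question: (Name, Size, Diameter).
rows = [
    ("A", 30, 2232.23), ("A", 30, 2232.23), ("A", 30, 5382.98),
    ("A", 29, 1123.44), ("A", 29, 9323.42), ("A", 29, 1123.44),
    ("B", 31, 1232.11), ("B", 31, 1232.11), ("B", 10, 1111.00),
]


def most_common_diameter(rows, name, size):
    """Return the most frequent Diameter for the given Name and Size, else None."""
    matching = [d for n, s, d in rows if n == name and s == size]
    if not matching:
        return None
    # most_common(1) returns a list holding the single (value, count) pair
    return Counter(matching).most_common(1)[0][0]


print(most_common_diameter(rows, "A", 30))
print(most_common_diameter(rows, "B", 31))
```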

Using a number in a cell to generate a cell reference

What I want to do might be better achieved with a database, however I have very limited experience with them, and only have to change the information infrequently.
What I have is a sheet where each row has 8 cells of relevant data.
What I want to do in another sheet is to enter a number into 1 cell, and have a group of cells below use that number to reference data in the other sheet.
For example, in Sheet1 I could have the following (fig 1):
| A | B | C | D | E | F | G | H
-----+-----+-----+-----+-----+-----+-----+-----+-----
101 | Dep | 700 | Sta | 100 | Sta | 300 | Dep | 900
What I want to achieve in sheet 2, by typing the row number into 1 cell, is to have the data in those 8 cells copied below, for example (fig 2):
| A | B | C | D |
-----+-----+-----+-----+-----+
1 | "Row Number" |
-----+-----+-----+-----+-----+
2 | =A# | =B# | =D# | =C# |
-----+-----+-----+-----+-----+
3 | =E# | =F# | =H# | =G# |
-----+-----+-----+-----+-----+
And yes, I am aware those formulae above do not reference the other sheet - this was to save space.
Which, if using the example row above, should look like this (fig 3):
| A | B | C | D |
-----+-----+-----+-----+-----+
1 | 101 |
-----+-----+-----+-----+-----+
2 | Dep | 700 | 100 | Sta |
-----+-----+-----+-----+-----+
3 | Sta | 300 | 900 | Dep |
-----+-----+-----+-----+-----+
So, in the example above (fig 3), what formula do I need to put in cells A2-D2 and A3-D3 so that the number in A1 is automatically used as part of the cell reference to pull the data from the other sheet?
Does that make sense? I hope so because I have over 300 lines to enter into my 1st sheet and another 70 lines x 7 blocks of columns on the second sheet.
Lastly I just want to say I want to avoid programming languages, like VBA, wherever possible.
Check out the INDIRECT() function.
For cell A2 in your example on the second sheet, enter:
=INDIRECT("Sheet1!"&"A"&$A$1)
Extend this formula to the other target cells by changing the "A" portion to reference columns B, C, D, etc. from Sheet1 as needed in your grid, for example:
=INDIRECT("Sheet1!"&"B"&$A$1)
=INDIRECT("Sheet1!"&"C"&$A$1)
=INDIRECT("Sheet1!"&"D"&$A$1)
These formulas will reference your selected "Row Number" in cell A1 and pull the related data from Sheet1 into Sheet2.
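The idea behind INDIRECT here is simply "column letter plus the row number typed in A1". A minimal Python sketch of that lookup, assuming Sheet1 is modelled as a dictionary (the names sheet1, layout and fetch_block are hypothetical):

```python
# Sheet1 modelled as {row number: {column letter: value}}; row 101 is the
# example row from the question.
sheet1 = {
    101: {"A": "Dep", "B": 700, "C": "Sta", "D": 100,
          "E": "Sta", "F": 300, "G": "Dep", "H": 900},
}

# Column order requested in the question for rows 2 and 3 of Sheet2.
layout = [["A", "B", "D", "C"],
          ["E", "F", "H", "G"]]


def fetch_block(sheet, row_number, layout):
    """Look up each cell by column letter plus the chosen row number."""
    return [[sheet[row_number][col] for col in line] for line in layout]


block = fetch_block(sheet1, 101, layout)
print(block)
```

The result matches fig 3 in the question: changing only the row number changes every cell in the block.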
You can do this using the INDIRECT function:
Returns the reference specified by a text string. References are immediately evaluated to display their contents. Use INDIRECT when you want to change the reference to a cell within a formula without changing the formula itself.
http://office.microsoft.com/en-gb/excel-help/indirect-HP005209139.aspx

Excel Formula Optimisation

I am no Excel expert and, after some research, have come up with this formula to compare two sets of the same data from different times. It displays the new entries that are in the latest list of data but not in the old list.
This is my formula:
{=IF(ROWS(L$4:L8)<=(SUMPRODUCT(--ISNA(MATCH($E$1:$E$2500,List1!$E$1:$E$2500,0)))),
INDEX(E$1:E$2500,
SMALL(IF(ISNA(MATCH($E$1:$E$2500&$F$1:$F$2500,List1!$E$1:$E$2500&List1!$F$1:$F$2500,0)),
ROW($F$1:$F$2500)-ROW($F$1)+1),ROWS(L$4:L8))),"")}
Are there any optimisation techniques I could employ to speed up the calculation?
As requested
Some example data (link to a spreadsheet):
https://docs.google.com/file/d/0B186C84TADzrMlpmelJoRHN2TVU/edit?usp=sharing
On this scaled-down version it is more efficient, but on my actual sheet with a lot more data it is slow.
Well, I was playing around a bit and I think that this works the same, and without the first IF statement:
=IFERROR(INDEX(A$1:A$2500,SMALL(IF(ISNA(MATCH($A$1:$A$2500&$B$1:$B$2500,List1!$A$1:$A$2500&List1!$B$1:$B$2500,0)),ROW($B$1:$B$2500)-ROW($B$1)+1),ROWS(F$2:F2))),"")
That part in your sample data:
ROWS(F$2:F2)<=(SUMPRODUCT(--ISNA(MATCH($A$1:$A$2500,List1!$A$1:$A$2500,0))))
As I understood it, it only sees to it that the row number in which the formula is entered is lower than the number of 'new' items, but it doesn't serve any purpose because when you drag the formula more than required, you still get errors instead of the expected blank. So I thought it could be removed altogether (after trying to substitute it with COUNTA() instead) and use an IFERROR() on the part directly fetching the details.
EDIT: Scratched that out. See barry houdini's comment for the importance of those parts.
Next, you had this:
ROW($B$1:$B$2500)-ROW($B$1)+1
-ROW($B$1)+1 always returns 0, so I didn't find any use for it and removed it altogether.
It's still quite long and takes some time I guess, but I believe it should be faster than previously by a notch :)
A relatively fast solution is to add a multi-cell array formula in a column alongside List 2:
{=MATCH($A$1:$A$16,List1!$A$1:$A$11,0)}
and filter the resultant output for #N/A.
(Or see Compare.Lists vs VLOOKUP for my commercial solution)
Array formulas are slow. When you have thousands of array formulas, calculation becomes very slow, so the key is to avoid array formulas altogether.
The following is my way to achieve it, using only simple formulas. It should be fast enough if you only have 2500 rows.
Columns F and H are "keys", created by concatenating your 2 columns (E and F in your original formula).
This assumes the first line of data is on row 3.
Data:
| A | B | | D | E | F | | H |
| index | final value | | ID | exist in Old? | Key (New) | | Key (Old) |
--------------------------------------------------------------------------------
| 1 | XXX-33 | | 0 | 3 | OOD-06 | | OOC-01 |
| 2 | ZZZ-66 | | 0 | 1 | OOC-01 | | OOC-02 |
| 3 | ZZZ-77 | | 1 | N/A | XXX-33 | | OOD-06 |
| 4 | | | 1 | 4 | OOE-01 | | OOE-01 |
| 5 | | | 1 | 2 | OOC-02 | | OOF-03 |
| 6 | | | 2 | N/A | ZZZ-66 | | |
| 7 | | | 3 | N/A | ZZZ-77 | | |
Column E "exist in Old?": test if the new key (Column F) exists in the old list (Column H)
=MATCH(F3, $H$3:$H$2500, 0)
Column D "ID": to increment by one whenever a new item is found
=IF(ISNA(E3), 1, 0)+IF(ISNUMBER(D2), D2, 0)
The second term, with ISNUMBER, is only needed for the first row, where using D2 directly can cause an error.
Column A "index": just a plain series starting from 1 (until the length of new list Column F)
Column B "final value": to find the new key by matching column A to Column D.
=IF(A3>MAX($D$3:$D$2500), "", INDEX($F$3:$F$2500, MATCH(A3, $D$3:$D$2500, 0)))
This column B will be the list you want.
If it is still too slow, there are some dirty tricks to speed up the calculation, e.g. by using a sorted list with MATCH( , , 1) instead of MATCH( , , 0).
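The helper-column approach above can be traced step by step in a short Python sketch. It uses the keys from the example table; the variable names (exists_in_old, ids, final_values) are hypothetical stand-ins for columns E, D and B.

```python
# Keys from the example table: new list (column F) and old list (column H).
new_keys = ["OOD-06", "OOC-01", "XXX-33", "OOE-01", "OOC-02", "ZZZ-66", "ZZZ-77"]
old_keys = ["OOC-01", "OOC-02", "OOD-06", "OOE-01", "OOF-03"]

# Column E: 1-based position in the old list, or None where MATCH gives #N/A.
position = {}
for i, key in enumerate(old_keys, start=1):
    position.setdefault(key, i)  # keep the first occurrence, like MATCH(..., 0)
exists_in_old = [position.get(key) for key in new_keys]

# Column D: running count of keys that were not found in the old list.
ids = []
count = 0
for found in exists_in_old:
    if found is None:
        count += 1
    ids.append(count)

# Column B: for index k = 1, 2, ..., take the first row whose ID equals k.
final_values = [new_keys[ids.index(k)] for k in range(1, count + 1)]
print(final_values)
```

The intermediate lists reproduce columns E and D of the table, and final_values is the list of keys that appear only in the new data.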
