Apply a function to an array matching a criterion - Excel

Perhaps I'm just unable to formulate the question properly, but I couldn't find any matches for this: is there a way to return an array of all the cells matching a criterion?
Take the following example:
1 2
|---------------------|------------------|
1| A | B |
|---------------------|------------------|
2| 1 | 2 |
|---------------------|------------------|
3| 1 | 3 |
|---------------------|------------------|
4| 1 | 12 |
|---------------------|------------------|
5| 2 | 8 |
|---------------------|------------------|
Now in C2, I need to find the MAX value in the entire column B across all the rows that have the value 1 in column A.
This would be a relatively simple array filter in VBA, but I'm trying to achieve it using only Excel formulas.
AFAIK, functions like =INDEX() or =VLOOKUP() can only find a single closest (or exact) match. Is there a way to return an array of all the matching results?
I'd presume it would go something like:
=INDEX($A$2:$B$5; MATCH($A$2; $A$2:$A$5; 0); 1)
However, once again the issue is that this would stop at the first occurrence rather than go through the entire array.
The only thing I can think of is to exhaustively go over each and every number, return every occurrence as a separate value (in a matrix), and then work from those values, but that seems like way too much of a hassle.
Expected result:
1 2
|---------------------|------------------|------------------|
1| A | B | C |
|---------------------|------------------|------------------|
2| 1 | 2 | 12 |
|---------------------|------------------|------------------|
3| 1 | 3 | 12 |
|---------------------|------------------|------------------|
4| 1 | 12 | 12 |
|---------------------|------------------|------------------|
5| 2 | 8 | 8 |
|---------------------|------------------|------------------|

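SUMPRODUCT + MAX works for older Excel versions too. In C2, filled down:
=SUMPRODUCT(MAX(($A$2:$A$5=A2)*$B$2:$B$5))
Non-matching rows contribute 0 to the MAX, so this assumes the values in column B are not negative.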

Tested this in C2 (MAXIFS requires Excel 2019 or later, or Microsoft 365):
=MAXIFS(B:B,A:A,A2)
Filled down, it returns your desired result.
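For readers who want to check the logic outside the spreadsheet, here is a minimal Python sketch of what the two formulas above compute. The lists `col_a` and `col_b` and the helper names `max_if` and `sumproduct_max` are made up for illustration; they are not Excel functions.

```python
# Stand-ins for columns A and B of the example sheet (rows 2-5).
col_a = [1, 1, 1, 2]
col_b = [2, 3, 12, 8]

def max_if(keys, values, criterion):
    """Largest value whose key equals the criterion (the MAXIFS idea)."""
    matching = [v for k, v in zip(keys, values) if k == criterion]
    return max(matching) if matching else 0  # MAXIFS yields 0 when nothing matches

def sumproduct_max(keys, values, criterion):
    """Mirror of MAX((keys=criterion)*values): non-matching rows become 0."""
    return max((k == criterion) * v for k, v in zip(keys, values))

# Column C: for each row, the maximum of B over all rows sharing its A value.
col_c = [max_if(col_a, col_b, a) for a in col_a]
print(col_c)  # [12, 12, 12, 8]
```

The second helper shows why the multiplication trick only suits non-negative data: a non-matching row contributes 0, which beats any negative match.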

Related

Can PCA be used for row reduction?

I have a question about Principal Component Analysis (PCA). Suppose the variables run along the rows:
           | delhi | kolkata | up  | mp  | bihar | assam |
population | 1.2   | 2.2     | 1.3 | 1.4 | 2     | 1.1   |
crop       | a     | b       | c   | a   | b     | c     |
avg temp   | 1     | 2       | 3   | 4   | 5     | 6     |
soil ph    | 1     | 2       | 1   | 3   | 2     | 1     |
Can one run PCA on this layout to obtain the most important uncorrelated variables? The idea is to reduce the rows, not the columns.
If anyone could explain this concept to me it would be very helpful. My understanding is that variables exist only along columns, and there are many code examples in Python for column dimension reduction using PCA, but I am not sure whether row reduction is the same thing.
Thanks in advance.

How to loop through a set of rows, COUNTIF each row, then sum the total result?

As a simplified equivalent of what I intend, take a worksheet where each row holds a sequence of 5 numbers between 1 and 9 in columns A to E, over many rows:
| A| B| C| D| E|
1 | 1| 5| 6| 8| 9|
2 | 2| 5| 7| 8| 9|
...
50| 1| 3| 4| 6| 7|
I want to count, across all rows, in how many rows each combination of two numbers occurs, and fill a combination array with the results:
| 1| 2| 3| 4| 5| 6| 7| 8| 9|
1| | | | | | | | | |
2| | | | | | | | | |
3| | | | | | | | | |
4| | | | | | | | | |
5| | | | x| | | | | |
6| | | | | | | | | |
7| | | | | | | | | |
8| | | | | | | | | |
9| | | | | | | | | |
Above, "x" would represent the number of rows in which the combination of the numbers 4 and 5 occurs.
I achieved my goal easily with VBA code, but wanted to know how to do this with an Excel formula, since it will generally be faster.
In case anyone wants to check it, here is the VBA code that already works for this task:
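Sub NPairs()
    Dim Rn As Long, Cn As Long
    Dim NRow As Long, NCol As Long
    ' Plan2 is the result grid: row labels in column A, column labels in row 1.
    For NRow = 2 To 10
        For NCol = 2 To 10
            ' Skip the diagonal: it would search the combination of the same numbers.
            If NCol <> NRow Then
                Rn = Plan2.Cells(NRow, 1).Value2
                Cn = Plan2.Cells(1, NCol).Value2
                Plan2.Cells(NRow, NCol).Value2 = NMatch(Rn, Cn)
            End If
        Next NCol
    Next NRow
End Sub

Private Function NMatch(Rnumber As Long, Cnumber As Long) As Long
    Dim LastRow As Long, R As Long, C As Long, Cl As Long, M As Long
    ' Plan1 is the source sheet: five numbers per row in columns A:E, read from row 2 down.
    LastRow = Plan1.Cells(Plan1.Rows.Count, "A").End(xlUp).Row
    For R = 2 To LastRow
        For C = 1 To 5
            If Plan1.Cells(R, C).Value2 = Rnumber Then
                ' The row contains Rnumber: count each occurrence of Cnumber in the same row.
                For Cl = 1 To 5
                    If Plan1.Cells(R, Cl).Value2 = Cnumber Then M = M + 1
                Next Cl
            End If
        Next C
    Next R
    NMatch = M
End Function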
This could be sped up by using an array or a dictionary, I know. What I want to know is whether it is possible to do the same, in a simpler way, with an Excel formula.

If your concern is speed, then VBA will probably be faster in this case. But here is an idea to do it with formulas only:
Create an intermediate matrix with as many rows as the source matrix and one column for each number (1 .. 9). Use a formula to count how often the number identified by the column occurs in the corresponding row.
Based on this intermediate matrix, count the rows that have a nonzero entry for both numbers of interest.
You can then hide the intermediate matrix if so desired.
Here is how it would look:
The middle matrix is the intermediate one. The formula in G2 is:
=COUNTIF($A2:$E2, G$1)
You can copy it to the other cells of that matrix
The rightmost matrix is the final result. The formula in R2 is:
=IF(R$1=$Q2, COUNTIFS(INDEX($G$2:$O$9, 0, R$1),">1"),
COUNTIFS(INDEX($G$2:$O$9, 0, R$1),">0", INDEX($G$2:$O$9, 0, $Q2), ">0"))
The INDEX function is used to retrieve the appropriate column of the intermediate matrix. One column is chosen based on the current row of the final matrix and the other based on its current column. Both must have a count greater than 0 in the same row for that row to be counted.
After your comment, I wrapped the formula in an IF to deal with the main diagonal: there, the single number must occur more than once in a row for that row to be counted.
You can download the above sheet from Google docs
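The intermediate-matrix idea can also be sketched in Python to check the counting logic. This is only an illustration under assumptions: the three `rows` are made-up sample data, and `pair_count` is a hypothetical helper, not part of the workbook.

```python
# Made-up sample rows standing in for columns A:E of the source sheet.
rows = [
    [1, 5, 6, 8, 9],
    [2, 5, 7, 8, 9],
    [1, 3, 4, 6, 7],
]

# Intermediate matrix: one row per source row, one column per number 1..9,
# holding how often that number occurs in the row (like COUNTIF($A2:$E2, G$1)).
counts = [[row.count(n) for n in range(1, 10)] for row in rows]

def pair_count(a, b):
    """Number of source rows that contain both a and b."""
    if a == b:
        # Main diagonal: the number must occur more than once in the row.
        return sum(1 for c in counts if c[a - 1] > 1)
    return sum(1 for c in counts if c[a - 1] > 0 and c[b - 1] > 0)

# Final 9x9 matrix, indexed as result[a - 1][b - 1].
result = [[pair_count(a, b) for b in range(1, 10)] for a in range(1, 10)]
print(pair_count(5, 8))  # 2
print(pair_count(4, 5))  # 0
```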
=SUM(IF(ISNUMBER(SEARCH("*"&J$1&"*"&$I2&"*",$A$1:$A$50&$B$1:$B$50&$C$1:$C$50&$D$1:$D$50&$E$1:$E$50)),1,IF(ISNUMBER(SEARCH("*"&$I2&"*"&J$1&"*",$A$1:$A$50&$B$1:$B$50&$C$1:$C$50&$D$1:$D$50&$E$1:$E$50)),1,0)))
This is an array formula: while still in the formula bar, confirm it with Ctrl + Shift + Enter.
Using wildcards with SEARCH() we can look for the numbers within the built strings, then reverse the search order to catch both orderings. I build a binary array based on the results and SUM() them.
* matches any number of any characters (including zero characters). Using this we can establish whether the 2 numbers appear anywhere in the 5 positions; the pattern is then flipped to catch them in the other order.
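The same wildcard idea can be tried in Python with `fnmatch`, whose `*` also matches any run of characters. This is a rough sketch with made-up sample rows and a hypothetical helper `pair_count_wildcard`. Like the formula, it relies on every value being a single digit, since concatenating multi-digit numbers would make the string ambiguous.

```python
from fnmatch import fnmatchcase

# Made-up sample rows standing in for columns A:E of the source sheet.
rows = [
    [1, 5, 6, 8, 9],
    [2, 5, 7, 8, 9],
    [1, 3, 4, 6, 7],
]

def pair_count_wildcard(a, b):
    """Count rows whose concatenated digits match *a*b* or *b*a*."""
    total = 0
    for row in rows:
        text = "".join(str(v) for v in row)  # like $A1&$B1&$C1&$D1&$E1
        if fnmatchcase(text, f"*{a}*{b}*") or fnmatchcase(text, f"*{b}*{a}*"):
            total += 1  # each row counts at most once, whatever the order
    return total

print(pair_count_wildcard(5, 8))  # 2
print(pair_count_wildcard(7, 1))  # 1
```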
Using a similar approach to @trincot, with an intermediate table, but here the table has ten columns holding the ten pairs of digits from each row of the source table:
Then use COUNTIF() to count the occurrences of the pairs in a separate table:
Using named ranges would make the formulas even simpler.
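A row of five distinct numbers yields exactly ten unordered pairs, which is where the ten helper columns come from. A small Python sketch of that pairing step follows. It assumes made-up sample rows and stores each pair with the smaller number first; the exact layout of the helper table in the answer is not recoverable from the text.

```python
from collections import Counter
from itertools import combinations

# Made-up sample rows standing in for columns A:E of the source sheet.
rows = [
    [1, 5, 6, 8, 9],
    [2, 5, 7, 8, 9],
    [1, 3, 4, 6, 7],
]

# Tally every unordered pair once per row (the role of the helper columns).
pair_totals = Counter()
for row in rows:
    for pair in combinations(sorted(set(row)), 2):
        pair_totals[pair] += 1

def pair_count(a, b):
    """Look up a pair regardless of argument order (the COUNTIF step)."""
    return pair_totals[tuple(sorted((a, b)))]

print(len(list(combinations(rows[0], 2))))  # 10
print(pair_count(8, 5))  # 2
```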

EXCEL: Return a row value based on the row with highest max value

I've seen some similar questions, but none quite fit my case.
I'm wondering if I can return a cell from a row based on the max value in a different cell of the same row.
So I have this:
| A | B | Date
1| X | 2 | 01/01/17
2| Y | 3 | 17/01/17
3| Z | 4 | 18/01/17
4| X | 2 | 21/01/17
5| Y | 3 | 03/02/17
6| Z | 4 | 03/02/17
7| Z | 4 | 07/03/17
8| Z | 4 | 09/03/17
9| Y | 3 | 13/03/17
So Column A displays a string, and Column B counts how many times that Column A string is repeated. I have another sheet with a row for each month (01, 02, 03, 04, etc.). I am trying to get the string from Column A with the highest value in Column B, grouped by month. For the above example, the next sheet would look like this:
| A | B
1| X | 2
2| Draw | 1
3| Z | 2
I have been able to achieve the date grouping aspect for similar functions using;
IFS(E:E,D:D,">=" & DATE(A$2,B6,1),D:D,"<=" & DATE(A$2,B6,EOMONTH(B6,0)))
If anyone has any ideas on how I could achieve this, it would be much appreciated!
Edit;
I've managed to figure parts of it out, I have been able to get the most common name (without checking for multiples) using
=OFFSET(A1,MATCH(MAX(Count),Count,0),0)
Now I just need a way to merge that formula with this one;
=IF(AND(Dates >= DATE(2017,9,1), Dates <= DATE(2017,9,EOMONTH(9,0))),)
How do I pass the results of the =IF to the =OFFSET?

Matching a row where two cols have multiple, repetitive values

I'm trying to match two cells in an area that has two columns, each with multiple repetitive values, and simply return something that indicates there is a match row.
I'm doing this in LibreOffice Calc, but I'd like to be able to share it in an Excel spreadsheet if possible.
My spreadsheet search range looks like this:
| A | B | C | D |
1| 1782.87|Eva_Estelle | 496.15|J.B. (LBarneck) |
2| 1782.87|Eva_Estelle | 214.74|Jessica Laity |
3| 1782.87|Eva_Estelle | 57.50|arndtfamily1 |
4| 905.28|A.N. (robertn) | 615.29|rochellemallory2005 |
5| 905.28|A.N. (robertn) | 367.37|Shenazar James Gill |
6| 905.28|A.N. (robertn) | 366.90|pfitzgerald6 |
7| 615.29|rochellemallory2005 | 905.28|A.N. (robertn) |
8| 615.29|rochellemallory2005 | 367.37|Shenazar James Gill |
9| 615.29|rochellemallory2005 | 366.90|pfitzgerald6 |
10| 615.29|rochellemallory2005 | 281.19|John Gill |
11| 615.29|rochellemallory2005 | 242.96|ANGEL Ballamy |
My result/query area looks (should look) like this:
| A | B | C | D |
1| |Eva_Estelle |A.N. (robertn) |rochellemallory2005 |
2|Eva_Estelle | | | |
3|A.N. (robertn) | | | Y |
4|rochellemallory2005 | | Y | |
Where "Y" (or something) indicates that there is a row in the B column of the search area that matches query area $A2(A2,A3,A4,..), and where the same row in col D matches query area B$1(B1,C1,D1,..), etc.
The problem is that both cols B and D in the search area contain repetitive data and the search area rows are sorted by the values in cols A then C, descending. Meaning I can't use Lookup functions(?).
Is it possible to do this with a formula in the query area cells, or if not can someone who understands OO or LibreOffice Calc help me with the code I need to create a user defined formula using their version of macro "basic" (so I can hopefully follow what it's doing)? I'll also try to get it if you use BeanShell, JavaScript, or Python, but I'm most familiar with VBasic.
Insert a header row of labels (I used A>D), select Columns A:D, Insert > Pivot Table..., OK, drag B to Row Fields:, D to Column Fields:, and D to Data Fields:. Change Sum - D to Count, OK, OK.
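What the pivot table computes is a count of rows for each (column B, column D) combination. A minimal Python sketch of that cross-tabulation, using four rows excerpted from the search range above (`has_match` is a hypothetical helper):

```python
from collections import Counter

# Excerpt of the search range: (column B, column D) for rows 1, 4, 7 and 10.
pairs = [
    ("Eva_Estelle", "J.B. (LBarneck)"),
    ("A.N. (robertn)", "rochellemallory2005"),
    ("rochellemallory2005", "A.N. (robertn)"),
    ("rochellemallory2005", "John Gill"),
]

# Count of rows per (B, D) combination, like the pivot's "Count - D" field.
counts = Counter(pairs)

def has_match(row_name, col_name):
    """'Y' if some row has row_name in column B and col_name in column D."""
    return "Y" if counts[(row_name, col_name)] > 0 else ""

names = ["Eva_Estelle", "A.N. (robertn)", "rochellemallory2005"]
grid = [[has_match(r, c) for c in names] for r in names]
for name, line in zip(names, grid):
    print(name, line)
```

Because the lookup is by exact pair, repeated values and sort order in either column do not matter.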

If vlookup is between 2 dates, return earlier one

I have a list of dates on which people submit that they want to sell something, and I have a selling window every 2 weeks in which those sellers can actually sell.
I want to match each date in the list against the list of selling windows (the selling windows were on 13.07. and 27.07.). However, a simple VLOOKUP (using TRUE) returns something like this:
seller
submitting | selling window (using vlookup from the seller window list)
13.07.2016 | the corresponding selling window should be 13.07. here
14.07.2016 | but 27.07. from here.
14.07.2016
14.07.2016
14.07.2016
18.07.2016
18.07.2016
20.07.2016
20.07.2016
20.07.2016
21.07.2016
21.07.2016
22.07.2016
25.07.2016 | However, vlookup returns 13.07. until here and
27.07.2016 | 27.07. as selling window only from this date onwards.
28.07.2016
28.07.2016
Does anyone know how I can fix this?
This was my idea.
If the exact match succeeds (e.g. for 13/7/16), take the result from the matching row of column B.
If the exact match fails (e.g. for 14/7/16), do an inexact match and take the result from the next row of column B.
=INDEX($B$2:$B$5,IFERROR(MATCH(A2,$B$2:$B$5,0),MATCH(A2,$B$2:$B$5,1)+1))
Just for completeness, here is a VLOOKUP formula
=VLOOKUP(A2,$B$2:$B$5,1,TRUE)+14*(A2>VLOOKUP(A2,$B$2:$B$5,1,TRUE))
and another formula using MOD
=IF(MOD(A2-$B$2,14),A2+14-MOD(A2-$B$2,14),A2)
but the last two assume a constant difference of 14 days between sell dates. The first formula is more flexible because it can allow for public holidays etc. if the sell dates are available as a list as stated in the question.
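Both ideas, the list-based lookup and the fixed 14-day arithmetic, can be sketched in Python for comparison. The third window (10 August) and the function names are assumptions added for illustration; only 13.07. and 27.07. come from the question.

```python
from bisect import bisect_left
from datetime import date, timedelta

# Selling windows in ascending order; the 10 August entry is assumed.
windows = [date(2016, 7, 13), date(2016, 7, 27), date(2016, 8, 10)]

def next_window(submitted):
    """First listed window on or after the submission date (list-based lookup)."""
    i = bisect_left(windows, submitted)
    return windows[i] if i < len(windows) else None  # None: no later window listed

def next_window_mod(submitted, first=windows[0], period=14):
    """Same result by arithmetic, assuming a fixed 14-day cycle (the MOD idea)."""
    offset = (submitted - first).days % period
    return submitted if offset == 0 else submitted + timedelta(days=period - offset)

print(next_window(date(2016, 7, 13)))      # 2016-07-13
print(next_window(date(2016, 7, 14)))      # 2016-07-27
print(next_window_mod(date(2016, 7, 25)))  # 2016-07-27
```

The list-based version keeps working if a window is moved, for instance because of a public holiday, whereas the arithmetic version does not.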
You can use the MATCH function in conjunction with the INDEX function to look up the values. This lets you benefit from the match_type parameter of the MATCH function.
Here's some information about the match_type parameter:
Match type information
If match_type is 1 or omitted, MATCH finds the largest value that is less than or equal to lookup_value. The values in the lookup_array argument must be placed in ascending order, for example: ...-2, -1, 0, 1, 2, ..., A-Z, FALSE, TRUE.
If match_type is 0, MATCH finds the first value that is exactly equal to lookup_value. The values in the lookup_array argument can be in any order.
If match_type is -1, MATCH finds the smallest value that is greater than or equal to lookup_value. The values in the lookup_array argument must be placed in descending order, for example: TRUE, FALSE, Z-A, ...2, 1, 0, -1, -2, ..., and so on.
(Source: https://support.office.com/en-gb/article/MATCH-function-e8dffd45-c762-47d6-bf89-533f4a37673a)
This means you can utilise the -1 match_type so long as your lookup_array (selling windows) are placed in descending order!
The formula would look something like this:
=INDEX($C$2:$C$3,MATCH(A2,$C$2:$C$3,-1))
Where your selling windows are in C2:C3, your submitting dates are in column A and the formula is in column B, e.g:
| A | B | C |
|------------+------------+-----------------|
1| Submitting | Lookup | Selling Windows |
|------------+------------+-----------------|
2| 13/07/2016 | 13/07/2016 | 27/07/2016 |
3| 14/07/2016 | 27/07/2016 | 13/07/2016 |
4| 15/07/2016 | 27/07/2016 | |
5| 16/07/2016 | 27/07/2016 | |
6| 17/07/2016 | 27/07/2016 | |
7| 18/07/2016 | 27/07/2016 | |
8| 19/07/2016 | 27/07/2016 | |
9| 20/07/2016 | 27/07/2016 | |
10| 21/07/2016 | 27/07/2016 | |
11| 22/07/2016 | 27/07/2016 | |
12| 23/07/2016 | 27/07/2016 | |
13| 24/07/2016 | 27/07/2016 | |
14| 25/07/2016 | 27/07/2016 | |
15| 26/07/2016 | 27/07/2016 | |
16| 27/07/2016 | 27/07/2016 | |
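The match_type rules quoted above can be emulated with a short Python function to see why the descending order matters. This is a linear-scan sketch of the documented behaviour under the stated ordering assumptions, not a description of how Excel searches internally; `excel_match` is a made-up name.

```python
from datetime import date

def excel_match(lookup, values, match_type=1):
    """1-based position following the documented MATCH rules, or None for #N/A."""
    if match_type == 0:     # any order: first exact match
        hits = [i for i, v in enumerate(values, 1) if v == lookup]
    elif match_type == 1:   # values ascending: largest value <= lookup
        hits = [i for i, v in enumerate(values, 1) if v <= lookup][-1:]
    elif match_type == -1:  # values descending: smallest value >= lookup
        hits = [i for i, v in enumerate(values, 1) if v >= lookup][-1:]
    else:
        raise ValueError("match_type must be 1, 0 or -1")
    return hits[0] if hits else None

# Selling windows in descending order, as in C2:C3 of the table above.
windows_desc = [date(2016, 7, 27), date(2016, 7, 13)]

def lookup_window(submitted):
    """Like INDEX(windows, MATCH(submitted, windows, -1))."""
    pos = excel_match(submitted, windows_desc, -1)
    return windows_desc[pos - 1] if pos else None

print(lookup_window(date(2016, 7, 13)))  # 2016-07-13
print(lookup_window(date(2016, 7, 14)))  # 2016-07-27
```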
Assuming your dates are in date format: in your VLOOKUP, subtract 1 from the date. The VLOOKUP result will then be one selling window (2 weeks) early.
To correct this you can add 14 to the VLOOKUP result.
=VLOOKUP(D4-1,$F$3:$F$6,1)+14
Here your list dates are in column D and your sale dates are in column F.
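A quick way to convince yourself of the shift-by-one trick is to replay it in Python. The sale dates of 29 June and 10 August are assumed here: the list needs at least one window before the earliest submission date, otherwise the approximate lookup finds nothing. `window_by_shift` is a hypothetical helper name.

```python
from bisect import bisect_right
from datetime import date, timedelta

# Sale dates in ascending order; 29 June and 10 August are assumed entries.
sale_dates = [date(2016, 6, 29), date(2016, 7, 13),
              date(2016, 7, 27), date(2016, 8, 10)]

def window_by_shift(submitted, period=14):
    """Approximate lookup of (submitted - 1 day), then add one 14-day period."""
    shifted = submitted - timedelta(days=1)
    i = bisect_right(sale_dates, shifted) - 1  # largest sale date <= shifted
    if i < 0:
        return None  # nothing early enough in the list (Excel would show #N/A)
    return sale_dates[i] + timedelta(days=period)

print(window_by_shift(date(2016, 7, 13)))  # 2016-07-13
print(window_by_shift(date(2016, 7, 14)))  # 2016-07-27
print(window_by_shift(date(2016, 7, 27)))  # 2016-07-27
```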
