Aggregating records with two main IDs in [VBA macro] - excel

I want to make a macro in Excel that summarizes data from rows that match a composite ID generated from 2 ID columns. In my excel sheet, each row has 2 main ID columns: ID_1 is the main key, and ID_2 is a secondary key from which I only care about the first 2 letters (Which I have gotten using LEFT). I want to group rows with the same ID_1 and first 2 letters of ID_2 and report the SUM of the value, count, and sum columns.
In the example picture below, I want to turn the data in columns A:J into the data in columns M:V
So, with this example -> We have 6 records 1015 (ID_1) with 3 different ID_2 (AB, AZ, AE). I want to sum them up to a one cell each (1015 AB ; 1015 AZ ;1015 AE) with values which each record had (there is 3 records: 1015 AB with VALUE of 2,3,4 so in result I want to get just one row 1015 AB 9(sum of value) 4(sum of count) 17 (sum of(value * count)). It's important to see that this 17 dosn't come from 9 * 4. It's =sum(I4:I6) (but it may be spread out like in 1200 FF example below! I am still trying to sort them both at one time, but I cant get past it..)

Add a helper column in D to combine the ID_1 and the first 2 characters of ID_2. =A4 & LEFT(C4,2). Copy that down then go to L4 and type in:
=+INDEX($D$4:$D$25,MATCH(0,COUNTIF(L$3:L3,$D$4:$D$25),0)
and hold down Ctrl + Shift + Enter to make it an array function. Copy down to get a list of unique combinations, and then split these values into the separate columns.
Finally to pull in the numbers, put this in Q4:
=SUMIFS(E$4:E$25,$A$4:$A$25,M4,$C$4:$C$25,O4 & "*")
and then copy down and across.

Related

Sum of the greatest value in one column, plus the sum of the other values in another column

Consider the following sheet/table:
A B
1 90 71
2 40 25
3 60 16
4 110 13
5 87 82
I want to have a general formula in cell C1 that sums the greatest value in column A (which is 110), plus the sum of the other values in column B (which are 71, 25, 16 and 82). I would appreciate if the formula wasn't an array formula (as in requiring Ctrl + Shift + Enter). I don’t have Office 365, I have Excel 2019.
My attempt
Getting the greatest value in column A is easy, we use MAX(A1:A5).
So the formula I want in cell C1 should be something like:
=MAX(A1:A5) + SUM(array_of_values_to_be_summed)
Obtaining the values of the other rows in column B (what I called array_of_values_to_be_summed in the previous formula) is the hard part. I've read about using INDEX, MATCH, their combination, and obtaining arrays by using parenthesis and equal signs, and I've tried that, without success so far.
For example, I noticed that NOT((A1:A5 = MAX(A1:A5))) yields an array/list containing ones (or TRUEs) for the relative position of the rows to be summed, and containing a zero (or FALSE) for the relative position of the row to be omitted. Maybe this is useful, I couldn't find how.
Any ideas? Thanks.
Edit 1 (solution)
I managed to obtain what I wanted. I simply multiplied the array obtained with the NOT formula, by the range B1:B5. The final formula is:
=MAX(A1:A5) + SUM(NOT((A1:A5 = MAX(A1:A5))) * B1:B5)
Edit 2 (duplicate values)
I forgot to explain what the formula should do if there are duplicates in column A. In that case, the first term of my final formula (the term that has the MAX function) would be the one whose corresponding value in column B is smallest, and the value in column B of the other duplicates would be used in the second term (the one containing the SUM function).
For example, consider the following sheet/table:
A B
1 90 71
2 110 25
3 60 16
4 110 13
5 110 82
Based on the above table, the formula should yield 110 + (71 + 25 + 16 + 82) = 304.
Just to give context, the reason I want such a formula is because I’m writing a spreadsheet that automatically calculates the electric current rating of the short-circuit protective device of the feeder of a group of electric motors in a house or building or mall, as required by the article 430.62(A) of the US National Electrical Code. Column A is the current rating of the short-circuit protective device of the branch-circuit of each motors, and column B is the full-load current of each motor.
You can use this formula
=MAX(A1:A5)
+SUM(B1:B5)
-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)
Based on #Anupam Chand's hint for max-value-duplicates there could also be min-value-duplicates in column B for corresponding max-value-duplicates in column A. :) This formula would account for that
=SUM(B1:B5)
+(MAX(A1:A5)-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1))
*SUMPRODUCT((A1:A5=MAX(A1:A5))*(B1:B5=AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)))
Or with #Anupam Chand's shorter and better readable and overall better style :)
=SUM(B1:B5)
+(MAX(A1:A5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
*COUNTIFS(A1:A5,MAX(A1:A5),B1:B5,MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
The explanation works for bot solutions:
The SUM-part just sums the whole list.
The second line gets the max-value for column A and the corresponding min-value of column B for the max-values in column A and adds or subtracts it respectively.
The third line counts, how many times the corresponding min-value for the max-value occurs and multiplies it with the second line.
Can you try this ?
=MAX(A1:A5)+SUM(B1:B5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5))
What we're doing is adding the max of A to all rows of B and then subtracting the min value of B where A is the max.
If you have Excel 365 you can use the following LET-Formula
=LET(A,A1:A5,
B,B1:B5,
MaxA,MAX(A),
MinBExclude, MINIFS(B,A,MaxA),
sumB1,SUMPRODUCT(B*(A=MaxA)*(B<>MinBExclude)),
sumB2,SUMPRODUCT(B*(A<>MaxA)),
MaxA +sumB1+sumB2
A and B are shortcuts for the two ranges
MaxA returns the max value for A (110)
MinBExclude filters the values of column B by the MaxA-value (25, 13, 82) and returns the min-value of the filtered result (13)
sumB1 returns the sum of the other MaxA values from column B (26 + 82)
sumB2 returns the sum of the values from B where value in A <> MaxA (71 + 60)
and finally the result is returned
If you don't have Excel 365 you can add helper columns for MaxA, MinBExclude, sumB1 and sumB2 and the final result

In Excel how do subtract values from list table when I preform a transaction?

If I have two tables where I have list of things in 1 table and the other table serves as a transaction table. But every time I do a transaction the value of units in the transaction should be subtracted from the lists table. Can anyone please help?
Let's say table 1 looks like this:
A B C
1 Item Start Value Current Value
2 ---- ----------- -------------
3 1 20
4 2 100
5 3 95
and table 2 has transactions recorded as a list of Item numbers in column E and associated value movements in column F, then the formula in C3 should be:
=B3-SUMIF(E:E,A3,F:F)
This formula can then be copied down for the other entries in table 1

I want to get the sum of count of each unique entry from a particular column from google sheet

I have a google spread sheet,I want to get the sum of count of each unique entry from a particular column('Text' in this example).However the entries in the column themselves repeat.
Eg:
Text Count
a 3
b 4
a 8
abd 4
c 1
t 2
abd 5
a 2
v 1
v 67
w 44
I want the output as:
Text Count
a 13
b 4
abd 9
c 1
t 2
v 68
w 44
Under the assumption that you want to get the results in the text column automatically, use a Pivot table in Excel:
Mark your data including the column captions "Text" and "Count"
Go to the "Insert" tab in the ribbon, hit the arrow below and choose "PivotChart and PivotTable"
The cells for the raw data should already be entered. In the lower part of the window choose where you want to get the Pivot table. Then hit Ok.
There should be an area on the right-hand side of the window where you can choose which data you want to evaluate. Choose Text and Count.
There should already be the sum of the Count values. If you want to get a different quantity such as the average, hit "Sum of Count" with the right mouse button and
You need a GROUP BY and a SUM:
SELECT text, sum(count) as count
FROM yourtable
GROUP BY text

How to get the latest date with same ID in Excel

I want to Get the Record with the most recent date as same ID's have different dates. Need to pick the BOLD values. Below is the sample data, As original data consist of 10000 records.
ID Date
5 25/02/2014
5 7/02/2014
5 6/12/2013
5 25/11/2013
5 4/11/2013
3 5/05/2013
3 19/02/2013
3 12/11/2012
1 7/03/2013
2 24/09/2012
2 7/09/2012
4 6/12/2013
4 19/04/2013
4 31/03/2013
4 26/08/2012
What I would do is in column B use this formula and fill down
=LEFT(A1,1)
in column C
=DATEVALUE(MID(A1,2,99))
then filter column B to a specific value of interest and sort by column C to order these values by date.
Edit: Even easier do a two level sort by B then by C newest to oldest. The first B in the list is newest.
Do you need a programmatic / formula only solution or can you use a workflow? If a workflow will work, then how about this:
Construct a pivot table of your data
Make the Rows Labels the ID
Make the Values Max of Date
The resulting table is your answer.
Row Labels Max of Date
1 07/03/13
2 24/09/12
3 05/05/13
4 06/12/13
5 25/02/14

Find the top n values in a range while keeping the sum of values in another range under x value

I'd like to accomplish the following task. There are three columns of data. Column A represents price, where the sum needs to be kept under $100,000. Column B represents a value. Column C represents a name tied to columns A & B.
Out of >100 rows of data, I need to find the highest 8 values in column B while keeping the sum of the prices in column A under $100,000. And then return the 8 names from column C.
Can this be accomplished?
EDIT:
I attempted the Solver solution w/ no luck. 200 rows looks to be the max w/ Solver, and that is what I'm using now. Here are the steps I've taken:
Create a column called rank RANK(B2,$B$2:$B$200) (used column D -- what is the purpose of this?)
Create a column called flag just put in zeroes (used column E)
Create 3 total cells total_price (=SUM(A2:A200)), total_value (=SUM(B2:B200)) and total_flag (=(E2:E200))
Use solver to minimize total_value (shouldn't this be maximize??)
Add constraints -Total_price<=100000 -Total_flag=8 -Flag cells are binary
Using Simplex LP, it simply changes the flags for the first 8 values. However, the total price for the first 8 values is >$100,000 ($140k). I've tried changing some options in the Solver Parameters as well as using different solving methods to no avail. I'd like to post an image of the parameter settings, but don't have enough "reputation".
EDIT #2:
The first 5 rows looks like this, price goes down to ~$6k at the bottom of the table.
Price Value Name Rank Flag
$22,538 42.81905675 Blow, Joe 1 0
$22,427 37.36240932 Doe, Jane 2 0
$17,158 34.12127693 Hall, Cliff 3 0
$16,625 33.97654031 Povich, John 4 0
$15,631 33.58212402 Cow, Holy 5 0
I'll give you the solver solution as a starting point. It involves the creation of some extra columns and total cells. Note solver is limited in the amount of cells it can handle but will work with 100 anyway.
Create a column called rank RANK(B2,$B$2:$B$100)
Create a column called flag just put in zeroes
Create 3 total cells total_price, total_value and total_flag
Use solver to minimize total_value
Add constraints
-Total_price<=100000
-Total_flag=8
-Flag cells are binary
This will flag the rows you want and you can grab the names however you want.

Resources