Counting the total number of duplicate specific values in 2 columns

I am using Office 2007 and have this formula:
=SUMPRODUCT(SUBTOTAL(3,OFFSET(K5:K254,ROW(K5:K254)-ROW(K5),0,1)),--(K5:K254="24""")) + SUMPRODUCT(SUBTOTAL(3,OFFSET(O5:O254,ROW(O5:O254)-ROW(N5),0,1)),--(O5:O254="24"""))
and a corresponding one for each of the 17", 19", 22" and 23" monitor special row boxes that I need an accurate count of.
My problem is that, for some reason, the above formula only counts the monitors in K:K and does not do the same for N:N.
I tried
=COUNTIF(K:K,"24""")+COUNTIF(N:N,"24""")-COUNTIFS(N:N,"24""",O:O,"Personal")
but it triggers the circular reference warning: at first I do get the correct number of monitors, but after the warning flashes the value is 0.
My goal is to have a formula that counts, across 2 separate columns (K and N), the exact number of company monitors minus the personal ones when I apply a filter for F-S monitors.
My data has 254 names of users with other details, and the monitor records are listed as below:
K column has Monitor1: 17", 19", 22", 23" and 24"
L column has HP, Lenovo, F-S, n/a
N column has Monitor2: 17", 19", 22", 23" and 24"
O column has HP, Lenovo, F-S, n/a, Personal
Your help is much appreciated.

This is the same formula adjusted to apply to N rather than K:
=SUMPRODUCT(SUBTOTAL(3,OFFSET(N5:N254,ROW(N5:N254)-ROW(N5),0,1)),--(N5:N254="24""")) + SUMPRODUCT(SUBTOTAL(3,OFFSET(R5:R254,ROW(R5:R254)-ROW(Q5),0,1)),--(R5:R254="24"""))
Simply adding the two together might be what you want.
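To make the intended count concrete outside Excel, here is a minimal Python sketch of the logic the question describes: count a given size in Monitor1 (K) and Monitor2 (N) on rows that remain visible after filtering, and skip second monitors whose brand (O) is Personal. The Row class, its field names and the sample rows are hypothetical. The question does not say whether a first monitor can be personal, so only column O is checked.

```python
from dataclasses import dataclass


@dataclass
class Row:
    visible: bool   # False when a filter hides the row (what SUBTOTAL(3, ...) detects)
    monitor1: str   # column K
    brand1: str     # column L
    monitor2: str   # column N
    brand2: str     # column O


def count_company_monitors(rows, size):
    """Count monitors of the given size on visible rows, excluding personal second monitors."""
    total = 0
    for r in rows:
        if not r.visible:
            continue
        if r.monitor1 == size:
            total += 1
        if r.monitor2 == size and r.brand2 != "Personal":
            total += 1
    return total


# Hypothetical sample data, not taken from the question.
rows = [
    Row(True, '24"', "F-S", '24"', "Personal"),
    Row(True, '24"', "F-S", '24"', "F-S"),
    Row(False, '24"', "HP", '19"', "HP"),
    Row(True, '19"', "Lenovo", '24"', "Lenovo"),
]
print(count_company_monitors(rows, '24"'))
```

A useful way to debug the worksheet formula is to compare its result against a hand count like this one on a few filtered rows.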

Related

Sum of the greatest value in one column, plus the sum of the other values in another column

Consider the following sheet/table:
A B
1 90 71
2 40 25
3 60 16
4 110 13
5 87 82
I want to have a general formula in cell C1 that sums the greatest value in column A (which is 110), plus the sum of the other values in column B (which are 71, 25, 16 and 82). I would appreciate it if the formula wasn't an array formula (as in requiring Ctrl + Shift + Enter). I don’t have Office 365, I have Excel 2019.
My attempt
Getting the greatest value in column A is easy, we use MAX(A1:A5).
So the formula I want in cell C1 should be something like:
=MAX(A1:A5) + SUM(array_of_values_to_be_summed)
Obtaining the values of the other rows in column B (what I called array_of_values_to_be_summed in the previous formula) is the hard part. I've read about using INDEX, MATCH, their combination, and obtaining arrays by using parentheses and equal signs, and I've tried that, without success so far.
For example, I noticed that NOT((A1:A5 = MAX(A1:A5))) yields an array/list containing ones (or TRUEs) for the relative position of the rows to be summed, and containing a zero (or FALSE) for the relative position of the row to be omitted. Maybe this is useful, I couldn't find how.
Any ideas? Thanks.
Edit 1 (solution)
I managed to obtain what I wanted. I simply multiplied the array obtained with the NOT formula, by the range B1:B5. The final formula is:
=MAX(A1:A5) + SUM(NOT((A1:A5 = MAX(A1:A5))) * B1:B5)
Edit 2 (duplicate values)
I forgot to explain what the formula should do if there are duplicates in column A. In that case, the first term of my final formula (the term that has the MAX function) would be the one whose corresponding value in column B is smallest, and the value in column B of the other duplicates would be used in the second term (the one containing the SUM function).
For example, consider the following sheet/table:
A B
1 90 71
2 110 25
3 60 16
4 110 13
5 110 82
Based on the above table, the formula should yield 110 + (71 + 25 + 16 + 82) = 304.
Just to give context, the reason I want such a formula is that I’m writing a spreadsheet that automatically calculates the electric current rating of the short-circuit protective device of the feeder of a group of electric motors in a house, building or mall, as required by Article 430.62(A) of the US National Electrical Code. Column A is the current rating of the short-circuit protective device of the branch circuit of each motor, and column B is the full-load current of each motor.

You can use this formula
=MAX(A1:A5)
+SUM(B1:B5)
-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)
Based on @Anupam Chand's hint about duplicate max values: there could also be duplicate min values in column B among the rows that share the max value in column A. :) This formula accounts for that:
=SUM(B1:B5)
+(MAX(A1:A5)-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1))
*SUMPRODUCT((A1:A5=MAX(A1:A5))*(B1:B5=AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)))
Or, in @Anupam Chand's shorter, more readable and overall better style :)
=SUM(B1:B5)
+(MAX(A1:A5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
*COUNTIFS(A1:A5,MAX(A1:A5),B1:B5,MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
The explanation works for both solutions:
The SUM part just sums the whole list.
The second line gets the max value of column A and the min value of column B among the rows holding that max value, adding the former and subtracting the latter.
The third line counts how many times that min value occurs among the max-value rows and multiplies the count by the second line.

Can you try this?
=MAX(A1:A5)+SUM(B1:B5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5))
What we're doing is adding the max of A to all rows of B and then subtracting the min value of B where A is the max.
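The arithmetic behind this formula can be checked outside Excel. Below is a minimal Python sketch of the same three terms, run against both tables from the question. The function name max_plus_others is made up for illustration.

```python
def max_plus_others(a, b):
    """Mirror of =MAX(A)+SUM(B)-MINIFS(B,A,MAX(A))."""
    max_a = max(a)
    # Smallest B among the rows where A equals its maximum (the MINIFS term).
    min_b_at_max = min(y for x, y in zip(a, b) if x == max_a)
    return max_a + sum(b) - min_b_at_max


b = [71, 25, 16, 13, 82]
print(max_plus_others([90, 40, 60, 110, 87], b))     # first table
print(max_plus_others([90, 110, 60, 110, 110], b))   # table with duplicates in A
```

Both calls give 304, which matches the 110 + (71 + 25 + 16 + 82) worked out in the question.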

If you have Excel 365 you can use the following LET-Formula
=LET(A,A1:A5,
B,B1:B5,
MaxA,MAX(A),
MinBExclude, MINIFS(B,A,MaxA),
sumB1,SUMPRODUCT(B*(A=MaxA)*(B<>MinBExclude)),
sumB2,SUMPRODUCT(B*(A<>MaxA)),
MaxA+sumB1+sumB2)
A and B are shortcuts for the two ranges
MaxA returns the max value for A (110)
MinBExclude filters the values of column B by the MaxA-value (25, 13, 82) and returns the min-value of the filtered result (13)
sumB1 returns the sum of the other MaxA values from column B (25 + 82)
sumB2 returns the sum of the values from B where the value in A <> MaxA (71 + 16)
and finally the result is returned
If you don't have Excel 365 you can add helper columns for MaxA, MinBExclude, sumB1 and sumB2 and the final result
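To see the intermediate values, here is a small Python sketch that mirrors the LET names one by one, using the duplicate-values table from the question. The function name let_breakdown and the dictionary layout are assumptions for illustration only.

```python
def let_breakdown(a, b):
    """Compute each named step of the LET formula and return them all."""
    max_a = max(a)
    # MINIFS(B, A, MaxA): smallest B among rows where A is the maximum.
    min_b_exclude = min(y for x, y in zip(a, b) if x == max_a)
    # SUMPRODUCT(B*(A=MaxA)*(B<>MinBExclude))
    sum_b1 = sum(y for x, y in zip(a, b) if x == max_a and y != min_b_exclude)
    # SUMPRODUCT(B*(A<>MaxA))
    sum_b2 = sum(y for x, y in zip(a, b) if x != max_a)
    return {
        "MaxA": max_a,
        "MinBExclude": min_b_exclude,
        "sumB1": sum_b1,
        "sumB2": sum_b2,
        "result": max_a + sum_b1 + sum_b2,
    }


print(let_breakdown([90, 110, 60, 110, 110], [71, 25, 16, 13, 82]))
```

Note that the B<>MinBExclude test drops every max-value row whose B equals the minimum, so if that minimum occurred twice among those rows, both would be left out of sumB1.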

Using Offset to get to the next instance of a variable?

I am attempting to find the next instance of a variable in order to generate a list based on another variable:
Mkt ID
10 908
15 915
15 416
25 312
25 215
32 482
Similar to the above. There are two drop downs, one for market and one for ID. I want the user to be able to select a market and have the data validation in the ID drop down filter to the list of IDs for the market chosen in the first drop down. Let's say the market dropdown is $G$2. Market is Column A, and ID is column B.
Here's the formula I have so far:
OFFSET(ADDRESS(MATCH($G$2,A:A,0),1),0,1,COUNTIFS(A:A,$G$2),1)
This formula finds the market, offsets by 0 rows and 1 column, uses the number of instances of that market as the height, and 1 column as the width. I do not see why this is not working. Excel just gives the typical "are you really trying to type a formula?" error.

ADDRESS returns a string that looks like a cell reference. You need INDIRECT to turn that into a real cell reference that OFFSET can use.
=OFFSET(INDIRECT(ADDRESS(MATCH($G$2, A:A, 0), 1)), 0, 1, COUNTIFS(A:A, $G$2), 1)
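The lookup this formula performs can be sketched in Python with the sample data from the question: find the first row of the market, count its occurrences, and take that many IDs from the neighbouring column. Like the OFFSET approach, the sketch assumes all rows for a market are contiguous. The function name ids_for_market is made up for illustration.

```python
markets = [10, 15, 15, 25, 25, 32]          # column A (Mkt)
ids = [908, 915, 416, 312, 215, 482]        # column B (ID)


def ids_for_market(market):
    """Return the block of IDs belonging to one market."""
    first = markets.index(market)       # role of MATCH(market, A:A, 0), but 0-based
    height = markets.count(market)      # role of COUNTIFS(A:A, market)
    return ids[first:first + height]    # role of OFFSET(..., 0, 1, height, 1)


print(ids_for_market(15))
```

If the market rows were not grouped together, the slice would pick up IDs from other markets, which is also true of the worksheet formula.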

Google Sheets Arithmetic Search

I have two Google sheets tabs:
I.)
--A-- --B--
--1-- type lessThan10Apart
--2-- Car 1
--3-- Plane 0
II.)
--A-- --B-- --C--
--1-- type sourceA sourceB
--2-- Car 1 100
--3-- Plane 10 100
--4-- Car 2 4
My question is how to create the lessThan10Apart formula above. lessThan10Apart should match the type from sheet I against sheet II and only count the rows where sourceA and sourceB are less than 10 units apart. But you can also imagine wanting to do any kind of arithmetic between columns B and C and running a COUNT.
My first attempt is something along the lines of:
=COUNTIFS('sheetII'!A:A, $A2, //Match column A
ABS('sheetII'!C:C-'sheetII'!B:B)<10 //Doesn't work!
)
The problem is that you don't seem to be able to do range calculations like this in COUNTIFS.

For the count (per F4 in supplied image),
=SUMPRODUCT(--(ABS(B2:B4-C2:C4)<10))
For the validSum (sum of absolute difference between B & C; per G4 in supplied image),
=SUMPRODUCT(--(ABS(B2:B4-C2:C4)<10), ABS(B2:B4-C2:C4))
Do not use full column references. Minimize your referenced ranges.
Discard the Car text in E4 in the above image.
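As a cross-check of the expected numbers, here is a minimal Python sketch of the count and the sum of absolute differences on the three rows of sheet II. Unlike the two formulas above as written, the sketch also matches on type, which is what the question asks for. The function names are hypothetical.

```python
# (type, sourceA, sourceB) rows from sheet II
rows = [("Car", 1, 100), ("Plane", 10, 100), ("Car", 2, 4)]


def less_than_10_apart(kind):
    """Count rows of the given type whose two sources differ by less than 10."""
    return sum(1 for t, a, b in rows if t == kind and abs(a - b) < 10)


def valid_sum(kind):
    """Sum the absolute differences over the same qualifying rows."""
    return sum(abs(a - b) for t, a, b in rows if t == kind and abs(a - b) < 10)


print(less_than_10_apart("Car"), less_than_10_apart("Plane"), valid_sum("Car"))
```

The counts for Car and Plane agree with the 1 and 0 shown in sheet I.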

Find the top n values in a range while keeping the sum of values in another range under x value

I'd like to accomplish the following task. There are three columns of data. Column A represents price, where the sum needs to be kept under $100,000. Column B represents a value. Column C represents a name tied to columns A & B.
Out of >100 rows of data, I need to find the highest 8 values in column B while keeping the sum of the prices in column A under $100,000. And then return the 8 names from column C.
Can this be accomplished?
EDIT:
I attempted the Solver solution w/ no luck. 200 rows looks to be the max w/ Solver, and that is what I'm using now. Here are the steps I've taken:
Create a column called rank RANK(B2,$B$2:$B$200) (used column D -- what is the purpose of this?)
Create a column called flag just put in zeroes (used column E)
Create 3 total cells total_price (=SUM(A2:A200)), total_value (=SUM(B2:B200)) and total_flag (=(E2:E200))
Use solver to minimize total_value (shouldn't this be maximize??)
Add constraints -Total_price<=100000 -Total_flag=8 -Flag cells are binary
Using Simplex LP, it simply changes the flags for the first 8 values. However, the total price for the first 8 values is >$100,000 ($140k). I've tried changing some options in the Solver Parameters as well as using different solving methods to no avail. I'd like to post an image of the parameter settings, but don't have enough "reputation".
EDIT #2:
The first 5 rows looks like this, price goes down to ~$6k at the bottom of the table.
Price Value Name Rank Flag
$22,538 42.81905675 Blow, Joe 1 0
$22,427 37.36240932 Doe, Jane 2 0
$17,158 34.12127693 Hall, Cliff 3 0
$16,625 33.97654031 Povich, John 4 0
$15,631 33.58212402 Cow, Holy 5 0

I'll give you the Solver solution as a starting point. It involves creating some extra columns and total cells. Note that Solver is limited in the number of cells it can handle, but it will work with 100 anyway.
Create a column called rank RANK(B2,$B$2:$B$100)
Create a column called flag just put in zeroes
Create 3 total cells total_price, total_value and total_flag
Use solver to minimize total_value
Add constraints
-Total_price<=100000
-Total_flag=8
-Flag cells are binary
This will flag the rows you want and you can grab the names however you want.
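The answer does not spell out how the total cells are built. One reading, which is an assumption here, is that each total is a flag-weighted sum (a SUMPRODUCT against the flag column) and that total value is what should be maximized, as the question's edit suggests. Below is a brute-force Python sketch of that selection problem on the five sample rows from the question, using a hypothetical budget of 40,000 and 2 picks in place of 100,000 and 8.

```python
from itertools import combinations

# (price, value, name) rows taken from the question's sample data
rows = [
    (22538, 42.81905675, "Blow, Joe"),
    (22427, 37.36240932, "Doe, Jane"),
    (17158, 34.12127693, "Hall, Cliff"),
    (16625, 33.97654031, "Povich, John"),
    (15631, 33.58212402, "Cow, Holy"),
]


def best_selection(rows, picks, budget):
    """Try every group of `picks` rows; keep the highest total value within budget."""
    best, best_value = None, float("-inf")
    for combo in combinations(rows, picks):
        total_price = sum(price for price, _, _ in combo)
        if total_price > budget:
            continue  # violates the price constraint
        total_value = sum(value for _, value, _ in combo)
        if total_value > best_value:
            best, best_value = combo, total_value
    return [] if best is None else [name for _, _, name in best]


print(best_selection(rows, picks=2, budget=40000))
```

Trying every combination is only practical for tiny inputs. With 100+ rows and 8 picks the number of combinations is far too large, which is why a solver with binary flags is the usual tool.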

How to calculate average of a column of numbers linked to each frequency bin making up a histogram, Excel 2010?

I have three columns. Column A consists of numbers, column B consists of bin ranges, and column C consists of number data relevant to the individual data in column A.
Using columns A and B, I created a frequency histogram where all the data in column A have been grouped into the bins of column B. I would like to calculate the average value of each bin using the data from column C (i.e., calculate a mean value for each bin using data from column C that is associated to each value (from column A) that made up each bin).
Can anybody help?
Thanks for the replies. Here is an example of the data (Unfortunately I can not paste in images):
Below are three columns with headers Jar Type (in volume (ml)), Cookies (the number of chocolate chip cookies in the jar), and Interval for bins (bins to count the jar types):
Jar type-cookies-intervals for bins
500 3 100
500 1 150
500 0.5 200
250 3 250
150 1 300
500 1 350
150 2 400
250 2 450
### # 500
Making a histogram of the frequency of jar types gives this grouping:
Bin-Frequency
100 0
150 2
200 0
250 2
300 0
350 0
400 0
450 0
500 4
More 0
Now what I am trying to do is to find out what is the mean number of cookies that can be found in each type of jar. For example, for the 500ml we know that there are 4x500ml jars, and that across those 500ml jars we have 3+1+0.5+1 = 5.5 cookies in total. The mean would be 5.5/4 = 1.375 cookies.
My issue is that I have 5000+ numbers that separate into 100 bins.

The question calls for a "wandering trace" of a scatterplot: the values of column A (plot them on the horizontal axis) are placed into bins, which therefore comprise vertical strips in the scatterplot. The values of column C (plotted on the vertical axis) are averaged within each strip. This technique smooths out and summarizes apparent trends in the scatterplot.
In this example with 100 records the original data are in black and computed values are in green. Here is the wandering trace of means:
The open circles plot column C (associated values) against column A (data) while the solid squares, connected with a dashed red trace, plot the bin means (column G) against the midpoints (column F).
Any statistical package will provide functions for grouping data and performing operations on those groups. Excel does this to a limited extent with its SUMIF and COUNTIF functions. To use them, create a column (D in the spreadsheet) showing the grouping factor. (That's a simple lookup in the sorted BINS vector using the VLOOKUP function with its "range" option set to true.) SUMIF computes sums by group factor and COUNTIF counts by group factor. Their ratios are the bin means.
Here is what the formulas look like:
Only three formulas were actually entered and then copied down as needed:
=VLOOKUP(A2, Bins, 1, TRUE) computes the group for the value in cell A2. Bins is a name for the array $(-2,-3, \ldots, 3)$ in column B.
=AVERAGE(B3:B4) computes the midpoint of the first bin. This was used as a horizontal plotting position in the scatterplot.
=SUMIF(Bin,"="&B3,NewValues)/COUNTIF(Bin, "="&B3) is where all the work is done. Bin refers to the group codes in column D and NewValues refers to the associated values in column C. The tricky parts are the constructs "="&B3: these form a text value instructing the data to be grouped by comparison to the number in cell B3, which is the first endpoint. Because this is a formula, copying it down automatically updates the B3 to B4, then B5, etc.
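The grouping and averaging steps can be sketched in Python using the jar data from the question. The sketch follows the approximate-match behaviour of VLOOKUP with TRUE, which assigns each value to the largest bin value not exceeding it, then divides per-bin sums by per-bin counts as SUMIF/COUNTIF do. The function name bin_means is an assumption for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def bin_means(values, associated, bins):
    """Average `associated` within each bin; `bins` must be sorted ascending."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, c in zip(values, associated):
        i = bisect_right(bins, v) - 1   # largest bin value <= v
        if i < 0:
            raise ValueError(f"{v} is below the first bin")  # VLOOKUP would give #N/A
        sums[bins[i]] += c              # role of SUMIF
        counts[bins[i]] += 1            # role of COUNTIF
    return {b: sums[b] / counts[b] for b in sums}


jar_types = [500, 500, 500, 250, 150, 500, 150, 250]
cookies = [3, 1, 0.5, 3, 1, 1, 2, 2]
bins = [100, 150, 200, 250, 300, 350, 400, 450, 500]
print(bin_means(jar_types, cookies, bins))
```

For the 500 ml jars this gives 5.5 / 4 = 1.375. Bins with no data are simply absent from the result, where the worksheet ratio would show a division error.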
