Excel - lookup on one column, result from second column - excel

The first three columns exist. I am trying to create a formula for the fourth (HH_ANALYSIS_FLAG).
ACCOUNT_NUMBER HOUSEHOLD_NUMBER ACCOUNT_ANALYSIS_FLAG HH_ANALYSIS_FLAG
1001 1 1 0
1002 2 0 0
1003 3 1 0
1004 3 0 0
1005 3 0 0
1006 2 0 0
1007 4 0 0
1008 1 1 0
I have 50,000 accounts. They are flagged as being under analysis with the ACCOUNT_ANALYSIS_FLAG column (0,1). All accounts belong to a household. Multiple accounts can belong to the same household. I need the HH_ANALYSIS_FLAG column to evaluate to true or false (0,1) if any account in the same household is under analysis. So with the above data and a working formula, my spreadsheet would look like so:
ACCOUNT_NUMBER HOUSEHOLD_NUMBER ACCOUNT_ANALYSIS_FLAG HH_ANALYSIS_FLAG
1001 1 1 1
1002 2 0 0
1003 3 1 1
1004 3 0 1
1005 3 0 1
1006 2 0 0
1007 4 0 0
1008 1 1 1

The following formula should do the trick. In fact, it will give you the total number of accounts being analysed per household.
A B C D
1 ACC_NUM HH_NUM ACC_ANALYSIS_FLAG HH_ANALYSIS_FLAG
2 1001 1 1 =SUMIF(B$2:B$50001, B2, C$2:c$50001)
3 1002 2 0 =SUMIF(B$2:B$50001, B3, C$2:c$50001)
4 1003 3 1 =SUMIF(B$2:B$50001, B4, C$2:c$50001)
For each row this takes selects the set of rows that share the value in the ACC_NUM column (based on the row conaining the formula) and sums together the values in the corresponding ACC_ANALYSIS_FLAG columns. This gives you the total number of accounts under analysis for the given household. Compare the result to 0 if you only need to use it as a boolean value.
EDIT:
Apparently the performance of this isn't up to snuff. However, assuming the the household numbers are all colocated, it should be possible to speed things up significantly by changin to something like the following.
2 1001 1 1 =SUMIF(B2:B5, B2, C2:C5)
3 1002 2 0 =SUMIF(B2:B6, B3, C2:C6)
4 1003 2 0 =SUMIF(B2:B7, B3, C2:C7)
5 1004 2 0 =SUMIF(B2:B8, B3, C2:C8)
6 1005 2 0 =SUMIF(B3:B9, B3, C3:C9)
7 1006 2 0 =SUMIF(B4:B10, B3, C4:C10)
8 1007 2 0 =SUMIF(B5:B11, B3, C5:C11)
9 1008 2 0 =SUMIF(B6:B12, B3, C6:C12)
10 1009 2 0 =SUMIF(B7:B13, B3, C7:C13)
This assumes that there are at most 4 accounts per household, and thus limits the range of the SUMIF to the current cell +/- 3 rows.
To avoid referencing invalid cells you'll the first and last rows have to be treated as special cases. If you need to generate a single forumala for all of these cells I think it should be possible using the OFFSET in combination with MAX, MIN and ROW to generate the appropriate ranges with just a little arithmatic.

Insert another column D (you can hide it later), which is equal to the household number if it is being analyzed, and zero if it is not. The formula for D2 can be =B2*C2. Fill column D with this formula.
Then for your HH_ANALYSIS_FLAG column, you can count the number of values in column D which match the household in column B. The formula would be like IF(COUNTIF(D:D,"="&B2)>0,1,0).
I'm not sure whether this approach is fast enough for the 50,000 accounts, though.
A B C D E
1 ACCOUNT_NUMBER HOUSEHOLD_NUMBER ACCOUNT_ANALYSIS_FLAG HH_UNDER_ANALYSIS HH_ANALYSIS_FLAG
2 1001 1 1 1 (=B2*C2) =IF(COUNTIF(D:D,"="&B2)>0,1,0)
3 1002 2 0 0 (=B3*C3) =IF(COUNTIF(D:D,"="&B3)>0,1,0)
4 1003 3 1 3 (=B4*C4) =IF(COUNTIF(D:D,"="&B4)>0,1,0)

Presuming your HOUSEHOLD_NUMBER column is column B:
=IF(SUMIF(B:B,C:C)>0,1,0)
should do it.

Kenneth! Try this one:
=IF(VLOOKUP(B2,$B$2:$C$9,2,0)=1,1,0)
Assuming your table starts from A1, which means Account_Number is in cell A1, and your target column "HH_ANALYSIS_FLAG" is in column D.
Hope it's helpful

Related

Finding a formula to compute partial sums for consecutive rows that satisfy a condition

How do I do this is excel. The example is provided below. If we don't sell any products on a specific day I would like to move those hours to the next date.
If you need to compute partial sums of Hours for all the consecutive rows between rows that satisfy the condition that the value Sold is greater than zero, you could do this with an auxiliary column
A B C D
---------------------------------
1 |Hours Sold Sums Solution
2 | 300 30 300 300
3 | 30 0 300 0
4 | 0 0 300 0
5 | 30 0 300 0
6 | 300 50 660 360
7 | 23 0 660 0
8 | 100 25 783 123
Here Sums is defined by a formula for C2
=IF(B2>0,SUM(A2:$A$2),C1)
You can automatically populate the cells below.
This formula puts in a cell a partial sum of Hours up to the current row if Sold is nonzero, otherwise copies the previous partial sum of hours. We need this to subtract this value on the next step.
When you have the column C filled, it is sufficient to put the following formula in D2 and populate the cells below
=IF(ROW(B2)>2,IF(B2>0,C2-C1,0),C2)
This formula handles correctly both D2 that does not have a preceding row with values and the remaining cells in column D.
In fact you could combine the two formulas together and avoid the need to have an auxiliary column. Put the following formula in C2 and spread it down to the rest of the cells in column C
=IF(ROW(B2)>2,IF(B2>0,SUM(A2:$A$2)-SUM(C1:$C$2),0),A2)
to get
A B C
---------------------------------
1 |Hours Sold Solution
2 | 300 30 300
3 | 30 0 0
4 | 0 0 0
5 | 30 0 0
6 | 300 50 360
7 | 23 0 0
8 | 100 25 123

Count rows where two values appear together

My data are in MS Excel:
Col A Col B Col C Col D
1 amy john bob andy
2 andy mel amy john
3 max andy jim bob
4 wil steve andy amy
So, in 4x4 table there are 9 different values.
I need to create table to find how many times each PAIR is occurring in the same ROW. Something like this:
amy andy bob jim john max mel steve will
amy 0
andy 3 0
bob 1 2 0
jim 0 1 1 0
john 2 2 1 0 0
max 0 1 1 1 0 0
mel 1 1 0 0 1 0 0
steve 1 1 0 0 0 0 0 0
will 1 1 0 0 0 0 0 1 0
And I have no clue how to do it...
To reiterate: no duplicated values in each row, each row has unique values, each value in separate cell, so there are column with values and within column values can duplicate.
Any help will be much appreciated!
Assuming your data is in A5:D8 I proceeded like this -
created a helper column with the formula (copied downwards)
=A5&"-"&B5&"-"&C5&"-"&D5
Named this helper column as helper (named range)
listed down and across the unique combinations of names in H4:P4 (across) and G5:G13 (down)
enter this formula in H5 and copy it both downwards and across to fill all 9x9 matrix
=IF($G5=H$4,0,COUNTIFS(helper,"*"&$G5&"*",helper,"*"&H$4&"*"))
Your desired matrix is ready
A detailed blog is available on web for this.

how to change cells color per value next to it

I have the following set of 2 columns and n rows:
A B
1 44000 44000
2 2800 2730
3 17000 21160
4 1000 1046
5 700 0
6 1500 1249
7 300 107
8 1200 400
9 0 1400
10 4500 3582
11 0 280
I would like to create a Conditional Formatting rule for column B, so if value in any row exceeds the associated value in row A the cell becomes red: https://gyazo.com/83a45768c6952f5590448700059179ce
The problem with this approach is I have to modify every single cell and cannot apply the whole Rule for all cells in column B.
If I apply this rule to the set of cells in col A I receive: This type of reference cannot be used in Conditional Formatting formula: http://prntscr.com/feo3c0
The formula:
=$B1>$A1
The Applies to:
=$B:$B

Excel multiple search/match and sum (edit: answered with SUMIFS, COUNTIFS)

I am looking for help to solve this excel problem.
Essentially I want to create a formula for cells in column F which does a multiple search on 3 criteria (on cells in columns A,B,C) and want to access the corresponding column D values where all these (multiple) matches occur, and sum this in column F. I'd also like a count of the amount of matches found to calculate the value in column F; placed alongside in column G.
e.g.
IF col_A_value (anywhere in whole A column) = current_col_A_value +/- 1
AND col_B_value (anywhere in whole B column) = current_col_B_value +/- 1
AND col_C_value (anywhere in whole C column) = current_col_C_value - 1
THEN (output in column F) the sum of all values from row D where this criteria is met
(also, as a seperate but related cell formula, output in column G) the total Count of times this occurs.
Note: the values in columns A,B,C are all integars and the +/- above means to search for any values which are either +1, 0, or -1 different in value. (i.e. this includes the value itself).
e.g. If the value in cell A1 = 10, B1 = 45, C1 = 881, then the first search criteria would look for all other rows with values of 9, 10 or 11 in column A. Then based on these rows, the second search criteria would refine the search to only those rows which also include either a 44, 45 or 46 in column B, and the third search criteria would refine the search again to only include those rows where the column C value is 880.
Next, the values in the column D cells from all of these 'filtered' rows would be summed and the result placed in the column F cell. (The count of these results rows would be put in column G. (seperate formula required))
Since these are all unique entries (think of columns A,B,C creating unique vector coordinates in space), there should be a maximum of 9 entries found and summed. A +/-1: 3 variations, B +/-1: 3 variations and C -1 only: 1 variation. In total: 3x3x1 = 9 unique rows maximum (and potentially none as a minimum, as in the below example).
(If no match is found a value of 0 is good.)
Example with A,B,C,D and E as given values, and column F values calculated (together with the count shown in col G):
A B C D E F G
1 1 1 90 8 0 0
1 2 1 80 6 0 0
1 3 1 70 1 0 0
1 4 1 60 6 0 0
2 1 1 50 1 0 0
2 2 1 40 8 0 0
2 3 1 30 6 0 0
2 4 1 20 8 0 0
3 1 1 10 8 0 0
3 2 1 11 6 0 0
3 3 1 12 1 0 0
3 4 1 13 1 0 0
1 1 2 99 8 260 4
1 2 2 89 6 360 6
1 3 2 79 1 300 6
1 4 2 69 6 180 4
2 1 2 59 1 281 6
2 2 2 49 8 393 9
etc
To illustrate how column F values are calculated here is the working:
260 = 90+80+50+40
360 = 90+80+70+50+40+30
300 = 80+70+60+40+30+20
180 = 70+60+30+20
281 = 90+80+50+40+10+11
393 = 90+80+70+50+40+30+10+11+12
Thanks a lot for any help with this!
These formulas should do what you desire:
F1: =SUMIFS(D:D,A:A,"<="&A1+1,A:A,">="&A1-1,B:B,"<="&B1+1,B:B,">="&B1-1,C:C,C1-1)
G1: =COUNTIFS(A:A,"<="&A1+1,A:A,">="&A1-1,B:B,"<="&B1+1,B:B,">="&B1-1,C:C,C1-1)
The formulas can simply be copied down as you need them...
(Still I don't know what col E is for)

Count values in a range comprehended between two values

At Column A i have this values 1
0
3
2
0
5
1
1
1
0
2
1
1
1
0
2
1
1
1
0
0
3
0
2
0
0
3
1
This list grows everyday.
I need a formula to put on every cell of column B that counts upwards how many values bigger than 1 are until the next value = 1 is found.
In another words i need to count how many values larger than 1 are between 1's.
The pretended result would be something like this:
1
0
3
2
0
5
1 3
1
0
2
1 1
1
0
2
1 1
1
0
0
3
0
2
0
0
3
1 3
Thanks in Advance
I would use a helper column, if this is acceptable.
So to create a running count of numbers greater than one which resets each time it encounters a '1', enter this starting in B2 and pull down (I'm assuming the data has a heading and the list starts with a 1) :-
=IF(A2=1,0,B1+(A2>1))
Then to display the counts at each '1' value (but not for repeated ones) enter this in C2 and pull down:-
=IF(AND(A2=1,A1<>1,ISNUMBER(A1)),B1,"")
It's also possible to do it with an array formula, but not sure if it's worth the effort:-
=IF(AND(A2=1,A1<>1),
COUNTIF(
OFFSET(
A$1,
MAX(ROW(A1:A$2)*(A1:A$2=1))-ROW(A$1)+1,,
MAX(ROW(A1))-MAX(ROW(A1:A$2)*(A1:A$2=1))),
">"&0),
"")
to be entered in B2 with Ctrl Shift Enter and pulled down.

Resources