Excel difference between two columns - excel

I have a question that puzzles me. A worksheet holds two columns of unique text entries, each with a number next to it.
How can I compare the values for each matching pair of text entries and find the ones where the associated numbers are NOT the same?
I'm not even sure what the output should look like. Maybe Conditional Formatting that highlights the value in the first column when its match in the second column has a different number?
Thank you for your time.

Let's take a sample table.
+-----+---+-------+---+------------------------------------+
| A | B | C | D | =VLOOKUP(C1;ALL_VALUES;2;FALSE)=D1 |
+-----+---+-------+---+------------------------------------+
| abc | 1 | fasfa | 4 | #N/A |
+-----+---+-------+---+------------------------------------+
| aa | 2 | abc | 1 | TRUE |
+-----+---+-------+---+------------------------------------+
| dd | 3 | dd | 2 | FALSE |
+-----+---+-------+---+------------------------------------+
Here ALL_VALUES is a named range covering your table (A1:D3 in this example). The formula returns TRUE when the text is found and the numbers agree, FALSE when the numbers differ, and #N/A when the text is not found at all (you can convert #N/A with the IFERROR function). You can then filter the table on this result or use it in conditional formatting, depending on what suits you better ;)
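If it helps to see the lookup logic outside the spreadsheet, here is a rough Python sketch of what the VLOOKUP comparison does on the sample table. The names `left`, `right` and `compare` are made up for illustration, and `None` stands in for #N/A; this is only a model of the formula, not a replacement for it.

```python
# Sample data from the table above.
left = {"abc": 1, "aa": 2, "dd": 3}             # column A -> column B
right = [("fasfa", 4), ("abc", 1), ("dd", 2)]   # (column C, column D) pairs


def compare(key, value, lookup):
    """Model of =VLOOKUP(C1;ALL_VALUES;2;FALSE)=D1, with None playing the role of #N/A."""
    if key not in lookup:
        return None           # text not found in column A
    return lookup[key] == value


results = [compare(key, value, left) for key, value in right]
print(results)  # [None, True, False]
```

The rows that come back as `False` are the ones the question asks for: the text exists on both sides but the numbers differ.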

Related

Combining rows of a table using similar cells of one of its columns in Excel

I have a table (1) in Excel with two columns: the first column (A) contains numbers and the second column (B) contains letters. I want a method to build another table (2) from (1) that lists each distinct letter in the first column and, in the same row, the numbers that corresponded to that letter in table (1).
For example, let table (1) be:
| A | B |
|---|---|
| 1 | a |
| 1 | b |
| 2 | a |
| 2 | c |
| 3 | b |
| 4 | b |
What method in Excel produces the following combined table:
| a | 1 | 2 | |
| b | 1 | 3 | 4 |
| c | 2 | | |
in which the letters are in the first column and each row holds the numbers that were paired with that row's letter in table (1)?
Use the following formula in cell C1:
=UNIQUE(B1:B6)
Then put the following formula in cell D1 (next to the first unique letter, since it refers to C1) and drag it down:
=TRANSPOSE(UNIQUE(FILTER($A$1:$A$6,$B$1:$B$6=C1)))
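For readers who want to check the expected result, the grouping that UNIQUE plus FILTER performs can be sketched in Python. This is only an illustration on the question's sample data; `rows` and `grouped` are hypothetical names.

```python
# Table (1) from the question as (number, letter) pairs.
rows = [(1, "a"), (1, "b"), (2, "a"), (2, "c"), (3, "b"), (4, "b")]

# One entry per distinct letter (like UNIQUE on column B), each holding the
# distinct numbers seen with that letter (like UNIQUE(FILTER(...))).
grouped = {}
for number, letter in rows:
    bucket = grouped.setdefault(letter, [])
    if number not in bucket:
        bucket.append(number)

for letter, numbers in grouped.items():
    print(letter, *numbers)
```

Each printed line corresponds to one row of table (2): the letter first, then its numbers laid out across the row, which is what TRANSPOSE does in the formula.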

Rank with condition

I'm looking for a formula that ranks a value within a subset of a range.
Let's say column A is the department and column B is the value.
I want a formula that ranks a value against all the other values of the same department.
I have tried this:
{=rank(value,if(myrange=condition,myrange),0)}
It does not work.
I have managed to do the opposite, retrieving the value at a given rank, with:
{=small(if(myrange=condition,myrange),rank i want)}
I don't understand why my first formula fails.
The expected result is the rank of the value within its subset, i.e. among all cells where the condition is true.
For such scenarios (ranking a subset of data), I find using SUMPRODUCT much easier:
=SUMPRODUCT(($A$2:$A$12=A2)*(B2<$B$2:$B$12))+1
This ranks in descending order.
Although Excel has a RANK function, there is no RANKIF function to
perform a conditional rank. However, you can easily create a
conditional RANK with the COUNTIFS function. Exceljet
Some sample data:
| Dep | Val |
|-----|-----|
| A | 5 |
| A | 3 |
| A | 6 |
| A | 6 |
| B | 3 |
| B | 8 |
| B | 2 |
| C | 9 |
| C | 5 |
| C | 7 |
Let's put the COUNTIFS in there:
Formula in C2 for descending:
=COUNTIFS($A$2:$A$11,A2,$B$2:$B$11,">"&B2)+1
Formula in D2 for ascending:
=COUNTIFS($A$2:$A$11,A2,$B$2:$B$11,"<"&B2)+1
Drag both down....
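To sanity-check what the two COUNTIFS formulas should return on the sample data, here is a small Python sketch of the same counting rule (count the values in the same department that are larger, or smaller, then add 1). The function names are made up for illustration.

```python
# Sample data from the table above as (department, value) pairs.
data = [("A", 5), ("A", 3), ("A", 6), ("A", 6), ("B", 3),
        ("B", 8), ("B", 2), ("C", 9), ("C", 5), ("C", 7)]


def rank_desc(dep, val, rows):
    """Model of COUNTIFS(dep_range, dep, val_range, ">" & val) + 1."""
    return sum(1 for d, v in rows if d == dep and v > val) + 1


def rank_asc(dep, val, rows):
    """Model of COUNTIFS(dep_range, dep, val_range, "<" & val) + 1."""
    return sum(1 for d, v in rows if d == dep and v < val) + 1


desc = [rank_desc(d, v, data) for d, v in data]
asc = [rank_asc(d, v, data) for d, v in data]
print(desc)  # [3, 4, 1, 1, 2, 1, 3, 1, 3, 2]
print(asc)   # [2, 1, 3, 3, 2, 3, 1, 3, 1, 2]
```

Note that ties share a rank: the two 6s in department A both rank 1 in descending order, and the next value ranks 3.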

Spreadsheet Formula to Sum Values Over A if B is not in a List of Values

I have a table that looks the following:
| A | B | C |
| 40 | 1 | 1 |
| 180 | 2 | 2 |
| 34 | 1 | |
| 2345 | 3 | |
| 23 | 1 | |
| 1 | 2 | |
| 4354 | 3 | |
| 2 | 2 | |
| 343 | 4 | |
| 2 | 2 | |
| 45 | 1 | |
| 23 | 1 | |
| 4556 | 3 | |
I want the sum of all fields in A where B is neither 1 nor 2 nor any other value listed in column C. Column C contains the values of B whose rows should be left out of the sum.
I do not know which values B might contain; they are random and could grow larger, I just wanted to keep the example small. My current solution is
{=SUMIF(B1:B13,C1:C2,A1:A13)}
so I can set the values that should be excluded from the sum in column C. Unfortunately, this does not solve my problem but a different one: it sums the entries of A whose B value appears in C. My preferred solution would look something like
=SUMIF(B1:B13,"<>{1, 2}",A1:A13)
=SUMIF(B1:B13,"<>"&C1:C2,A1:A13)
if that were possible (it isn't). I would like to have:
a field (with a list, for example) or column where I can enter the values of B that I do not want included in the sum over A.
a method that works with OpenOffice as well as Excel. I prefer an OpenOffice solution.
You could use an array formula so that you can multiply each value in A with a condition. That condition can be any valid Excel formula, so, for instance, you could use MATCH to test if the B value occurs in C:
=SUM((A1:A13)*ISNA(MATCH(B1:B13,$C:$C,0)))
The ISNA function returns TRUE when the match fails, which counts as the numerical value 1 in a multiplication; FALSE makes the product 0.
Make sure to enter this as an array formula with Ctrl+Shift+Enter.
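As a cross-check of what the array formula should produce on the question's sample data, here is a minimal Python sketch of the same rule: keep a value from A only when its B value is not in the exclusion list C. The variable names are just for illustration.

```python
# Columns A, B and the exclusion list C from the question's table.
col_a = [40, 180, 34, 2345, 23, 1, 4354, 2, 343, 2, 45, 23, 4556]
col_b = [1, 2, 1, 3, 1, 2, 3, 2, 4, 2, 1, 1, 3]
excluded = {1, 2}

# Equivalent of SUM(A * ISNA(MATCH(B, C, 0))): a row counts only when
# its B value is NOT found in the exclusion list.
total = sum(a for a, b in zip(col_a, col_b) if b not in excluded)
print(total)  # 11598
```

Adding another value to the exclusion list (the equivalent of typing it into column C) changes the result without touching the formula itself.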

Counting the frequency of combinations of numbers (in excel using VBA)

I want Excel to count how often certain number-letter combinations appear down a column (using VBA). All my data goes down one column like this:
Column A (only 1,2,3,4,5,s,f appear)
1
2
s
4
3
s
4
2
f
2
s
2
s
I want to count the number of times the combinations (1-s, 2-s, 3-s, 4-s, 5-s) occur, strictly when the number comes first (is in the higher row). I do not want to count cases where the s comes before the number (e.g. s-2). I know how to count individual letters/numbers using the COUNTIF function.
I might later want to expand my analysis to three-item letter-number combinations (e.g. 2-s-3, 2-s-5).
I am very much a VBA noob.
Try inserting a new column to the right of Column A. Use this formula =A1&A2 and fill it down the column. The values will look like this:
+----------+----------+
| Column A | Column B |
+----------+----------+
| 1 | 12 |
| 2 | 2s |
| s | s4 |
| 4 | 43 |
| 3 | 3s |
| s | s4 |
| 4 | 42 |
| 2 | 2f |
| f | f2 |
| 2 | 2s |
| s | s2 |
| 2 | 2s |
| s | s |
+----------+----------+
Now you can count occurrences like you were doing before! :D
Of course, you can expand to three character frequency analysis by making the formula =A1&A2&A3.
Seems possible with COUNTIFS, with 1 to 5 inclusive in C1:G1 and in C2:
=COUNTIFS($A1:$A12,C1,$A2:$A13,"s")
copied across to suit.
You can use the VBA equivalent of this formula
=SUMPRODUCT(--(ISNUMBER(A1:A12)),--(A2:A13="s"))
which looks for a number followed by s in the row below (4 for your sample).
Code:
MsgBox Evaluate("SUMPRODUCT(--(ISNUMBER(A1:A12)),--(A2:A13=""s""))")
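To see what these approaches should count on the sample column, here is a short Python sketch that pairs each cell with the one below it, much like the =A1&A2 helper column. It is only an illustration; `column`, `pairs` and `triples` are made-up names.

```python
from collections import Counter

# Column A from the question, top to bottom.
column = ["1", "2", "s", "4", "3", "s", "4", "2", "f", "2", "s", "2", "s"]

# Each cell joined with the cell directly below it (like =A1&A2 filled down).
pairs = Counter(a + b for a, b in zip(column, column[1:]))

# Count only "number then s", never "s then number".
number_then_s = {digit: pairs[digit + "s"] for digit in "12345"}
total = sum(number_then_s.values())
print(number_then_s)  # {'1': 0, '2': 3, '3': 1, '4': 0, '5': 0}
print(total)          # 4

# Three-item combinations (like =A1&A2&A3), e.g. 2-s-4.
triples = Counter(a + b + c for a, b, c in zip(column, column[1:], column[2:]))
print(triples["2s4"])  # 1
```

The total of 4 agrees with the SUMPRODUCT result quoted above, and the per-digit counts are what the COUNTIFS row in C2:G2 would show.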

Excel Formula Optimisation

I am no Excel expert, and after some research I have come up with this formula to compare two sets of the same data taken at different times. It displays the new entries that are in the latest list of data but not in the old list.
This is my formula:
{=IF(ROWS(L$4:L8)<=(SUMPRODUCT(--ISNA(MATCH($E$1:$E$2500,List1!$E$1:$E$2500,0)))),
INDEX(E$1:E$2500,
SMALL(IF(ISNA(MATCH($E$1:$E$2500&$F$1:$F$2500,List1!$E$1:$E$2500&List1!$F$1:$F$2500,0)),
ROW($F$1:$F$2500)-ROW($F$1)+1),ROWS(L$4:L8))),"")}
Are there any optimisation techniques I could employ to speed up the calculation?
As requested, some example data (link to a spreadsheet):
https://docs.google.com/file/d/0B186C84TADzrMlpmelJoRHN2TVU/edit?usp=sharing
On this scaled-down version it's more efficient, but on my actual sheet with a lot more data it slows down.
Well, I was playing around a bit and I think that this works the same, and without the first IF statement:
=IFERROR(INDEX(A$1:A$2500,SMALL(IF(ISNA(MATCH($A$1:$A$2500&$B$1:$B$2500,List1!$A$1:$A$2500&List1!$B$1:$B$2500,0)),ROW($B$1:$B$2500)-ROW($B$1)+1),ROWS(F$2:F2))),"")
That part in your sample data:
ROWS(F$2:F2)<=(SUMPRODUCT(--ISNA(MATCH($A$1:$A$2500,List1!$A$1:$A$2500,0))))
As I understood it, it only sees to it that the row number in which the formula is entered is lower than the number of 'new' items, but it doesn't serve any purpose because when you drag the formula more than required, you still get errors instead of the expected blank. So I thought it could be removed altogether (after trying to substitute it with COUNTA() instead) and use an IFERROR() on the part directly fetching the details.
EDIT: Scratched that out. See barry houdini's comment for the importance of those parts.
Next, you had this:
ROW($B$1:$B$2500)-ROW($B$1)+1
-ROW($B$1)+1 always evaluates to 0 here because the range starts in row 1, so I didn't find any use for it and removed it altogether.
It's still quite long and takes some time I guess, but I believe it should be faster than previously by a notch :)
A relatively fast solution is to add a multi-cell array formula in a column alongside List 2
{=MATCH($A$1:$A$16,List1!$A$1:$A$11,0)}
and filter the resultant output for #N/A.
(Or see Compare.Lists vs VLOOKUP for my commercial solution)
Array formulas are slow. When you have thousands of array formulas, calculation becomes very slow, so the key is to avoid array formulas altogether.
The following is my way to achieve it using only simple formulas. It should be fast enough if you only have 2500 rows.
Columns F and H are "keys", created by concatenating your 2 columns (E and F in your original formula).
Assuming the first line of data is on row 3:
Data:
| A | B | | D | E | F | | H |
| index | final value | | ID | exist in Old? | Key (New) | | Key (Old) |
--------------------------------------------------------------------------------
| 1 | XXX-33 | | 0 | 3 | OOD-06 | | OOC-01 |
| 2 | ZZZ-66 | | 0 | 1 | OOC-01 | | OOC-02 |
| 3 | ZZZ-77 | | 1 | N/A | XXX-33 | | OOD-06 |
| 4 | | | 1 | 4 | OOE-01 | | OOE-01 |
| 5 | | | 1 | 2 | OOC-02 | | OOF-03 |
| 6 | | | 2 | N/A | ZZZ-66 | | |
| 7 | | | 3 | N/A | ZZZ-77 | | |
Column E "exist in Old?": test if the new key (Column F) exists in the old list (Column H)
=MATCH(F3, $H$3:$H$2500, 0)
Column D "ID": to increment by one whenever a new item is found
=IF(ISNA(E3), 1, 0)+IF(ISNUMBER(D2), D2, 0)
The ISNUMBER test in the second part is only needed for the first row, where using D2 directly can cause an error.
Column A "index": just a plain series starting from 1 (until the length of new list Column F)
Column B "final value": to find the new key by matching column A to Column D.
=IF(A3>MAX($D$3:$D$2500), "", INDEX($F$3:$F$2500, MATCH(A3, $D$3:$D$2500, 0)))
This column B will be the list you want.
If it is still too slow, there are some dirty tricks to speed up the calculation, e.g. using a sorted list with MATCH( , , 1) instead of MATCH( , , 0).
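The helper-column steps above can be traced in a short Python sketch using the keys from the sample table. It is only a model of the four columns (E, D, A and B), with `None` standing in for #N/A; the variable names are invented for illustration.

```python
from itertools import accumulate

# Keys from the sample table: column F (new list) and column H (old list).
new_keys = ["OOD-06", "OOC-01", "XXX-33", "OOE-01", "OOC-02", "ZZZ-66", "ZZZ-77"]
old_keys = ["OOC-01", "OOC-02", "OOD-06", "OOE-01", "OOF-03"]

# Column E: 1-based position of each new key in the old list, None for #N/A.
position = {key: i for i, key in enumerate(old_keys, start=1)}
exists = [position.get(key) for key in new_keys]

# Column D: running count that goes up by one whenever a key is new.
ids = list(accumulate(1 if e is None else 0 for e in exists))

# Columns A and B: for index 1..max(ID), take the key on the first row
# where the ID equals that index (what MATCH(A3, D, 0) finds).
final = [new_keys[ids.index(i)] for i in range(1, ids[-1] + 1)]

print(exists)  # [3, 1, None, 4, 2, None, None]
print(ids)     # [0, 0, 1, 1, 1, 2, 3]
print(final)   # ['XXX-33', 'ZZZ-66', 'ZZZ-77']
```

The intermediate lists match columns E and D in the table, and `final` matches the "final value" column, which suggests the formulas do what the table shows.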

Resources