I'm trying to create a formula for a countback feature in Google Sheets, if possible.
I have a list of competitors' race times (W4:W93) and have ranked them (X4:X93), but identical times throw out another formula that produces the overall results.
My sticking point is that I can't work out how to test whether the value in a cell equals the value in any other cell in the same column, excluding itself. I have found plenty of information on comparing equal values, but not when they are in the same column.
Any help would be much appreciated.
I note that this question has been voted down with no explanatory comments. The question is not on hold, it is not closed because it might be duplicate, or answered elsewhere, and no one has edited the question to provide greater clarity. That seems a pretty rough call to me.
In any event, I'm pretty sure that I've been in the situation described by the OP many times and I've had to be creative about finding and removing duplicate values. There are several ways to find/highlight/manipulate duplicate values, though Google doesn't seem to have as many options as Excel. One method is to use conditional formatting and this YouTube video explains it very well.
But I will use one of my all-time favourite formulae to find duplicates. In addition, since the OP is working with 90 competitors, finding the duplicates is only part of the task; the OP also needs to arrange them in a way that assists analysis.
This Google spreadsheet shows the workings:
First, create a header, say, "Equal Rank" for column Y (the first column after "Ranking") and insert this formula in cell Y4:
=if(countif($W:$W,W4)=1,"Unique","Duplicate")
The formula has two components.
COUNTIF
IF statement
The COUNTIF looks at cell W4 (the race time) in the first row of results. The count range is column W; the range could just as easily be limited to the actual data, but the key is that it is expressed as an "absolute" reference (note the $ signs in the range). The formula counts how many times the value in cell W4 appears in the range.
The IF statement evaluates the result of the COUNTIF. If the result appears only once, then the cell value is "Unique". If the value appears more than once, then it is a "Duplicate" value.
The OP has 90 competitors and will need to drill down to the duplicate results quickly. So, I suggest a variation of the classic formula, this:
=if(countif($W$4:$W$93,W4)=1,"",X4)
The formula still consists of two components, but if the result is unique, then no value is entered; conversely if the result is a duplicate, then the existing "Rank" (cell X4) is returned. BTW, this version shows how the range address would look if you evaluated only the populated range.
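For anyone who prefers to read the logic as code, here is a minimal Python sketch of what the second formula does for every row. It is only an illustration of the COUNTIF/IF idea, not something to paste into a sheet; the function name and the sample times and ranks are made up.

```python
from collections import Counter

def equal_rank_column(times, ranks):
    """Return "" for a unique time, or the existing rank for a tied time."""
    counts = Counter(times)  # plays the role of COUNTIF over the whole time range
    return [rank if counts[t] > 1 else "" for t, rank in zip(times, ranks)]

# Hypothetical data: two competitors share the time 61.2
times = [61.2, 59.8, 61.2, 63.0]
ranks = [2, 1, 2, 4]
print(equal_rank_column(times, ranks))  # [2, '', 2, '']
```

Because every row is counted against the full list, a time is never compared "with itself only": a count of 1 means the only occurrence is the row itself.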
Now, sort the range A3:Y93 (I am assuming there is relevant data in columns A-V). FWIW, I inserted a column "Sequence" and gave it a numeric sequence so that once the "Equal Ranks" had been fixed, I could resort on the "Sequence" column to return the data to the original sort order - the OP may or may not need this.
To aggregate the duplicate values:
Select the range > right-click > Sort range > "Data has header row" = checked > 'Sort by' "Equal Rank", Sort order: "A->Z".
From here, Race Times could be adjusted by a tenth, or hundredth, of a second to avoid duplicate times.
Once this is complete, select the full range again, and resort. In my case I used my "Sequence" column to return the data to the original sort order.
I have a problem that seems pretty easy, but I still cannot find a proper solution. I want to avoid using VBA.
I have two tables in one spreadsheet. Both have the same columns: Name, City, Province.
My goal is to compare both and, if all three values in a row match, return 1; if not, return 0.
I have tried the formulas below, but none of them works for my case.
=IF(AND(A2=P:P,G2=M:M,H2=L:L),1,0)
=INDEX(A:P,MATCH(A2,P:P,FALSE),MATCH(G2,M:M,FALSE),2)
=INDEX(L:P,MATCH(A5,P:P,0),MATCH(G5,M:M,0),MATCH(H5,L:L,0))
=SUMPRODUCT(--(L2:L60=H2),--(M2:M60=G2),--(P2:P60=A2),B2:B60)
It seems that the solution is quite simple, but I cannot find it.
Thanks in advance!
The key here is to merge the columns together, then MATCH on that.
Like this
=IFERROR( IF( MATCH(H3&"_"&I3&"_"&J3, $C$2:$C$60&"_"&$B$2:$B$60&"_"&$A$2:$A$60,0), "Yes"), "No")
Choose a separator character that doesn't otherwise appear in your data (I've chosen _).
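The same idea as a small Python sketch, in case it helps to see why the separator matters. The function names and the sample rows are hypothetical; only the join-then-match technique comes from the formula above.

```python
SEP = "_"  # assumed not to occur anywhere in the data

def make_key(*parts):
    """Join the column values into one lookup key, like H3&"_"&I3&"_"&J3."""
    return SEP.join(str(p) for p in parts)

def row_exists(lookup_row, table_rows):
    """Return "Yes" if the full row occurs in the table, else "No"."""
    keys = {make_key(*row) for row in table_rows}
    return "Yes" if make_key(*lookup_row) in keys else "No"

# Hypothetical table of (Name, City, Province)
table = [("Ann", "Paris", "IDF"), ("Bob", "Oslo", "OSL")]
print(row_exists(("Ann", "Paris", "IDF"), table))  # Yes
print(row_exists(("Ann", "Oslo", "IDF"), table))   # No

# Without a separator, ("ab", "c") and ("a", "bc") would both become "abc"
print(make_key("ab", "c") == make_key("a", "bc"))  # False
```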
Assumption: Values just need to exist, not that they need to be of equivalent row.
=If(IfError(Match(A2,P:P,0),0)*IfError(Match(G2,M:M,0),0)*IfError(Match(H2,L:L,0),0)>0,1,0)
Each IfError outputs a row number (>0) if there is a match, or zero if there is no match. Multiply anything by zero and you get zero, which allows a 1 or 0 output for true/false in the overarching If-statement.
If they need to be of the same row, you can compare 2 matches, which rely on the transitive property (A=B, B=C, so A=C):
=If(And(Match(A2,P:P,0)=Match(G2,M:M,0),Match(G2,M:M,0)=Match(H2,L:L,0)),1,0)
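To make the difference between the two formulas concrete, here is a rough Python sketch of both checks. The helper names and sample columns are invented for illustration, and where the worksheet formula would surface #N/A for a missing value this sketch simply returns 0.

```python
def first_match(value, column):
    """1-based position of the first occurrence, like MATCH(value, column, 0); None if absent."""
    try:
        return column.index(value) + 1
    except ValueError:
        return None

def exists_anywhere(values, columns):
    """1 if every value occurs somewhere in its own column, else 0."""
    return int(all(first_match(v, c) is not None for v, c in zip(values, columns)))

def same_row_by_first_match(values, columns):
    """1 if the first occurrences all fall on the same row, else 0."""
    rows = [first_match(v, c) for v, c in zip(values, columns)]
    return int(None not in rows and len(set(rows)) == 1)

# Hypothetical lookup columns (Name, City, Province)
names = ["Ann", "Ann", "Bob"]
cities = ["Paris", "Lyon", "Oslo"]
provinces = ["IDF", "ARA", "OSL"]
columns = [names, cities, provinces]

print(exists_anywhere(["Ann", "Oslo", "ARA"], columns))          # 1, although no single row matches
print(same_row_by_first_match(["Ann", "Lyon", "ARA"], columns))  # 0, although row 2 matches
```

The second result happens because only the first "Ann" (row 1) is ever compared, so a repeated value in any column can hide a genuine match further down.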
Edit1:
Per my comment (to this answer) about false negatives, a UDF or subroutine in VBA would be more appropriate, considering Match() returns the first row that has a match.
As this is not a VBA tagged post, this is a bit above the expected answer... My recommendation would be to:
A) Ensure you are comfortable using VBA.
B) Make a post about creating a user-defined function (note that any post on here about VBA has an expectation that the poster can interact with an expert on the topic and will be putting forth effort to write the code themselves, as StackOverflow is not a code-for-you service).
To help give a lead on what may be in your UDF:
A loop to go through the values from first row to last row in the search column (i.e., L, M, & P)
A variable to dynamically identify the last row of your search column
An if-statement to compare values from your lookup values (i.e., A2, G2, H2) to the search values at the current iteration of the loop
An output of 1 (has match) or 0 (no match).
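The outline above can be sketched as follows. This is written in Python rather than VBA purely to show the shape of the loop, so treat it as pseudocode for the eventual UDF; the function name and the sample rows are assumptions of mine.

```python
def has_matching_row(lookup, rows):
    """Return 1 if any row equals the lookup values, else 0."""
    last_row = len(rows)           # stands in for finding the last used row dynamically
    for i in range(last_row):      # loop from the first row to the last row
        if tuple(rows[i]) == tuple(lookup):  # compare all lookup values on the same row
            return 1               # has match
    return 0                       # no match

# Hypothetical search rows (Name, City, Province)
rows = [("Ann", "Paris", "IDF"), ("Ann", "Lyon", "ARA"), ("Bob", "Oslo", "OSL")]
print(has_matching_row(("Ann", "Lyon", "ARA"), rows))  # 1
print(has_matching_row(("Ann", "Oslo", "ARA"), rows))  # 0
```

Because every row is inspected, repeated names or cities no longer cause the false negatives that a first-match comparison can produce.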
There are many ways to go about this with VBA; hopefully that's a good start for you, Irina!
Summary
I need an array formula that takes a row of data of certain length from Sheet1. For that row, in each column that is not blank, I need to grab the Sheet1 header value for that column and display that data in a continuous row on Sheet2 (without any spaces in between the row's cells).
Background
I have a table of data (employees and industry certifications with expiration date being the table's cell data) on sheet 1, with a row for each employee the spreadsheet is tracking. The certifications are the columns.
We are using this information to link to ID Badge Printer software (Bodno Silver), where we are limited to linking columns of data to a particular textbox.
The problem lies in the fact that not everyone has every certification. The rows are peppered with blanks separating the certifications that each employee does have. While setting up the required text boxes in the badge software template, that each link to a specific column, I quickly realized that since not everyone has every certification if we used the data how it was we would have a bunch of strange looking blanks in between the listed certifications rather than a continuous list.
What I did
My solution to this (which I'm open to a better one if anyone knows of one, other than "use better software"), was to create a new sheet and array formulas that no one would use except for me and the id printer software. This sheet would have a similar data table that took the rows of data interspersed with blank cells between expiration dates, and put the matching column headers for cells that had a date in them into a continuous row of the same maximum length (eliminating the blank cells).
Essentially, this would allow me to circumvent the restrictions of the badge software and each textbox would be MatchedCert1, MatchedCert2, MatchedCert3, etc. up to the original maximum number of certifications.
Pictures are probably better than my words at explaining what I am going for:
Sheet1 (source)
Sheet2 (result)
The array formulas
I worked on this one for a while. What I thought would be a simple INDEX, MATCH, ISBLANK formula (that I could create using the appropriate relative and absolute cell linking) and then expand to the whole sheet turned into a witch hunt and me praying for forgiveness for my sins to all that may be holy. Also a lot of googling.... I realized quickly that this one may not be so simple after all.
Finally, I arrived at the following two array formulas in order to correctly show what I was going for:
First Column of training section
{=IFERROR(INDEX(Sheet1!$E$2:$P3,1,MATCH(FALSE,ISBLANK(Sheet1!E3:Q3),0)),"")}
(easy enough, right? I thought so...)
I felt good about this until I tried to think through what would be required to get the formula to be universal so that I could use it on the entire table.
I feel dirty just putting the following in public, but here goes...
Second column through last column array formula
{=IFNA(INDEX(INDIRECT(ADDRESS(ROW($E$2),(MATCH(E3,Sheet1!$2:$2,0)+1),1,1, "Sheet1")&":"&ADDRESS(ROW(E3),COLUMN($Q3),1)),1,MATCH(FALSE, ISBLANK(INDEX(INDIRECT("Sheet1!"&ADDRESS(ROW(E3),(MATCH(E3,Sheet1!$2:$2,0)+1),1)&":"&ADDRESS(ROW(E3),COLUMN($Q3),1)),0,0)), 0)),"")}
(please don't call the police...)
[ninja edit] While this array formula works for the 2nd result column through the final column, it doesn't work if there's not a blank column following the result range. The actual spreadsheet has 4 different groups of certifications that run horizontally, but I was able to just add a blank column in the corresponding data from the other sheet easily enough, so I just let it go. I'd give somebody a nickel for the answer to why that's the case here too. [/edit]
Results
The first array formula, and INDEX MATCH using ISBLANK is rather straightforward.
The biggest question for me here, and the thing that drove me absolutely nuts for a couple of days, is why the second array formula requires the additional INDEX function nested inside of the ISBLANK function.
While taking the function apart and experimenting, I realized that if I have any INDIRECT reference inside an ISBLANK function, which is itself inside a MATCH function, the result of the match was ALWAYS 1:
{=MATCH(FALSE,ISBLANK(INDIRECT("$E3:$Q3")), 0)}
The above ALWAYS returns 1, whereas if I put the range in explicitly, the function would work just fine. That wasn't an option for me, since I needed to dynamically return the starting position for the match using the previous cell's address.
However, adding an INDEX function (with a column and row value of 0) to encapsulate the INDIRECT function provides the correct answer. I figured this out just by trial and error.
Questions
Can someone with more knowledge please let me know what is causing this behavior?
As a broader question, given I am limited to using formulas (no VBA), I would also like to know if I'm going about this in the wrong way or if there is a much simpler way of accomplishing this without this behemoth of a formula?
I know this sheet will probably require maintenance in a year - good luck future self!
Put this in E3, Copy over and down
=IFERROR(INDEX(Sheet1!$2:$2,AGGREGATE(15,6,COLUMN(INDEX($E:$P,MATCH($C3,Sheet1!$C:$C,0),0))/(INDEX(Sheet1!$E:$P,MATCH($C3,Sheet1!$C:$C,0),0)<>""),COLUMN(A:A))),"")
As to why your formula is not working, it is too convoluted to parse. One note: unless the sheet name is the variable, one should avoid INDIRECT as much as possible. INDEX can almost always be used in its place.
Both INDIRECT and ADDRESS are volatile functions. Volatile functions will re-calculate every time Excel re-calculates, leading to a lot of unnecessary computations.
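If it helps to see the target behaviour outside of Excel, here is a minimal Python sketch of what the formula produces for one employee row: the headers of the non-blank cells, packed to the left and padded with blanks. The certification names and dates are invented, and the function name is hypothetical.

```python
def compact_headers(headers, row, width=None):
    """Headers of the non-blank cells, packed left and padded with "" to a fixed width."""
    width = len(headers) if width is None else width
    present = [h for h, cell in zip(headers, row) if cell not in ("", None)]
    return (present + [""] * width)[:width]

# Hypothetical certification columns and one employee's expiration dates
headers = ["CPR", "OSHA10", "Forklift", "FirstAid"]
row = ["2025-01-31", "", "", "2024-06-30"]
print(compact_headers(headers, row))  # ['CPR', 'FirstAid', '', '']
```

In the worksheet version, AGGREGATE(15,6,...) does the "k-th non-blank column" selection, with COLUMN(A:A) supplying k as the formula is copied across.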
Not a solution but to answer why you are seeing this behavior:
EDIT: PREVIOUS EXPLANATION WAS JUST PLAIN WRONG
This confused me, so I did a bit of investigation:
I think that your problem is actually coming from the ISBLANK function because it is intended to be used with single values, and cannot handle ranges. Any BLANKs which are returned by functions are only converted to numeric values (0), when the BLANK is returned to (or displayed on) the sheet. If the function is returning to another function, the BLANK value seems to be preserved.
EDIT: ADDING A SOLUTION WITHOUT ARRAY FORMULAS
This is probably more complex than using an array formula... but I strongly dislike them, so do all I can to remove them.
Firstly, I would add an index to your positions in the results sheet:
=IF(F$7>COUNTIFS($F3:$L3,"<>"),
"",
IF(
MINIFS(
$F$7:$L$7,$F$7:$L$7,
">" & IFNA(INDEX($F$7:$L$7,MATCH(E9,$F$2:$L$2,0)),0),
$F3:$L3,
"<>"
)=0,
"",
INDEX(
$F$2:$L$2,
MATCH(
MINIFS(
$F$7:$L$7,$F$7:$L$7,
">" & IFNA(INDEX($F$7:$L$7,MATCH(E9,$F$2:$L$2,0)),0),
$F3:$L3,
"<>"
),
$F$7:$L$7,
0
)
)
)
)
Basically, the formula looks at the cert in the previous cell, and looks for the next, minimum index, greater than that.
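As a rough illustration of that "next index after the previous cert" idea, here is a small Python sketch. It is not a translation of the formula cell by cell; the function name and sample data are made up, and a previous value that is not found is treated as "start from the beginning", which is how I read the IFNA(...,0) part.

```python
def next_cert(headers, row, previous):
    """Header of the first non-blank cell positioned after `previous`, or "" if none."""
    start = headers.index(previous) + 1 if previous in headers else 0
    for i in range(start, len(headers)):
        if row[i] not in ("", None):
            return headers[i]
    return ""

# Hypothetical certification columns and one employee's row
headers = ["CPR", "OSHA10", "Forklift", "FirstAid"]
row = ["2025-01-31", "", "2026-03-01", "2024-06-30"]
print(next_cert(headers, row, None))        # CPR
print(next_cert(headers, row, "CPR"))       # Forklift
print(next_cert(headers, row, "FirstAid"))  # (empty string)
```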
This is what I am trying to figure out:
IF date in cell matches dates in range
and
If name in cell matches names in range
then
count/sum the number of unique ID#s
This is the formula I have:
=IF(Data!A:A=E10,(IF(Data!D:D=D11,(IF(Data!D:D=D11,SUM(IF(FREQUENCY(Data!C:C,Data!C:C)>0,1)),"ERROR3")),"ERROR2")),"ERROR1")
It does not output the correct info. It either counts all the unique IDs or it Errors out when it should have a result.
I hope I am on the right track, thank you for any help.
Sample dataset:
Try it as,
=SUMPRODUCT(SIGN((B$2:B$10>=E2)*(B$2:B$10<=F2))/
(COUNTIFS(B$2:B$10, ">="&E2, B$2:B$10, "<="&F2, A$2:A$10, A$2:A$10)+(B$2:B$10<E2)+(B$2:B$10>F2)))
First let me say that the question was pretty confusing before you posted an image of the data, as it appears that the term "dates in range" was completely misleading. In fact you are trying to match exact dates, not "ranges of date".
FREQUENCY is useful to detect the first appearance of an item in a column, but unfortunately, this "artificial trick" is not flexible enough to be mixed easily with other criteria, and most importantly FREQUENCY is not array friendly.
There's another method to achieve your goal, which is:
=SUMPRODUCT(((Data!$A$1:$A$24=E$10)*(Data!$C$1:$C$24=$D11))/
COUNTIFS(Data!$A$1:$A$24,Data!$A$1:$A$24,Data!$B$1:$B$24,Data!$B$1:$B$24,Data!$C$1:$C$24,Data!$C$1:$C$24))
You can enter this formula in E11 in your sample image and copy/paste in the whole matrix.
The denominator of the formula (the second line) generates an array that counts for each row the number of duplicates.
The numerator sets the criteria. Since each successful row will repeat as many times in the numerator and in the denominator, each matching row will be counted for a total of one.
As a result, we obtain the number of "unique rows" that match the criteria.
The formula should not use complete columns such as A:A etc, make the effort to limit it to a reasonable number of rows, say A1:A999 or so. Complex formulas involving arrays must avoid as much as possible entire columns.
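The "divide by the duplicate count" trick may be easier to follow in a few lines of Python. This is only a sketch: I am assuming the three data columns are (date, ID, name) in that order, and the function name and sample rows are invented.

```python
from collections import Counter
from fractions import Fraction

def count_unique_matching(rows, date, name):
    """Count distinct rows that match both the date and the name."""
    dup = Counter(rows)  # the COUNTIFS denominator: how often each full row occurs
    # Each matching row adds 1/n, so a row repeated n times adds up to exactly 1
    total = sum(Fraction(1, dup[r]) for r in rows if r[0] == date and r[2] == name)
    return int(total)

# Hypothetical data: (date, id, name); the first row is entered twice
rows = [
    ("2017-01-01", 101, "Ann"),
    ("2017-01-01", 101, "Ann"),
    ("2017-01-01", 102, "Ann"),
    ("2017-01-02", 101, "Ann"),
    ("2017-01-01", 103, "Bob"),
]
print(count_unique_matching(rows, "2017-01-01", "Ann"))  # 2
```

Fraction is used here only so the 1/n terms add up exactly; the worksheet does the same sum in floating point.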
EDIT: I have revised the source data to remove the ambiguity of my last screenshots
I am trying to transpose spreadsheet data where there are many rows where the customer name may be duplicated but each row contains a different product.
For instance
revised original data source
to
revised proposed data format
I would like to do it with formulae if possible as I struggle with VB
Thank you for any help
I realise this is a huge answer, apologies but I wanted to be clear. If you need anything from me, drop me a comment and I'll help out.
Here's the output from my formula:
EDITED ANSWER - Named ranges used for ease of understanding:
These are just an example of a few of the named ranges I have used, you can reference the ranges directly or name them yourself (simplest way is to highlight the data then put the name in the drop down next to the formula bar [top left])
Be wary that as we will be using Array formulas for AccNum and AccType, you will not want to select the entire column and instead opt for either the exact data length or overshoot it by 100 or so. Large array formulas tend to slow down calculation and will calculate every cell individually regardless of it being empty.
First formula
=IF(COUNTIF(D2:D11,">""")>0,CONCATENATE("Account Number ",LEFT((COLUMN(A:A)+1)/2,1)),"")
This formula is identical to the one in the original answer apart from the adjusted heading title.
=IF(Condition,True,False) - There are so many uses for the IF logic; it is the best formula in Excel in my opinion. I have used IF with COUNTIF to check whether more than 0 cells hold a value greater than BLANK (or ""). This is just a trick to avoid ISBLANK() or other blank identifiers, which get confused when a formula is present.
If the result is TRUE, I use CONCATENATE(Text1,Text2,etc.) to build a text string for the column header. ROW(1:1) or COLUMN(A:A) is commonly used to initiate an automatically increasing integer for formulas to use based on whether the count increase is required horizontally or vertically. I add 1 to this increasing integer and divide it by 2 so that the increase for each column is 0.5 (1 > 1.5 > 2 > 2.5) I then use LEFT formula to just take the first digit to the left of this decimal answer so the number increases only once every 2 columns.
If the result is FALSE then leave the cell blank ,""). Standard stuff here, no explanation needed.
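Here is a small Python sketch of the column-counter trick, mainly to show one caveat: taking only the first character with LEFT(...,1) works for headings 1 to 9 but would label the tenth pair "1" again, which the ROUNDDOWN variant used later in this answer avoids. The function names are mine, and the text conversion only approximates how Excel turns the number into a string.

```python
def pair_index_left(col):
    """Mimics LEFT((COLUMN()+1)/2, 1): first character of the number written as text."""
    value = (col + 1) / 2
    text = str(int(value)) if value == int(value) else str(value)
    return text[0]

def pair_index_rounddown(col):
    """Mimics ROUNDDOWN((COLUMN()+1)/2, 0)."""
    return (col + 1) // 2

print([pair_index_left(c) for c in range(1, 7)])       # ['1', '1', '2', '2', '3', '3']
print(pair_index_left(19), pair_index_rounddown(19))   # 1 10
```

With fewer than ten product/account pairs the two approaches give the same headings.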
Second Formula
=CONCATENATE(INDEX(Forename,MATCH(Sheet4!$A2,Reference,0)))
=CONCATENATE(INDEX(Surname,MATCH(Sheet4!$A2,Reference,0)))
CONCATENATE has only been used here to force blank cells to remain blank when pulled by INDEX. INDEX will read blank cells as values and therefore 0's whereas CONCATENATE will read them as text and therefore "".
INDEX(Range,Row,Column): This is a lookup formula that is much more advanced than VLOOKUP or HLOOKUP and not limited in the way that they are.
The range I have used is the expected output range - Forename or Surname
The row is then calculated using MATCH(Criteria,Range,Match Type). Match will look through a range and return the position as an integer where a match occurs. For this I have set the criteria to the unique reference number in column A for that row, the range to the named range Reference and the match type as 0 (1 Less than, 0 Exact Match, -1 Greater than).
I did not define a column number for INDEX as it defaults to the first column and I am only giving it one column of data to output from anyway.
Third Formula
Remember these need to be entered as an array (when in the formula bar hit Ctrl+Shift+Enter)
=IFERROR(INDEX(AccNum,SMALL(IF(Reference=Sheet4!$A2,ROW(Reference)-ROW(INDEX(Reference,1,1))+1),ROUNDDOWN((COLUMN(A:A)+1)/2,0))),"")
=IFERROR(INDEX(AccType,SMALL(IF(Reference=Sheet4!$A2,ROW(Reference)-ROW(INDEX(Reference,1,1))+1),ROUNDDOWN((COLUMN(B:B)+1)/2,0))),"")
As you can see, one of these is used for AccNum and the other for AccType.
IFERROR(Value,Value_if_error): This has been used because we do not expect the formula to always return something. When the formula cannot return something, or SMALL has run out of matches to go through, an error will occur (usually #VALUE! or #NUM!), so I use ,"") to force a blank result instead (again, standard stuff).
I have already explained the INDEX formula above so let's just dive in to how I have worked out the rows that match what we are looking for:
SMALL(IF(Reference=Sheet4!$A2,ROW(Reference)-ROW(INDEX(Reference,1,1))+1),ROUNDDOWN((COLUMN(B:B)+1)/2,0))
The IF statement here is fairly self explanatory but as we have used it as an array formula, it will perform =Sheet4!$A2 which is the unique reference on every cell in the named range Reference individually. In your mock data this returns a result of: {FALSE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE} for the first entry (I included titles in the range, hence the initial FALSE). IF will do my row calculation* for every true but leave the FALSEs as they are.
This leaves a result of {FALSE;2;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE} that SMALL(array,k) will use. SMALL will only work on numeric values and will display the 'k'th result. Again the column trick has been used, but to cover more ground I used another method: ROUNDDOWN(Number,digits) as opposed to LEFT(). Digits here means decimal places, so I used 0 to round down to a whole integer for the same result. As this copies across the columns like so: 1, 1, 2, 2, 3, 3, SMALL will alternately (as the formulas alternate) grab the 1st smallest AccNum, then the 1st smallest AccType, before grabbing the 2nd AccNum and AccType and so forth.
*(Row number of the match minus the first row number of the range, then plus 1; again fairly common as a foolproof way to always get the correct row regardless of where the data starts. Actually, as your data starts on row 1 we could just do ROW(Reference), but I left it as is in case you had data in a different format.)
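For readers who find the nested array formula hard to follow, here is the same "k-th match" logic as a minimal Python sketch. The function name and the sample reference and account values are made up; only the technique (collect the matching positions, take the k-th smallest, blank on error) comes from the formula.

```python
def kth_match(reference, values, key, k):
    """Value belonging to the k-th row whose reference equals key, or "" if there is none."""
    # IF(Reference=key, row offset): keep only the positions that match
    positions = [i for i, ref in enumerate(reference) if ref == key]
    if k < 1 or k > len(positions):
        return ""                    # the IFERROR(..., "") branch
    return values[positions[k - 1]]  # SMALL(..., k): positions are already ascending

# Hypothetical data: reference "A1" owns three accounts
reference = ["A1", "A1", "B2", "A1"]
acc_num = [100, 200, 300, 400]
print(kth_match(reference, acc_num, "A1", 1))  # 100
print(kth_match(reference, acc_num, "A1", 3))  # 400
print(kth_match(reference, acc_num, "A1", 4))  # (empty string)
```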
ORIGINAL ANSWER - Same logic as above
Here's your solution in 3 parts
Part 1 is a trick for the auto-completion of the titles so that they will hide when not used (in case you will just copy and paste values for the whole lot to speed up use again).
=IF(COUNTIF(C2:C11,">""")>0,CONCATENATE("Product ",LEFT((COLUMN(A:A)+1)/2,1)),"") in C
=IF(COUNTIF(D2:D11,">""")>0,CONCATENATE("Prod code ",LEFT((COLUMN(B:B)+1)/2,1)),"") in D
Highlight both of the cells and drag across to stagger the outputs "Product " and "Prod code "
Part 2 would be inputting the unique IDs to the new sheet, I would suggest copying your entire column A across to a new sheet and using DATA > REMOVE DUPLICATES > Continue with current selection to trim out the multiple occurrences of unique IDs.
In column B use =INDEX(Sheet2!$B$1:$B$7,MATCH(Sheet4!$A2,Sheet2!$A$1:$A$7,0)) to get the names pulled across.
Part 3, the INDEX
Once again, we are doing a staggered input here before copying the formula across the page to cover the entirety of the data.
=IFERROR(INDEX(Sheet2!$C$1:$D$11,SMALL(IF(Sheet2!$A$1:$A$11=Sheet4!$A2,ROW(Sheet2!$A$1:$A$11)-ROW(INDEX(Sheet2!$A$1:$A$11,1,1))+1),ROUNDDOWN((COLUMN(A:A)+1)/2,0)),1),"") in C
=IFERROR(INDEX(Sheet2!$C$1:$D$11,SMALL(IF(Sheet2!$A$1:$A$11=Sheet4!$A2,ROW(Sheet2!$A$1:$A$11)-ROW(INDEX(Sheet2!$A$1:$A$11,1,1))+1),ROUNDDOWN((COLUMN(B:B)+1)/2,0)),2),"") in D
The formulas of Part 3 will need to be entered as an array (when in the formula bar, hit Ctrl+Shift+Enter). This will need to be done before copying the formulas across.
These formulas can now be dragged / copied in all directions and will feed off of the unique ID in column A.
My Answer is already rather long so I haven't gone on to break the formula down. If you have any trouble understanding how this works, let me know and I will be happy to write up a quick guide, breaking it down chunk by chunk for you.
I'm working on data from a population of people with allergies. Each person has a unique ExceptionID, and each allergen has a unique AllergenID (451 in total).
I have a data table with 2 columns (ExceptionID and AllergenID), where each person's allergies are listed row by row. This means that the ExceptionID column has repeated values for people with multiple allergies, and the AllergenID column has repeated values for the different people who have that allergy.
I am trying to count how many times each pair of allergies is present in this population (e.g. Allergen#107 & Allergen#108, Allergen#107 & Allergen#109,etc). To keep it simple I've created a matrix of 451 rows X 451 columns, representing every pair (twice actually because A/B and B/A are equivalent).
I somehow need to use the row name (allergenID) to lookup the ExceptionID in my data table, and count the cases where that matches the ExceptionIDs from the column name (also AllergenID). I have no problem using Vlookup or Index/Match, but I'm struggling with the correct combination of a lookup and Sumproduct or Countif formula.
Any help is greatly appreciated!
Mike
PS I'm using Excel 2016 if that changes anything.
-=UPDATE=-
So the methods suggested by Dirk and MacroMarc both worked, though I couldn't apply the latter to my full data set (17,000+ rows) because it was taking a long time.
I've since decided to turn this into a VBA macro because we now want to see the counts of triplets instead of pairs.
With the 2 columns you start with, it is as good as impossible... You would need to check every ExceptionID to have 2 different specific AllergenID. Better use a helper-table with ExceptionID as rows and AllergenID as columns (or the opposite... whatever you like). The helper table needs a formula like:
=COUNTIFS($A:$A,$D2,$B:$B,E$1)
Which then can be auto-filled. (The ranges are from my example, you need to change them to your needs).
With this helper-matrix you can easily go for your bigger matrix like this:
=COUNTIFS(E:E,1,INDEX($E:$G,,MATCH($I2,$E$1:$G$1,0)),1)
Again, you can auto-fill with this formula, but you need to change it, so it fits your needs.
Because the columns have the same ID2 (would be your AllergenID), there is no need to lookup them because E:E changes automatically with the auto-fill.
Most important part of the formulas are the $ which should not be messed up, or you can not auto-fill it.
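Since the asker later mentions moving to a macro, here is a rough Python sketch of the same two-step idea: first build the helper table (who has which allergen), then count how many people hold both members of each pair. The function name and the sample records are invented, and unlike the COUNTIFS version this sketch treats a repeated (person, allergen) row as a single entry.

```python
from collections import defaultdict
from itertools import combinations

def pair_counts(records):
    """records: iterable of (exception_id, allergen_id); returns {(a, b): people with both}."""
    by_person = defaultdict(set)          # helper table: person -> set of allergens
    for person, allergen in records:
        by_person[person].add(allergen)
    counts = defaultdict(int)
    for allergens in by_person.values():
        # each unordered pair is stored once, with the smaller ID first
        for a, b in combinations(sorted(allergens), 2):
            counts[(a, b)] += 1
    return dict(counts)

# Hypothetical data: (ExceptionID, AllergenID)
records = [(1, 107), (1, 108), (2, 107), (2, 108), (2, 109), (3, 109)]
print(pair_counts(records))  # {(107, 108): 2, (107, 109): 1, (108, 109): 1}
```

Swapping the 2 in combinations(...) for 3 would count triplets in the same way.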
Picture of my self-made example (formulas are from the upper left cell in each table):
If you still have any questions, just ask :)
It can be done straight from your original set-up with array formulas:
Please note that array formulas MUST be entered with Ctrl-Shift-Enter, before copying across and down:
In the example pic, I have NAMED the data ranges $A$2:$A$21 as 'People' and $B$2:$B$21 as 'Allergens' to make it a nicer set-up. You can see in the formula bar how that looks as a formula. However you could use the standard references like this in your first matrix cell:
EDIT: silly me, N function is not needed to turn the booleans into 1's and 0's, since multiplying booleans will do the trick. Below formula works...
=SUM(IF(MATCH($A$2:$A$21,$A$2:$A$21,0)=ROW($A$2:$A$21)-1, NOT(ISERROR(MATCH($A$2:$A$21&$E2,$A$2:$A$21&$B$2:$B$21,0)))*NOT(ISERROR(MATCH($A$2:$A$21&F$1, $A$2:$A$21&$B$2:$B$21,0))), 0))
Then copy from F2 across and down. It can be perhaps improved in technique with sumproduct or whatever, but it's just a rough example of the technique....