Duplicate Based on Multi-Coulmn - Only Keep the Earliest and Latest Value - excel

Below is an example of an Excel sheet I'm working with:
Basically, I'm trying to delete the duplicate rows by matching ID, Date and Type. If ID, Date and Type are the same, then, I want to only keep the record with the earliest Time in case of Type = In and the latest Time in case of Type = Out.
So, for example, in the case of ID = 1, there are 3 records for In, I only want to keep the one where Time is: 8:01 as this is the earliest. The other 2 records should be deleted.
Similarly, in the case of ID 3, I want to keep the record where Time = 18:05 as this is the later time out of the 2.
Can this be achieved by Conditional Formatting or is it more complex than that?
Many thanks for your help in advance.

Conditional Formatting does not to me look the best approach. Depending upon how often required (frequently, and VBA may be better) I suggest:
Add a sequential index (so can be sorted back to original order, if required).
Sort on Time (Smallest to Largest) within Type.
Select Type Out and sort Largest to Smallest.
Select all, in Remove Duplicates uncheck all but Name, Date and Type and OK.
Resort and delete index, if desired.

Related

Auto populate a table based from datas from another table

this is the another version of my first question and I hope I can best explain my problem this time.
From the Table 1, I want to auto populate Table 2 based on this conditions and criteria (below)
From the example, I basically have 3 initial criteria, ON CALL, AVAILABLE, and BREAK
Now for the conditions, I want all Agents from status ON CALL, AVAILABLE, BREAK from Table 1 to be populated on Table 2 (optional: If possible, I wanted only to show agents that HAS a duration of 4 minutes and above from each status). My problem is I always refresh TABLE 1 so I can get an updated data. My goal here is to monitor our agents their current Status and Running Duration, and from that I only need to check on the table 2 so I would see right away who has the highest running duration from each status to be called out.
I only tried MAXIFS function but my problem with it, I can only show 1 result from each status.
What I wanted is to fully populate Table 2 from the data on Table 1. If this is possible with ROW function that would be great, because what I really wanted is a clean Table, and it should only load data if the criteria is met.
Thank you
Something you may be interested in doing is utilizing HSTACK. I am not sure how you are currently obtaining the Agents name in the adjacent column to the results but this would populate both the Agent along with the Duration.
=HSTACK(INDEX(A:C,MATCH(SORT(FILTER(C:C,(C:C>=TIMEVALUE("00:04:00"))*(B:B=H2),""),1,1),C:C,0),1),TEXT(SORT(FILTER(C:C,(C:C>=TIMEVALUE("00:04:00"))*(B:B=H2),""),1,1),"[h]:mm:ss"))
This formula checks Table 1 for any Agent with the status referenced in H2 (Available) that also has a time greater than or equal to 4 mins. It then sorts the results in ascending order and populates the Agent Name that is associated with it. It is dynamic and will produce a table like the following:
Just update the formula to check for "On Call" and "BreaK" as desired for the other two.
UPDATE:
As for conditional formatting, this is utilizing the custom formula posted in the comments. If the formatting of the times are of [h]:mm:ss then you would be looking to do something like this. Notice the 2 cells are highlighted for being between 4 mins and 5 mins.
This is an array solution that spill all the results at once. We use a user LAMBDA function GET to avoid repetition of the same calculation using as input parameter the status (s). The formula works for durations in time format or in text format with a minor modification. On cell E2 put the following formula for durations in time format:
=LET(GET, LAMBDA(s, FILTER(HSTACK(A:A, C:C), (B:B=s)
* IFERROR(C:C >= TIME(0,4,0), FALSE))),
IFERROR(HSTACK(GET("ON CALL"), GET("Available"), GET("Break")),""))
Here is the output:
For durations as text in hh:mm:ss format just replace: C:C >= TIME(0,4,0) with TIMEVALUE(C:C) >= TIME(0,4,0).
The GET function is reused to generate the result for each status. The last IFERROR call is used to remove #N/A values generated by HSTACK when the column doesn't have the maximum number of rows of the output.
The first IFERROR is used to treat the case when the value is not numeric, such has the header. This is because we are using the entire column as input range. Using entire columns produce more concise formulas with less maintenance effort, but it is less efficient, unless you have a good reason to have an open range. If you want to use a specific range instead for the data of the table, then you can remove it and update the ranges accordingly.

Excel - Index/Match to Parse Column Data Until Desired Value Shows

I'm not sure the best way to word this question, but basically I have a big list of different names and IDs that will be visited multiple times with data pulled from a Survey123 form. One of the fields is asking if a part has been repaired, which will be no a maximum of 3 times before turning yes.
I'm using Index/Match to keep track of the dates the visits took place, but if I try it for the repair column it will always just return the first value in the repair column. Is there a way I can have it parse all the repair column values and change the result if it is Yes?
Here is a visual of what I'm trying to achieve, using Index/Match will stop at the first result rather than cycling through.
You may try below formula. If any of visit has Yes in repaired column then it will return Yes or will return No.
=IF(SUMPRODUCT((A3:A5=F3)*(B3:B5=G3)*(D3:D5="Yes"))>0,"Yes","No")
Or you can use XLOOKUP() with Search_Mode option -1 means search last to first order.
=XLOOKUP(1,(A3:A5=F3)*(B3:B5=G3),D3:D5,"",0,-1)
You can use FILTER to achieve this
=FILTER(C2:C4,(A2:A4=F4)*(B2:B4=G4)*(C2:C4="Yes"),"No")

How do I recieve the number of cells based off of three columns in Excel?

I'm not too sure how to word this problem so, I apologize for the vagueness. Here is what I am trying to do though:
I have a large Excel table with a ton of values, I however, only care about 3 columns. The three columns I have are "Project Name", "Active/Planned", and "Week of Month". Here is an example of some values I would have:
Project Name
Active/Planned
Week of Month
StoreProj
Active
2021-07 Jul-Wk1
SecProj
Planned
2021-07 Jul-Wk2
StoreProj
Active
2021-07 Jul-Wk1
Now, I have used a formula to get the number of projects based on a specific week month and avoiding duplicate values for the project name. The code I used returns an integer of the number of projects. Here is what I used:
=IFERROR(ROWS(UNIQUE(FILTER(Table[Project Name],Table[Week of Month]=2021-07 Jul-Wk1))), 0)
This works as intended. Now the issue I am running into is that I need to filter through these rows as I did previously, but now I need to include the "Active/Planned" column. So, I want to be able to see how many projects I have based off of the week of the month and return a number of projects (excluding duplicate names), but be able to filter through that integer output based off of the active/planned projects. So in a perfect scenario I can choose the week of month and if the project is "Active" or "Planned" and see the amount of projects I have.
This might be an easy fix so I apologize, I am just stumped, any help would be greatly appreciated. Thanks!
Work through that step by step, you've got the FILTER function which is giving data to the UNIQUE function, to the ROWS function, and then your IFERROR. However, the data about whether each line/row is 'Active' or 'planned' isn't passed out beyond the FILTER function, so can't be used by anything further on in the above sequence.
Boring theoretical advice out the way, try this;
=COUNT(IF(UNIQUE(FILTER( Table[[Project Name]:[Active/Planned]], Table[Week of Month] = "2021-07 Jul-wk1"))= "Active", 1))
Explanation:
FILTER(...) outputs records with the relevant date filter, however it outputs Table[[Project Name]:[Active/Planned]] - both columns, to ensure all relevant data is there.
UNIQUE(...) Then narrows that down to unique values, although by this stage I'm not 100% sure you need this.
IF(... = "Active", 1) then replaces the 'Active' outputs with 1s
COUNT() returns the number of cells in the above that contain a number (the 1s from the IF())
Yes, you can't use COUNTIF on arrays (and all except that last bullet point above are outputting arrays not single values) - and no, I didn't know that before attempting to answer this question, found it over at a different question!

Index with double Match returns incorrect closest values

I have an planning exported to Excel which looks like the following (tab ' Data'):
Each production line has a number of people working on it. Now is my goal to show how many people are working on a line per minute. We plan per product group, and several product groups combined form waht a line has to do per minute.
To get the production per minute I created the following (tab 'Conversie'):
=INDEX(Data!$H$2:$H$157;MATCH($N$1&A4;Data!$B$2:$B$157&Data!$C$2:$C$157;1))
In the example it works correct. However, the formula doesn't seem to always return the correct "Artikelomschrijving"(H) every time. I get incorrect return values when I extend this formula to other product groups.
I read that the data needs to be sorted ascending cause I use match_type 1. When I do that I get the right returns for some product groups, but the given example suddenly returns incorrect values.
I can't sort both column C and A in ascending order for the formulas to always return correct items. Can you help me to get past this hurdle?
After a little bit of google translate work, if I'm understanding your question correctly, you need to find the "Item Description" (H) of the record where the "Line" (B) = the value in N1 and the time is between the start and end times.
This is an array formula, you have to confirm it with Ctrl+Shift+Enter
=INDEX(Data!$H$2:$H$157,MATCH(1,(Data!$B$2:$B$157=$N$1)*(Data!$C$2:$C$157<$A2)*(Data!$D$2:$D$157>=$A2),0))
OR with semicolon syntax:
=INDEX(Data!$H$2:$H$157;MATCH(1;(Data!$B$2:$B$157=$N$1)*(Data!$C$2:$C$157<$A2)*(Data!$D$2:$D$157>=$A2);0))
I found the solution, thank you for pointing me in the right direction Valon Miller. This is the formula I fixed it with:
=ALS.FOUT(INDEX(Data!$H$2:$H$154;MATCH(1;(Conversie!L$1=Data!$B$2:$B$154)*((Conversie!$A32>=Data!$C$2:$C$154)*(Conversie!$A32<=Data!$D$2:$D$154));0));"")

How to find when two goals are achieved and return date?

I have two list of numbers and a date field. I am trying to return the date when a condition for each field is met. I have List A that tracks X and is increasing every week and the goal is 30,868 and then List B tracks Y and is also increasing every week with a goal of 1,688,888. I would like to find a way to return the date that both conditions are satisfied. What is the way to go about doing this? I am able to do it individually for each using Index/Match, but is there a way to use it to return the date that both goals are met? I am using Excel 2010.
Formulas:
=INDEX($B$1:$B$288,MATCH(TRUE,$B1:$B$288>30868,0))
=INDEX($D$1:$D$288,MATCH(TRUE,$D1:$D$288>1688888,0))
Lists
Individual Formula/Example
Assuming that whenever one reaches the goal first it doesn't lose any could you not use what you have in terms of index/match and get the address of the corresponding date field and then the value of that.
Then just compare that with the other index/match.
If you post your existing formula then it'd be relatively easy for another user to then help you get the corresponding date.

Resources