aggregating duplicates in excel - excel

I have an Excel file whose sample looks like the attached image. The highlighted numbers are duplicate ids. Column E counts the number of days as (C-B). However, I want just one row per id. So for id 55555 the start date should be 10/25/2017 and the end date 1/14/2018, which makes the number of days 61+19=80. I am unable to think of a formula that can do this. Any help will be greatly appreciated.

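@Karan Kashyap, enter the formula below in a helper column and drag it down. It sums the existing day counts in column E for every row with the same id, so it cannot be placed in column E itself without creating a circular reference.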
=SUMPRODUCT(--($A$2:$A$6=A2),$E$2:$E$6)

Dates are numbers, nothing more. They represent the number of days since 31-Dec-1899. All you need is MAXIFS minus MINIFS. If your version of Excel doesn't support MAXIFS or MINIFS, here are some alternatives for older versions.
'Minimum start date if id = 55555
'option 1 for xl2010 and newer
=AGGREGATE(15, 7, B$2:B$6/(A$2:A$6=A2), 1)
'option 2 for pre-xl2010
=MIN(INDEX(B$2:B$6+(A$2:A$6<>A2)*1E+99, , ))
'Maximum end date if id = 55555
'option 1 for xl2010 and newer
=AGGREGATE(14, 7, C$2:C$6/(A$2:A$6=A2), 1)
'option 2 for pre-xl2010
=MAX(INDEX(C$2:C$6-(A$2:A$6<>A2)*1E+99, , ))
I've used 0 \d\a\y\s_) for a custom number format in column E.
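The two answers above do not compute quite the same thing: the SUMPRODUCT formula adds up the per-row day counts, while MAXIFS minus MINIFS measures the span from the earliest start to the latest end. A minimal Python sketch of both follows. The inner dates for id 55555 and the whole 77777 row are hypothetical; they are chosen only so that the two 55555 rows last 61 and 19 days, as in the question.

```python
from datetime import date

# Hypothetical rows (id, start, end). Only the outer dates of id 55555 and the
# 61 and 19 day counts come from the question; everything else is assumed.
rows = [
    (55555, date(2017, 10, 25), date(2017, 12, 25)),
    (55555, date(2017, 12, 26), date(2018, 1, 14)),
    (77777, date(2018, 3, 1), date(2018, 3, 11)),
]

def summed_days(rows):
    """Sum of (end - start) per id, like the SUMPRODUCT formula."""
    totals = {}
    for id_, start, end in rows:
        totals[id_] = totals.get(id_, 0) + (end - start).days
    return totals

def span_days(rows):
    """Latest end minus earliest start per id, like MAXIFS minus MINIFS."""
    first, last = {}, {}
    for id_, start, end in rows:
        first[id_] = min(first.get(id_, start), start)
        last[id_] = max(last.get(id_, end), end)
    return {id_: (last[id_] - first[id_]).days for id_ in first}

print(summed_days(rows))
print(span_days(rows))
```

With these assumed dates the sum for 55555 is 80 and the span is 81, because the second row starts one day after the first one ends. Which figure is wanted depends on whether gaps between rows should count.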


array formula: sum to date

I have a number of invoices:
invoice #  start       end         paid on     amount  paid to date (hardcoded)
1          01/01/2020  30/01/2020  01/02/2020  £10.00  £10.10
2          01/02/2020  20/02/2020  01/03/2020  £7.50   £17.60
3          21/02/2020  30/02/2020  01/03/2020  £2.50   £20.10
4          01/01/2000  30/01/2000  01/03/2000  £0.10   £0.10
The invoices:
- are not necessarily sorted by start and end date
- are not necessarily sorted by paid date
- might not have a paid-on value
I want to add a field called paid to date that shows the amount I have been paid so far. It would add up the amounts of:
- the current invoice
- the invoices that were paid prior to this invoice's paid-on date
- the invoices with the same paid-on date as this invoice but with a start date <= this invoice's start date
The result should effectively mirror the hard-coded column.
This is how I do it with a query (which might not be the simplest/most elegant way of doing it)
=Index(
query(
A1:E10,
"select SUM(E)
where D is not null
and (
D < date '"& TEXT( D2,"yyyy-mm-dd")&"'
OR (
D = date '"& TEXT( D2,"yyyy-mm-dd")&"'
and
B <= date '"& TEXT( B2,"yyyy-mm-dd")&"'
)
)"
),
2, 1
)
That is all well and good, but I want to be able to do it with an array formula, so that it is generated automatically for every row.
I tried using it inside ARRAYFORMULA, but the value is only ever generated for the first row. I guess it is misinterpreting the range I am passing as the range of the query function, i.e. A1:E10. Is there an easy way of fixing this?
Do I need to use VLOOKUP?
Here is a sample spreadsheet.
try:
=INDEX(IFNA(VLOOKUP(A2:A, {INDEX(SORT({A2:B, D2:E}, 2, 1, 3, 1),,1),
MMULT(TRANSPOSE((ROW(E2:E)<=TRANSPOSE(ROW(E2:E)))*
INDEX(SORT({A2:B, D2:E}, 2, 1, 3, 1),,4)), SIGN(E2:E))}, 2, 0)))
If you have Excel 365, you can Filter directly.
Note that I am using a table with structured references. Then you don't need to adjust the range references as you add/remove rows from the table.
=SUM(FILTER([amount],([paid on]<[@[paid on]])+(([paid on]=[@[paid on]])*([start]<[@start])),0),[@amount])
This part does the filtering:
...([paid on]<[@[paid on]])+(([paid on]=[@[paid on]])*([start]<[@start]))...
Note: You can use the same algorithm in Sheets:
Note that I am using regular addressing, and also using a semicolon instead of a comma for the argument separators
=sum(iferror(FILTER($E$2:$E$5;($D$2:$D$5<$D2)+(($D$2:$D$5=$D2)*($B$2:$B$5<$B2)));0);E2)
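The filter logic in these answers can be checked against the hard-coded column with a short Python sketch. It is only an illustration of the rule, not a replacement for the formulas. The tuple layout and the function name `paid_to_date` are assumptions; the end date is left out because the rule never uses it.

```python
from datetime import date
from decimal import Decimal

# (invoice #, start, paid on, amount), taken from the sample table.
invoices = [
    (1, date(2020, 1, 1), date(2020, 2, 1), Decimal("10.00")),
    (2, date(2020, 2, 1), date(2020, 3, 1), Decimal("7.50")),
    (3, date(2020, 2, 21), date(2020, 3, 1), Decimal("2.50")),
    (4, date(2000, 1, 1), date(2000, 3, 1), Decimal("0.10")),
]

def paid_to_date(rows):
    """Own amount plus every invoice paid earlier, or paid on the same day
    with an earlier start date. Unpaid invoices (paid on is None) get None."""
    result = {}
    for num, start, paid_on, amount in rows:
        if paid_on is None:
            result[num] = None
            continue
        total = amount
        for other_num, o_start, o_paid, o_amount in rows:
            if other_num == num or o_paid is None:
                continue
            if o_paid < paid_on or (o_paid == paid_on and o_start < start):
                total += o_amount
        result[num] = total
    return result

print(paid_to_date(invoices))
```

Like the FILTER formulas, the sketch compares start dates with a strict `<` and adds the current invoice separately, so two invoices that share both the paid-on date and the start date would not count each other.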

Is there a way to distribute data according to a logic in Excel vba?

I have an Excel sheet with the below data.
There are 10,000 Data rows.
9000 are of "USA" & 1000 are of "Other" country.
I want to distribute the data evenly, so that every 9 "USA" rows are followed by 1 "Other" row throughout the sheet.
Name    Country
Alice   USA
Brook   Other
Cathy   USA
David   USA
Esther  Other
Freddy  USA
Galin   USA
Henry   Other
Indigo  USA
Jenny   USA
Kalin   Other
Linda   USA
How do I accomplish this manually and with Excel VBA? I would appreciate both solutions. Thanks.
This can be achieved with a formula if you have the newest version of Excel.
Try something like (adapt ranges and what you are filtering on as necessary):
=LET(x, FILTER($B$1:$C$12, $C$1:$C$12="a"),
y, FILTER($B$1:$C$12, $C$1:$C$12="b"),
z, ROW(D1:D12), myrows, MAX(z),
ratio, MAX((COUNTA(x)/2)/(COUNTA(y)/2), (COUNTA(y)/2)/(COUNTA(x)/2))+1,
IF(MOD(z,ratio)<>0,
INDEX(x, IF(MOD(SEQUENCE(myrows),ratio)=0, 0, SEQUENCE(myrows)-CEILING(ROW(G1:G12)/ratio-1,1)), SEQUENCE(1,2)),
INDEX(y, IF(MOD(SEQUENCE(myrows),ratio)<>0,0,SEQUENCE(myrows)/ratio), SEQUENCE(1,2))))
The trick is to create the "correct" index sequence for each result. For the first array you want to skip every nth row (in your case every 10th) and have the row after it continue the count rather than jump ahead by one, while for the second array you want to skip every row that isn't a multiple of n and have the nth rows count sequentially.
One caveat: as is, I don't believe the formula will work with a repetition other than 1, i.e. if you want something like 8 rows followed by 2 rows, this won't work.
This works even with older Excel versions:
If this is your data:
Add a Sort column with the following formula in C2 and pull it down:
=IF(B2="USA",COUNTIF($B$2:B2,"USA")+INT((COUNTIF($B$2:B2,"USA")-1)/ROUNDUP(COUNTIF(B:B,"USA")/(COUNTA(B:B)-COUNTIF(B:B,"USA")),0)),COUNTIF($B$2:B2,"Other")*(ROUNDUP(COUNTIF(B:B,"USA")/(COUNTA(B:B)-COUNTIF(B:B,"USA")),0)+1))
Then sort by column C, and the USA and Other rows end up evenly spread.
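To see what the sort-key formula does, here is a minimal Python sketch of the same arithmetic applied to the twelve sample rows. The function name `spread_sort_keys` is an assumption. Unlike the worksheet formula, which uses COUNTA(B:B) and therefore also counts the header cell, the sketch counts data rows only.

```python
import math

rows = [
    ("Alice", "USA"), ("Brook", "Other"), ("Cathy", "USA"), ("David", "USA"),
    ("Esther", "Other"), ("Freddy", "USA"), ("Galin", "USA"), ("Henry", "Other"),
    ("Indigo", "USA"), ("Jenny", "USA"), ("Kalin", "Other"), ("Linda", "USA"),
]

def spread_sort_keys(countries, major="USA"):
    """Sort key per row: majority rows count upwards but leave a gap after
    every `ratio` rows; minority rows land exactly in those gaps."""
    n_major = sum(1 for c in countries if c == major)
    n_minor = len(countries) - n_major  # assumed to be at least 1
    ratio = math.ceil(n_major / n_minor)
    keys = []
    seen_major = seen_minor = 0
    for c in countries:
        if c == major:
            seen_major += 1
            keys.append(seen_major + (seen_major - 1) // ratio)
        else:
            seen_minor += 1
            keys.append(seen_minor * (ratio + 1))
    return keys

keys = spread_sort_keys([country for _, country in rows])
ordered = [row for _, row in sorted(zip(keys, rows))]
for name, country in ordered:
    print(name, country)
```

The sample has 8 USA rows and 4 Other rows, so the ratio is 2 and the sorted result repeats the pattern USA, USA, Other. With 9000 USA rows and 1000 Other rows the same arithmetic gives a ratio of 9.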

Combining formula using and / or in excel

Case 1
I have a set of data for which I need to determine whether each cell falls within business hours.
8 - 18 (08:00 - 18:00) is Business Hours (BH).
Anything outside that timeframe is Non-Business Hours (NBH).
For example, a cell value of 7 is NBH.
Here is the formula I created: =IF(AND(C2>=8,C2<=18),"BH","NBH")
Case 2
I have a set of data with the days of the week, and I need to determine whether each cell is a weekday or a weekend.
I have this formula: =IF(OR(I2="Saturday",I2="Sunday"),"NBH","BH")
Note: I used the same labels here, NBH for weekends and BH for weekdays.
What I really need is to combine those two cases into one formula.
The combined formula should handle these scenarios correctly:
Time is 08:00, day is Saturday/Sunday: output "NBH"
Time is 07:00, day is Monday-Friday: output "NBH"
Time is 12:00, day is Monday-Friday: output "BH"
The formulas can be seen in the columns BH/NBH WEEKDAYS and BH/NBH Weekends of the attached file. Thanks!
If you want to calculate it directly from the initial values:
=IF(OR(H2="Saturday", H2="Sunday", B2>18, B2<8), "NBH", "BH")
P.S. Alternatively, if you intend to keep the already calculated columns, you can combine them:
=IF(AND((E2="BH"), (D2="BH")), "BH", "NBH")
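The combined rule is small enough to check against the three scenarios from the question. This is a minimal Python sketch of the same logic as the first formula; the function name `classify` and its parameters are assumptions.

```python
def classify(hour, day_name):
    """Return "NBH" for weekends or hours outside 8..18, otherwise "BH"."""
    if day_name in ("Saturday", "Sunday") or hour < 8 or hour > 18:
        return "NBH"
    return "BH"

print(classify(8, "Saturday"))
print(classify(7, "Monday"))
print(classify(12, "Wednesday"))
```

As in the formulas above, both 8 and 18 count as business hours.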

How to count occurrences of each year in Excel?

I want to count which cells have a specific year in their date/string.
My problem is that my formula only works for valid dates; some cells have the month or day missing or are completely blank.
Here are some examples of values I want to be able to count:
2002-07-?
2010-11-27
2009-10-21
2009-10-21
2004-12-20
2004-11-07
2010-11-?
2004-09-17
2000-?-?
2005-04-26
This is how I want the output to be:
Unknown 2
2000 1
2001 0
2002 1
2003 0
2004 3
2005 1
2006 0
2007 0
2008 0
2009 2
2010 2
If I use =COUNTIF(A1:A12;"2000*") I only get those cells which are strings. Is there a way I could count both dates and strings?
Use a helper column and use the following formula to extract the year:
=IF(ISTEXT(A1);LEFT(A1;4);TEXT(A1;"YYYY"))
Then point your existing =COUNTIF() formula at the helper column (assumed here to be column B) and drop the wildcard *:
=COUNTIF(B1:B12;"2000")
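The helper-column idea is to normalise every cell to a four-character year first and count afterwards. A minimal Python sketch of that idea follows. It assumes valid dates arrive as date objects, partial dates as strings, and blanks as None; the two trailing blanks are an assumption based on the range A1:A12 and the "Unknown 2" line in the desired output.

```python
from collections import Counter
from datetime import date

cells = [
    "2002-07-?", date(2010, 11, 27), date(2009, 10, 21), date(2009, 10, 21),
    date(2004, 12, 20), date(2004, 11, 7), "2010-11-?", date(2004, 9, 17),
    "2000-?-?", date(2005, 4, 26),
    None, None,  # assumed blank cells
]

def year_of(cell):
    """Year as text for real dates and for strings starting with 4 digits."""
    if isinstance(cell, date):
        return str(cell.year)
    if isinstance(cell, str) and cell[:4].isdigit():
        return cell[:4]
    return "Unknown"

counts = Counter(year_of(c) for c in cells)
print("Unknown", counts["Unknown"])
for year in range(2000, 2011):
    print(year, counts[str(year)])
```

A Counter returns 0 for years that never occur, which matches the zero rows in the desired output.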
I haven't got Excel to hand to test this, but I believe you can add another column that converts the value to text, I think with =TEXT(A1,"<format>"), and then run your COUNTIF on that column.
EDIT: Forgot about the second argument. I'm surprised it didn't work with the 'yyyy' argument though.

MS Excel: Using AGGREGATE to add up all mileage in each month

I have the following data in a logbook format:
DATE MILEAGE
02-Jul-13 15
05-Jul-13 12
09-Jul-13 156
10-Aug-13 20
11-Aug-13 20
12-Aug-13 232
12-Aug-13 20
13-Aug-13 265
15-Aug-13 20
18-Aug-13 20
I am looking to extract data from it.
I need to ignore errors and #N/A so I have been trying to use the AGGREGATE function. To no avail though.
I would like to present the following information:
Mileage this month -
=AGGREGATE(9,7, IF(MONTH(IFERROR(LogBookTable[Date], 0)) = MONTH(TODAY()), LogBookTable[Total KM], 0)) - Does not work
Mileage in July -
=AGGREGATE(9,7, IF(MONTH(IFERROR(LogBookTable[Date], 0)) = MONTH(7), LogBookTable[Total KM], 0)) - Does not work
Mileage in August -
=AGGREGATE(9,7, IF(MONTH(IFERROR(LogBookTable[Date], 0)) = MONTH(8), LogBookTable[Total KM], 0)) - Does not work
Total Mileage -
=AGGREGATE(9,7,LogBookTable[Total KM]) - This works
The monthly mileage and current month mileage all result in a "#VALUE!" being displayed.
Any assistance would be much appreciated.
Just in case anyone asks, the naming schemes are corrected, it's not the references that aren't working, it's the values.
You could use DSUM. If you had two criteria cells setup Say in D1:E2 as follows:
Date          Date
>=1/07/2013   <=31/07/2013
and if your data was in A1:B11 (as per your example above), to return a sum for July ignoring errors, use the following formula:
=DSUM(A1:B11,2,D1:E2)
You could set up the criteria so that they update based on the current month. The month end (inclusive) is calculated using:
="<="& DATE(YEAR(TODAY()),MONTH(TODAY())+1,0)
and the month start (inclusive) by:
=">="& DATE(YEAR(TODAY()),MONTH(TODAY()),1)
Remember that criteria on the same row are combined with AND, while criteria on separate rows are combined with OR.
You can use SUMIF if you add a column representing the month of the date:
Column B contains =MONTH of Column A. All you have to do is update the DATE entry in A16 with a new date. B16 is accordingly updated with the appropriate month, creating a new monthly aggregate/sum.
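Both answers come down to grouping the mileage by month while skipping rows that hold errors. A minimal Python sketch of that grouping on the logbook data from the question follows. The last row, with an "#N/A" mileage, is hypothetical and only shows how a bad entry is ignored; the function name `mileage_by_month` is an assumption.

```python
from collections import defaultdict
from datetime import date

log = [
    (date(2013, 7, 2), 15), (date(2013, 7, 5), 12), (date(2013, 7, 9), 156),
    (date(2013, 8, 10), 20), (date(2013, 8, 11), 20), (date(2013, 8, 12), 232),
    (date(2013, 8, 12), 20), (date(2013, 8, 13), 265), (date(2013, 8, 15), 20),
    (date(2013, 8, 18), 20),
    (date(2013, 8, 19), "#N/A"),  # hypothetical error row, not in the question
]

def mileage_by_month(entries):
    """Total mileage per (year, month), ignoring rows without a valid date
    or without a numeric mileage."""
    totals = defaultdict(int)
    for day, km in entries:
        if not isinstance(day, date) or not isinstance(km, (int, float)):
            continue
        totals[(day.year, day.month)] += km
    return dict(totals)

print(mileage_by_month(log))
```

Keying on both year and month avoids mixing, say, July 2013 with July 2014, which a plain MONTH() comparison would do once the logbook spans more than one year.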
