Using VBA to calculate latest dates of certain tasks? - excel

I am using a spreadsheet to log the completed and in-progress tasks of a project. I want to write some VBA code that can identify the latest delivery date within a task. However, each task contains various sub tasks.
The boundaries are the tasks, which are whole numbers; the sub tasks sit between consecutive whole numbers, e.g. between 46 and 47.
The latest date needs to be calculated by examining the dates of the sub tasks between each pair of whole numbers, e.g. 46.1, 46.2, 46.3 etc.
Would I be better off using Excel functions, or would it be easier to use code?
For example, this is the Excel formula I would write from VBA:
Worksheets("Activity Overview").Cells(n, "E").Formula = "=IFERROR(IF(AGGREGATE(14,7,'Sub Tasks'!S:S/(('Sub Tasks'!A:A>='Activity Overview'!A" & n & ")*('Sub Tasks'!A:A<'Activity Overview'!A" & (n + 1) & ")),1),AGGREGATE(14,7,'Sub Tasks'!S:S/(('Sub Tasks'!A:A>='Activity Overview'!A" & n & ")*('Sub Tasks'!A:A<'Activity Overview'!A" & (n + 1) & ")),1),""""),"""")"

If one does not have MAXIFS then use AGGREGATE:
AGGREGATE is an array type formula and as such the references should be limited to the data range.

Here's a solution using SUMPRODUCT. It basically filters all values between the base value (>=49) and less than the base value plus one (<50).

You can also do this using Power Query aka Get & Transform available in Excel 2010+
Get Data from Table/Range
Change the Date column to Date format (it defaults to DateTime)
Create a custom column which is the Integer of the Task column and Name it Main Task
Delete the original Task column
Group By the Main Task column
New Column Name Latest Date
Operation: Max
Column: Target Delivery Date
If there are any nulls in the Latest Date Column (from tasks with no delivery dates), you can either leave them, or filter them out.
let
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Task", type number}, {"Target Delivery Date", type date}}),
    // Number.RoundDown keeps the integer part (46.7 -> 46); Int64.From would round 46.7 up to 47
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Main Task", each Number.RoundDown([Task]), Int64.Type),
    #"Grouped Rows" = Table.Group(#"Added Custom", {"Main Task"}, {{"Latest Date", each List.Max([Target Delivery Date]), type date}}),
    #"Filtered Rows" = Table.SelectRows(#"Grouped Rows", each ([Latest Date] <> null))
in
    #"Filtered Rows"
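The grouping logic above can also be read as a short, language-neutral sketch. The Python below is only an illustration, not Power Query code: the function name and the sample rows are made up, and it assumes task numbers are non-negative so that the floor of the number is the main task.

```python
import math
from datetime import date

def latest_date_per_main_task(rows):
    """rows: iterable of (task_number, delivery_date or None).

    Returns {main_task: latest delivery date}. Tasks without any
    delivery date are left out, like the final filter step.
    """
    latest = {}
    for task, delivered in rows:
        if delivered is None:
            continue  # no delivery date logged for this sub task
        main = math.floor(task)  # 46.7 -> 46 (integer part, not rounding)
        if main not in latest or delivered > latest[main]:
            latest[main] = delivered
    return latest

# Hypothetical sample data
rows = [
    (46, date(2020, 1, 10)),
    (46.1, date(2020, 2, 1)),
    (46.7, date(2020, 1, 20)),
    (47, None),
    (47.2, date(2020, 3, 5)),
]
result = latest_date_per_main_task(rows)
```

Note that 46.7 lands in group 46 here, which is why the integer part has to be taken by rounding down rather than rounding to nearest.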


Translate Excel Formula to Power Query

In my Power Query I have a column that shows different durations for certain items, but it returns an error when I try to convert it to a time or duration type.
As a workaround, I created a formula next to my Excel Table that converts the duration into the format I want, but I have not been able to translate the formula into a language that Power Query can understand (I am pretty new to Power Query).
This is how the data is pulled from source:
But I would like it to show like this:
The Excel Formula I am using to accomplish this is:
It would be nice to have it in Power Query instead of the Excel sheet, as it serves as a learning opportunity.
I am teaching myself Power Query in Excel, so any help is welcome.
EDIT: If the duration is more than 24:00:00, how should I approach it?
Here is the error code it returns
You can add a custom column with the formula:
Duration.FromText(Text.Combine(List.LastN({"00"} & List.ReplaceValue(Text.Split([Age],":"),"","00",Replacer.ReplaceValue),3),":"))
The formula
Splits the text string by the colon into a List
Replaces blanks with 00 and prepends a {"00"} element to the list
Retrieves the last three elements and combines them into a colon-separated text string
Uses the Duration.FromText function to convert that string to a duration
Then set the data type of the column to duration.
In the PQ Editor, a duration is displayed in the format d.hh:mm:ss, but when you load it back into Excel, you can change that to [hh]:mm:ss
You can accomplish the above all in the PQ User Interface.
Here is M-Code that does the same thing:
let
    Source = Excel.CurrentWorkbook(){[Name="Table16"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Age", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Duration", each Duration.FromText(
        Text.Combine(List.LastN(
            {"00"} & List.ReplaceValue(Text.Split([Age],":"),"","00",Replacer.ReplaceValue),
            3), ":")), type duration),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Age"})
in
    #"Removed Columns"
You can even do it (using M-Code in the Advanced Editor) without adding a column by using the Table.TransformColumns function:
let
    Source = Excel.CurrentWorkbook(){[Name="Table16"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Age", type text}}),
    #"Change to Duration" = Table.TransformColumns(#"Changed Type",
        {"Age", each Duration.FromText(
            Text.Combine(List.LastN(
                {"00"} & List.ReplaceValue(Text.Split(_,":"),"","00",Replacer.ReplaceValue),
                3), ":")), type duration})
in
    #"Change to Duration"
All result in:
With your modified data, now showing duration values of more than 23 hours (not allowed in a duration literal in PQ), the transformation will be different. We have to check the hours and break it into days and hours if it is more than 23.
Note: the below edit also assumes there will never be anything entered in the day location; and that entries for minutes and seconds will always be within range. If there might be day values, you will need to just add what's there to the "overflow" from the hours entry
So we change the Custom Column formula to check for that:
let
    split = List.LastN({"00","00"} & List.ReplaceValue(Text.Split([Age],":"),"","00",Replacer.ReplaceValue),4),
    s = Number.From(List.Last(split)),
    m = Number.From(List.LastN(split,2){0}),
    hTotal = Number.From(List.LastN(split,3){0}),
    h = Number.Mod(hTotal,24),
    d = Number.IntegerDivide(hTotal,24)
in
    #duration(d,h,m,s)
If you might have illegal values for minutes or seconds, you can add logic to check for that as well.
Also, if you will be loading this into Excel and the total might exceed 31 days, you will need to format it (in Excel) as [hh]:mm:ss, because with the format d.hh:mm:ss Excel cannot display more than 31 days (although the proper value will still be stored in the cell).
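For readers who want to check the pad, split and overflow logic outside Power Query, here is a minimal Python sketch of the same idea. It is an illustration only: the function name parse_age and the sample strings are assumptions, and like the note above it assumes no day part is present and that minutes and seconds are in range.

```python
from datetime import timedelta

def parse_age(text):
    """Convert 'ss', 'mm:ss', ':ss' or 'hh:mm:ss' text to a timedelta."""
    parts = text.split(":")
    # Replace blank elements with "00", as the List.ReplaceValue step does
    parts = [p if p != "" else "00" for p in parts]
    # Prepend padding and keep only the last three elements: h, m, s
    padded = (["00", "00"] + parts)[-3:]
    total_hours, minutes, seconds = (int(p) for p in padded)
    # Split hours above 23 into whole days plus remaining hours
    days, hours = divmod(total_hours, 24)
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)

short = parse_age("5:30")        # minutes and seconds only
blank = parse_age(":45")         # blank minutes part
long_run = parse_age("30:15:00") # more than 23 hours
```

Here "30:15:00" becomes 1 day, 6 hours and 15 minutes, which mirrors the d and h values computed with Number.IntegerDivide and Number.Mod.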

A Frustrating Set of Variables in Excel

My industry (aftermarket auto components) utilizes a data standard for digital distribution, and I am currently attempting to create a living reference document, formatted with the correct information in the correct way, to make updating our standard database a less time consuming process.
My company has a 'Master Data Sheet' which contains every piece of data for all of the 20k+ products that we sell. All of our pricing and tracking sheets call cells or ranges from the Master Sheet, in addition to most of our front-facing web presence.
Here's my problem. The standard requires that our marketing descriptions be broken into separate lines with a specific identifier code and grouped by item ID:
Item ID    Desc Code    Desc
CHD001A    AAA          Brake Kit
CHD001A    BAA          Cross-drilled...
CHD001A    BAA          All of our...
CAE221B    AAA          Replacement Part
CAE221B    BAA          Reinforced with...
Our Master Data sheet has a different structure:
Item ID    Desc - AAA          Desc - BAA            Desc - BAA
CHD001A    Brake Kit           Cross drilled...      All of our...
CAE221B    Replacement Part    Reinforced with...
I'm completely stuck on how to get the right info into the right slots. I CANNOT alter the structure of the Master Sheet or I will have to remap at least thirty other spreadsheets. A VLOOKUP won't work in the horizontal way it needs to, and IF statements will get 20 nests in and still lack a good way to group things. Please help.
Assuming that your task is to find the description of item CHD001A in the master db, and that you know the description code, you can use INDEX/MATCH. Here's the setup I used for developing the formula.
I created a simulation of your master in A1:D4 (one row more than the example in your question).
I assigned G2 as the cell where I would enter the Item ID and G3 to enter the Desc Code.
Now the formula =IFERROR(MATCH(G2,A1:A4,0), 1) finds the sheet row number by Item ID and =IFERROR(MATCH("Desc - " & G3,A1:D1,0),1) finds the sheet column number by Desc code. Note that both formulas default to 1 if not found.
Now the formula below will return the description.
=INDEX(A1:D4,IFERROR(MATCH(G2,A1:A4,0), 1),IFERROR(MATCH("Desc - " & G3,A1:D1,0),1))
Observe that the db range A1:D4 includes the captions and both range A1:A4 and A1:D1 start from the extreme top or left. This enables a column or row caption to be displayed in case of error (when an Item ID or Desc Code isn't found).
The formula isn't perfect yet but the method is. I take it that you will be able to tweak it to optimize adaptation to your needs. Let me know if you need help or advice with that.
Very simple to do with Power Query, available in Excel 2010+
Select some cell in the Master Table
If this is not a real Table, it will be changed into one
Data / Get&Transform / From Table/Range
In the PQ editor window that opens, select the Item Id column
Unpivot other columns
Split the resultant Attribute column by Transition from non-digit to digit
This will get rid of the automatically created suffixes caused by creating a table with initially identical column headers
If there are digits in the code itself, you'll need to remove the terminal digit(s) using a custom column with a formula.
The best way to do that would depend on the actual structure of your values
Delete the Attribute.2 column (the one with the terminal digits)
Rename the columns appropriately
Here is the generated M Code.
You can just paste this into the Advanced Editor of PQ. If you do, be sure to change the Table Name in Line 2 to whatever your real table name for the Master Data turns out to be.
let
    Source = Excel.CurrentWorkbook(){[Name="Master"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Item ID", type text}, {"Desc - AAA", type text}, {"Desc - BAA", type text}, {"Desc - BAA2", type text}}),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Item ID"}, "Attribute", "Value"),
    #"Split Column by Character Transition" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", Splitter.SplitTextByCharacterTransition((c) => not List.Contains({"0".."9"}, c), {"0".."9"}), {"Attribute.1", "Attribute.2"}),
    #"Removed Columns" = Table.RemoveColumns(#"Split Column by Character Transition",{"Attribute.2"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Attribute.1", "Desc Code"}, {"Value", "Desc"}}),
    #"Filtered Rows" = Table.SelectRows(#"Renamed Columns", each ([Desc] <> ""))
in
    #"Filtered Rows"
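The unpivot steps above can be sketched in a few lines of Python to show what happens to each row. This is only an illustration of the reshaping, not the Power Query implementation: the function name and sample data are made up, and stripping the "Desc - " prefix from the header is an extra assumption that the M code above does not do.

```python
import re

def unpivot(header, rows):
    """Turn one wide row per item into one (item_id, desc_code, desc) row per description."""
    long_rows = []
    for row in rows:
        item_id = row[0]
        for column_name, value in zip(header[1:], row[1:]):
            if value in (None, ""):
                continue  # skip empty description cells
            code = re.sub(r"^Desc - ", "", column_name)  # assumed header prefix
            code = re.sub(r"\d+$", "", code)  # drop auto-added suffix, e.g. BAA2 -> BAA
            long_rows.append((item_id, code, value))
    return long_rows

# Hypothetical copy of the Master layout from the question
header = ["Item ID", "Desc - AAA", "Desc - BAA", "Desc - BAA2"]
rows = [
    ["CHD001A", "Brake Kit", "Cross-drilled...", "All of our..."],
    ["CAE221B", "Replacement Part", "Reinforced with...", None],
]
long_rows = unpivot(header, rows)
```

As the answer notes, removing trailing digits this way is only safe when the description codes themselves never end in a digit.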

Excel Power Pivot - Overlapping Date Ranges

I have a file with 50,000 lines of data in 3 columns- Unique ID, Start Date, and End Date.
Using Power Pivot, I need to determine if any records with the same Unique ID have any overlapping dates. Each Unique ID appears about 5 times.
In Excel, I would use a SUMPRODUCT formula:
=SUMPRODUCT(($B3<=$C$3:$C$13)*($C3>=$B$3:$B$13)*($A$3:$A$13=A3))>1
While this formula works really well in Excel on small ranges, with 50k+ records it brings my computer to a halt.
How would I perform the same calculation in Power Pivot / Power Query?
Example of the data and calculation.
Thank you so much!
The following Power Query M code should solve your problem. I don't know how long it will take for 50k rows:
let
    Quelle = Excel.CurrentWorkbook(){[Name="tab_Dates"]}[Content],
    Change_Type = Table.TransformColumnTypes(Quelle,{{"Unique ID", type text}, {"Start Date", type date}, {"End Date", type date}}),
    add_List_Dates = Table.AddColumn(Change_Type, "List_Dates", each List.Dates([Start Date], Duration.Days([End Date]-[Start Date])+1 , #duration(1,0,0,0))),
    expand_List_Dates = Table.ExpandListColumn(add_List_Dates, "List_Dates"),
    // For each expanded row, count the rows that share its Unique ID and date
    add_CountIF_ID_Date = Table.AddColumn(expand_List_Dates, "CountIF_ID_Date", (CountRows) =>
        Table.RowCount(Table.SelectRows(expand_List_Dates, each
            ([Unique ID] = CountRows[Unique ID] and [List_Dates] = CountRows[List_Dates])))),
    Change_Type_2 = Table.TransformColumnTypes(add_CountIF_ID_Date,{{"CountIF_ID_Date", type text}}),
    ChangeValue_CountIF_ID_Date = Table.ReplaceValue(Change_Type_2, each [CountIF_ID_Date], each if [CountIF_ID_Date] <> "1" then "FALSE" else "TRUE",Replacer.ReplaceText,{"CountIF_ID_Date"}),
    Remove_Column_List_Dates = Table.RemoveColumns(ChangeValue_CountIF_ID_Date,{"List_Dates"}),
    Remove_Duplicates = Table.Distinct(Remove_Column_List_Dates)
in
    Remove_Duplicates
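Because the query above expands every range into one row per day and then scans the whole table for each row, it may be slow on 50k records. As a point of comparison, here is a sketch of a different technique (not the one used in the answer): sort each ID's ranges by start date and compare each start with the latest end date seen so far. It is written in Python purely to show the logic; the function name and sample records are assumptions, and end dates are treated as inclusive, as in the SUMPRODUCT formula.

```python
from collections import defaultdict
from datetime import date

def flag_overlaps(records):
    """records: list of (unique_id, start, end).

    Returns a list of booleans aligned with records: True when the
    record overlaps at least one other record with the same ID.
    """
    by_id = defaultdict(list)
    for index, (uid, start, end) in enumerate(records):
        by_id[uid].append((start, end, index))

    flags = [False] * len(records)
    for intervals in by_id.values():
        intervals.sort()  # order by start date, then end date
        max_end, max_index = None, None  # latest end seen so far and its owner
        for start, end, index in intervals:
            if max_end is not None and start <= max_end:
                flags[index] = True      # overlaps an earlier range
                flags[max_index] = True  # and that earlier range overlaps it
            if max_end is None or end > max_end:
                max_end, max_index = end, index
    return flags

# Hypothetical sample data
records = [
    ("A", date(2020, 1, 1), date(2020, 1, 10)),
    ("A", date(2020, 1, 5), date(2020, 1, 15)),
    ("A", date(2020, 2, 1), date(2020, 2, 5)),
    ("B", date(2020, 1, 1), date(2020, 1, 31)),
]
flags = flag_overlaps(records)
```

The design choice is that each ID's ranges are sorted once and scanned once, instead of comparing every row with every other row.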

How do I calculate Percentiles in PowerQuery based on grouping variables?

I have a few columns of data, and I need to reproduce Excel's "PERCENTILE" function in Power Query.
I have some code that I add as a function, but it doesn't give accurate results because it doesn't allow for grouping the data by CATEGORY and YEAR. Anything that is in Full Discretionary 1.5-2.5 AND 2014 needs to be added to one percentile array; equally, anything that falls in Full Discretionary 2.5-3.5 AND 2014 needs to go into a different percentile array.
let
    Source = (list as any, k as number) => let
        Source = list,
        #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
        #"Sorted Rows" = Table.Sort(#"Converted to Table",{{"Column1", Order.Ascending}}),
        #"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 1, 1),
        #"Added Custom" = Table.AddColumn(#"Added Index", "TheIndex", each Table.RowCount(#"Converted to Table")*k/100),
        #"Filtered Rows" = Table.SelectRows(#"Added Custom", each [Index] >= [TheIndex] and [Index] <= [TheIndex]+1),
        Custom1 = List.Average(#"Filtered Rows"[Column1])
    in
        Custom1
in
    Source
So Expected results would be that anything that matches off on the 2 columns (Year,Category) should be applied within the same array. Currently invoking the above function just gives me errors.
I have also tried using grouping and outputting the "Min, Median, and Max" outputs but I also require 10% and 90% Percentiles.
Thank you in advance
Based on some findings on other websites and a lot of googling (most folk just want to use DAX, but if you're only using Power Query you can't!), someone posted an answer which is very helpful:
//PercentileInclusive Function
(inputSeries as list, percentile as number) =>
let
    SeriesCount = List.Count(inputSeries),
    PercentileRank = percentile*(SeriesCount-1)+1, //percentile value between 0 and 1
    PercentileRankRoundedUp = Number.RoundUp(PercentileRank),
    PercentileRankRoundedDown = Number.RoundDown(PercentileRank),
    Percentile1 = List.Max(List.MinN(inputSeries,PercentileRankRoundedDown)),
    Percentile2 = List.Max(List.MinN(inputSeries,PercentileRankRoundedUp)),
    Percentile = Percentile1+(Percentile2-Percentile1)*(PercentileRank-PercentileRankRoundedDown)
in
    Percentile
The above will replicate the PERCENTILE function found within Excel - you pass this as a query using "New Query" and advanced editor. Then call it in after grouping your data -
Table.Group(RenamedColumns, {"Country"}, {{"Sales Total", each List.Sum([Amount Sales]), type number}, {"95 Percentile Sales", each List.Average([Amount Sales]), type number}})
In the above formula, RenamedColumns is the name of the previous step in the script. Change the name to match your actual case. I've assumed that the pre-grouping sales amount column is "Amount Sales". Names of grouped columns are "Sales Total" and "95 Percentile Sales".
Next, modify the group formula, substituting List.Average with PercentileInclusive:
Table.Group(RenamedColumns, {"Country"}, {{"Sales Total", each List.Sum([Amount Sales]), type number}, {"95 Percentile Sales", each PercentileInclusive([Amount Sales],0.95), type number}})
This worked for my data set and matches similar
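To make the interpolation and the two-column grouping concrete, here is a small Python sketch of the same inclusive-percentile calculation, grouped by (category, year) as the question asks. It is an illustration under assumptions, not Power Query code: the function names and sample rows are hypothetical, and each group is assumed to hold at least one value.

```python
import math
from collections import defaultdict

def percentile_inclusive(values, p):
    """Inclusive percentile with linear interpolation; p is between 0 and 1."""
    ordered = sorted(values)
    rank = p * (len(ordered) - 1) + 1  # 1-based fractional rank
    lower = math.floor(rank)
    upper = math.ceil(rank)
    low_value = ordered[lower - 1]
    high_value = ordered[upper - 1]
    # Interpolate between the two neighbouring values
    return low_value + (high_value - low_value) * (rank - lower)

def grouped_percentiles(rows, p):
    """rows: iterable of (category, year, value). Returns {(category, year): percentile}."""
    groups = defaultdict(list)
    for category, year, value in rows:
        groups[(category, year)].append(value)
    return {key: percentile_inclusive(vals, p) for key, vals in groups.items()}

# Hypothetical sample data
rows = [
    ("Full Discretionary 1.5-2.5", 2014, 1),
    ("Full Discretionary 1.5-2.5", 2014, 2),
    ("Full Discretionary 1.5-2.5", 2014, 3),
    ("Full Discretionary 1.5-2.5", 2014, 4),
    ("Full Discretionary 2.5-3.5", 2014, 10),
    ("Full Discretionary 2.5-3.5", 2014, 20),
]
p90 = grouped_percentiles(rows, 0.9)
```

In Power Query the equivalent grouping would list both columns in the Table.Group key, for example {"Category", "Year"}, where those column names are assumed from the question.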

Power Query Adding a row that sums up previous columns

I'm trying to create a query that sums up a column of values and puts the sum as a new row in the same table. I know I can do this using the group function but it doesn't do it exactly as I need it to do. I'm trying to create an accounting Journal Entry and I need to calculate the offsetting for a long list of debits. I know this is accountant talk. Here's a sample of the table I am using.
Date    GL Num    GL Name    Location    Amount
1/31    8000      Payroll    Office      7000.00
1/31    8000      Payroll    Remote      1750.00
1/31    8000      Payroll    City        1800.00
1/31    8010      Taxes      Office      600.00
1/31    8010      Taxes      Remote      225.00
1/31    8010      Taxes      City        240.00
1/31    3000      Accrual    All         (This needs to be the negative sum of all other rows)
I have been using the Group By functions and grouping by Date with the result being the sum of Amount but that eliminates the previous rows and the four columns except Date. I need to keep all rows and columns, putting the sum in the same Amount column if possible. If the sum has to be in a new column, I can work with that as long as the other columns and rows remain. I also need to enter the GL Num, GL Name, and Location values for this sum row. These three values will not change. They will always be 3000, Accrual, All. The date will change based upon the date used in the actual data. I would prefer to do this all in Power Query (Get & Transform) if possible. I can do it via VBA but I'm trying to make this effortless for others to use.
What you can do is calculate the accrual rows in a separate query and then append them.
Duplicate your query.
Group by Date and sum over Amount. This should return the following:
Date    Amount
1/31    11615
Multiply your Amount column by -1. (Transform > Standard > Multiply)
Add custom columns for GL Num, GL Name and Location with the fixed values you choose.
Date    Amount    GL Num    GL Name    Location
1/31    -11615    3000      Accrual    All
Append this table to your original. (Home > Append Queries.)
You can also roll this all up into a single query like this:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    // Amount is typed as number so that cents are not rounded away
    OriginalTable = Table.TransformColumnTypes(Source,{{"Date", type date}, {"GL Num", Int64.Type}, {"GL Name", type text}, {"Location", type text}, {"Amount", type number}}),
    #"Grouped Rows" = Table.Group(OriginalTable, {"Date"}, {{"Amount", each List.Sum([Amount]), type number}}),
    #"Multiplied Column" = Table.TransformColumns(#"Grouped Rows", {{"Amount", each _ * -1, type number}}),
    #"Added Custom" = Table.AddColumn(#"Multiplied Column", "GL Num", each 3000),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "GL Name", each "Accrual"),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Location", each "All"),
    #"Appended Query" = Table.Combine({OriginalTable, #"Added Custom2"})
in
    #"Appended Query"
Note that we are appending the last step with an earlier step in the query instead of referencing a different query.
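As a quick way to check the expected result, the same group, negate and append logic can be sketched in Python. This is only an illustration: the function name is made up, rows are represented as plain dictionaries, and the fixed values 3000, Accrual and All are taken from the question.

```python
from collections import defaultdict

def add_accrual_rows(rows):
    """Append one offsetting accrual row per date to a list of journal rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["Date"]] += row["Amount"]  # sum of debits per date
    accruals = [
        {"Date": day, "GL Num": 3000, "GL Name": "Accrual",
         "Location": "All", "Amount": -total}  # negative sum offsets the debits
        for day, total in totals.items()
    ]
    return rows + accruals  # original rows first, accrual rows last

# Sample rows from the question
rows = [
    {"Date": "1/31", "GL Num": 8000, "GL Name": "Payroll", "Location": "Office", "Amount": 7000.00},
    {"Date": "1/31", "GL Num": 8000, "GL Name": "Payroll", "Location": "Remote", "Amount": 1750.00},
    {"Date": "1/31", "GL Num": 8000, "GL Name": "Payroll", "Location": "City", "Amount": 1800.00},
    {"Date": "1/31", "GL Num": 8010, "GL Name": "Taxes", "Location": "Office", "Amount": 600.00},
    {"Date": "1/31", "GL Num": 8010, "GL Name": "Taxes", "Location": "Remote", "Amount": 225.00},
    {"Date": "1/31", "GL Num": 8010, "GL Name": "Taxes", "Location": "City", "Amount": 240.00},
]
journal = add_accrual_rows(rows)
```

With the sample data, the appended row carries an Amount of -11615, so the journal entry sums to zero for the date.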
