I’m running a compliance project that has work we need to do (WorkToDo column) mapped to each law that requires we do it (Law#1, Law#2, Law#3):
I’d like to generate a pivot table that maps that shows each law mapped to the work required by it:
What’s the best way to do this?
In powerquery, dynamic for number of columns and rows
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"WorkToDo"}, "Attribute", "Value"),
x = Table.Group( #"Unpivoted Other Columns", {"Attribute"}, {{"Concat", each Text.Combine([WorkToDo],":"), type text}}),
DynamicColumnList = List.Transform({1 ..List.Max(Table.AddColumn(x,"Custom", each List.Count(Text.PositionOfAny([Concat],{":"},Occurrence.All)))[Custom])+1}, each "Item." & Text.From(_)),
#"Split Column by Delimiter" = Table.SplitColumn( x, "Concat", Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv), DynamicColumnList)
in #"Split Column by Delimiter"
You can do it in Power Query
Close and Load
Please see the steps on the picture. In this way you don't get empty cells in between.
I have table; The table consists of pairs of date and value columns
Pair Pair Pair Pair .... ..... ......
What I need is the sum of all values for the same date.
The total table has 3146 columns (so 1573 pairs of value and date)!! with up to 186 entries on row level.
Thankfully, the first column contains all possible date values.
Considering the 3146 columns I am not sure how to do that without doing massivle amount of small steps :(
This shows a different method of creating the two column table that you will group by Date and return the Sum. Might be faster than the List.Accumulate method. Certainly worth a try in view of your comment above.
Unpivot the original table
Add 0-based Index column; then IntegerDivide by 2
Group by the IntegerDivide column and extract the Date and Value to separate columns.
Then group by date and aggregate by sum
let
Source = Excel.CurrentWorkbook(){[Name="Table12"]}[Content],
//assuming only columns are Date and Value, this will set the data types for any number of columns
Types = List.Transform(List.Alternate(Table.ColumnNames(Source),1,1,1), each {_, type date}) &
List.Transform(List.Alternate(Table.ColumnNames(Source),1,1,0), each {_, type number}),
#"Changed Type" = Table.TransformColumnTypes(Source,Types),
//Unpivot all columns to create a two column table
//The Value.1 table will alternate the related Date/Value
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {}, "Attribute", "Value.1"),
//add a column to group the pairs of values
//below two lines => a column in sequence of 0,0,1,1,2,2,3,3, ...
#"Added Index" = Table.AddIndexColumn(#"Unpivoted Other Columns", "Index", 0, 1, Int64.Type),
#"Inserted Integer-Division" = Table.AddColumn(#"Added Index", "Integer-Division", each Number.IntegerDivide([Index], 2), Int64.Type),
#"Removed Columns" = Table.RemoveColumns(#"Inserted Integer-Division",{"Index"}),
// Group by the "pairing" sequence,
// Extract the Date and Value to new columns
// => a 2 column table
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Integer-Division"}, {
{"Date", each [Value.1]{0}, type date},
{"Value", each [Value.1]{1}, type number}}),
#"Removed Columns1" = Table.RemoveColumns(#"Grouped Rows",{"Integer-Division"}),
//Group by Date and aggregate by Sum
#"Grouped Rows1" = Table.Group(#"Removed Columns1", {"Date"}, {{"Sum Values", each List.Sum([Value]), type number}}),
//Sort into date order
#"Sorted Rows" = Table.Sort(#"Grouped Rows1",{{"Date", Order.Ascending}})
in
#"Sorted Rows"
Quick google shows "Number of columns per table 16,384" for powerquery and 16000 for powerBI, so I'm thinking you have to split your input data somehow first, or perhaps this is not the tool for you, maybe AWK
Assuming that works, an M version of what you are looking for. It stacks the columns in groups of 2, then groups and sums them
let Source = Excel.CurrentWorkbook(){[Name="Table4"]}[Content],
Combo = List.Split(Table.ColumnNames(Source),2),
#"Added Custom" =List.Accumulate(
Combo,
#table({"Column1"}, {}),
(state,current)=> state & Table.Skip(Table.DemoteHeaders(Table.SelectColumns(Source, current)),1)
),
#"Grouped Rows" = Table.Group(#"Added Custom", {"Column1"}, {{"Sum", each List.Sum([Column2]), type number}})
in #"Grouped Rows"
186 rows * 1573 pairs of columns = 292,578 records.
Assuming not a very old version of Excel, 293k rows is fine, so it can be done with formulae:
Insert five columns to the left, so data starts in F3.
In A3 put zero, in A4 put 1, select the two and drag down to A188.
In A189 put =A3.
In B3 put 0, and drag down to B188.
In B189 put =B3
"Drag"* down A189 and B189 to row 292580
In C3 put =OFFSET($F$3,A3,B3)
In D3 put =OFFSET($F$3,A3,B3+1)
Select those two cells and click on the cross at bottom right to copy them to the end of column B.
Then put Date and Value in A1 and B1, and use a Pivot Table to get totals, averages, or whatever you need.
Any blank cells in the original input do not matter.
to "drag" down hundred of thousands of cells:
Copy A189 and B189
Goto (F5) A292580
Paste
Pin (F8)
CTRL-up arrow
Enter
And rather than $F$3 I would name that cell Origin, and use "Origin" in the two Offset formulae, but many people seem to consider that too complicated.
Problem: Excel has the ability to recognise dates for instance if you write 01-Jan-2021 excel will convert this to 01/01/2021. Is it possible using power query to somehow use this as a sort of delimiter to split dates from Cells?
So far: Currently I am using a colon as a delimiter which works well but of course if this is accidently missed, then the date remains unsplit form the rest of the data.
Is it possible without this to automatically split dates and times etc from other data contained within a cell?
Alternative solution? Not sure if this could work but alternative is it possible to apply some conditional formatting so that all Dates are followed with a colon? The problem is suspect is that since the cell contains other text it wont be recognised as a date. This may also be an issue in general but if necessary using power query I could split the data as necessary
The image shows what I am trying to achieve
omitting any custom code, you can do this from the interface fairly simply
Split by line-feed into rows
duplicate column
transform the data type on new column to date and then right click column and replace errors with null
duplicate that 2nd column and right click it and fill down; you now have 3 columns
filter 2nd column=null, and if you need to, original column <> blank
remove the 2nd column
its a 2 minute operation but code generated is:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Source, {{"Column1", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.None), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1", type text}}),
#"Duplicated Column" = Table.DuplicateColumn(#"Changed Type1", "Column1", "Column1 - Copy"),
#"Changed Type2" = Table.TransformColumnTypes(#"Duplicated Column",{{"Column1 - Copy", type date}}),
#"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type2", {{"Column1 - Copy", null}}),
#"Duplicated Column1" = Table.DuplicateColumn(#"Replaced Errors", "Column1 - Copy", "Column1 - Copy - Copy"),
#"Filled Down" = Table.FillDown(#"Duplicated Column1",{"Column1 - Copy - Copy"}),
#"Filtered Rows" = Table.SelectRows(#"Filled Down", each ([#"Column1 - Copy"] = null) and ([Column1] <> "")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Column1 - Copy"})
in #"Removed Columns"
I am trying to group/merge two rows by dividing the values in each based on another column (Eligible) value.
From the initial raw data, I have reached this level with different steps (by unpivoting etc.) in power query.
Now I need to have a ratio per employee (eligible/not-eligible) for each month.
So for employee A, "Jan-14" will be -10/(-10 + -149) and so on. Any ideas will be appreciated. Thanks
Really appreciate the response. Interestingly, I have used your other answer to reach this stage from the raw data.
Since we are calculating how much time an employee worked on eligible activities each month so We will be grouping on the Employee. Employee name was just for reference which I took out and later will join with employee query to get the names if required. There was a typo in the image, the last row should also be an employee with id 2.
So now when there is a matching row, we use the formula to calculate the percentage of time spent on eligible activities but
If there isn't a matching row with eligible=1, then the outcome should be 0
if there isn't a matching row with eligible-0, then the outcome should be 1 (100%)
Try this and modify as needed. It assumes you are starting with all Eligible=0 and will only pick up matching Eligible=1. If there is a E=1 without E=0 it is removed. Also assumes we match on both Employee and EmployeeName
~ ~ ~ ~ ~
Click select the first three columns (Employee, EmployeeName, Eligible), right click .... Unpivot other other columns
Add custom column with name "One" and formula =1+[Eligible]
Merge the table onto itself, Home .. Merge Queries... with join kind Left Outer
Click to match Employee, EmployeeName and Attribute columns in the two boxes, and match One column in the top box to the Eligible Column in the bottom box
In the new column, use arrows atop the column to expand, choosing [x] onlt the Value column. Make the name of the column: Custom.Value
Add column .. custom column ... formula = [Custom.Value] / ( [Custom.Value] + [Value])
Filter Eligible to only pick up the zeroes using the arrow atop that column
Remove extra columns
Click select Attribute column, Transform ... pivot ... use custom as the values column
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"Employee", "EmployeeName", "Eligible"}, "Attribute", "Value"),
#"Added Custom" = Table.AddColumn(#"Unpivoted Other Columns", "One", each 1+[Eligible]),
#"Merged Queries" = Table.NestedJoin(#"Added Custom",{"Employee", "EmployeeName", "Attribute", "One"},#"Added Custom",{"Employee", "EmployeeName", "Attribute", "Eligible"},"Added Custom",JoinKind.LeftOuter),
#"Expanded Added Custom" = Table.ExpandTableColumn(#"Merged Queries", "Added Custom", {"Value"}, {"Custom.Value"}),
#"Added Custom1" = Table.AddColumn(#"Expanded Added Custom", "Custom", each [Custom.Value]/([Custom.Value]+[Value])),
#"Filtered Rows" = Table.SelectRows(#"Added Custom1", each ([Eligible] = 0)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Eligible", "Value", "One", "Custom.Value"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attribute]), "Attribute", "Custom", List.Sum)
in #"Pivoted Column"
I am attempting to extract cells through a combination of index(match) and right(len)-find() functions from an array of data with text. In my formula, I am searching for instances of "* DS#1 ", excel returns those but also returns instances with " DS#11 *". How do I get excel to return only DS#1?
I have attempted to use an if statement with no success, if(formula="* 11 *","",formula).
Below is a link to an example of the data. The first cell highlighted in yellow should not be returning that text, it should be "". The second cell highlighted in yellow is appropriate to return that data.
example data
=RIGHT(INDEX($V:$AC,MATCH("DS#1",$AC:$AC,0),1),LEN(INDEX($V:$AC,MATCH(FW$1,$AC:$AC,0),1))-FIND($AG2,INDEX($V:$AC,MATCH(FW$1,$AC:$AC,0),1))+1)
Here some example on how to find a value and check for the following char.
Formula in D2:
=INDEX(A2:A6,MATCH(1,INDEX((ISNUMBER(SEARCH("DS#1",B2:B6)))*(NOT(ISNUMBER(MID(B2:B6,SEARCH(C2,B2:B6)+LEN(C2),1)*1))),0),0))
Here is a formula you can adapt to your ranges which will return a list from the range rngDS that contain findDS. I used named ranges, but you can adapt them to your own ranges.
Not sure if this is what you want since you chose to not post examples of your data or desired results.
The routine finds the findDS string and then checks to be sure that the following character is non-numeric.
C1: =IFERROR(INDEX(rngDS,AGGREGATE(15,6,1/(NOT(ISNUMBER(-MID(rngDS,SEARCH(findDS,rngDS)+LEN(findDS),1))+ISERROR(MID(rngDS,SEARCH(findDS,rngDS)+LEN(findDS),1))))*ROW(rngDS),ROWS($1:1))),"")
and fill down
It would be very difficult to come up a formula based solution especially when you need to first differentiate DS1, DS#1. DS#11, DS#11X etc. then look for the text string after each DS code, not to mention these confusing codes may (or may not) be positioned in random orders in the text string.
A better approach would be using Power Query which is available in Excel 2010 and later versions. My solution is using Excel 2016.
Presume you have the following two tables:
You can use From Table function in the Data tab to add both tables to the Power Query Editor.
Once added, make a duplicate copy of Table 1. I have renamed the duplicate as Table1 (2) - Number Ref. Then you should have three un-edited queries:
If your source data is a larger table containing some other information, you can google how to add a worksheet to the editor and how to remove unnecessary columns and remove duplicated values.
Firstly, let's start working with Table1.
Here are the steps:
Use Replace Values function to remove all # from the text string, and then replace all DS with DS# in the text string, so all DS codes are in the format of DS#XXX. Eg. DS8 will be changed to DS#8. This step may not be necessary if DS8 is a valid code as well as DS#8;
Use Split Column function to split the text strings by the word DS, and put each sub text string into a new row, then you should have the following:
Use Split Column function again to split the text strings by 1 Character from the left and you should have the following:
Filter the first column to show hash tag # only and then remove the first column, then you should have the following:
use Replace Values function repeatedly to remove the following characters/symbols from the text strings: (, ), HT, JH, SK, //, and replace dash - with space . I presume these are irrelevant in the comment but you can leave them if needed. Then you should have:
use Split Column function again to split the text string by the first space on the left, then you should have:
Then you can Trim and Clean the second column to further tidy up the comments, rename the columns as DS#, Comments, and Number Ref consecutively, and change the format of the third column to Text. Then you should have:
The last step is to add a custom column called Match ID to combine the value from first and third column into one text string as shown below:
Secondly, let's work on Table1 (2) - Number Ref
Here are the steps:
Remove the first column so leave the Number Ref column as the single column;
Transpose the column, and Promote the first row as header. Then you should have:
The purpose of this query is to transform all Number Reference into column headers, then append this query with the next query (Table2) to achieve the desired result which I will explain in the next section.
Lastly, let's work on the third query Table2.
Here are the steps:
Append this table with the Number Ref table from previous step;
highlight the whole table and use Replace Values function to replace all null with number 1. Then highlight the first column and use Unpivot Other Columns function to transform the table as below:
then Remove the last column (Value), and add a new custom column called Match ID to combine the DS code with the Number Reference. Then you should have:
Merge the table with Table1 using Match ID as shown below:
Expand the newly merged column Table1 to show Comments;
Use Split Column function to split the first column by Non-digit to digit, change the format of the digit column to whole number, and then sort Attribute column and the digit column ascending consecutively, then you should have:
Use Split Column function again to split the Match ID column by dash sign -, and remove the first three columns, rename the remaining three columns as DS#, Number Ref and Comments consecutively, then you should have:
Close & Load this table to a new worksheet as desired, which may look like this:
In conclusion, It is entirely up to you how you would like to structure the table in Power Query. You can pre-filter the Number Reference in the editor and load only relevant results to a worksheet, you can load the full table to a worksheet and use VLOOKUP or INDEX to retrieve the data as desired, or you can load the third query to data model from where you can create pivot tables to play around.
Here are the codes behind the scene for reference only. All steps are using built-in functions of the editor without any advanced manual coding.
Table1
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Comments", type text}}),
#"Replaced Value8" = Table.ReplaceValue(#"Changed Type","#","",Replacer.ReplaceText,{"Comments"}),
#"Replaced Value9" = Table.ReplaceValue(#"Replaced Value8","DS","DS#",Replacer.ReplaceText,{"Comments"}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value9", {{"Comments", Splitter.SplitTextByDelimiter("DS", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Comments"),
#"Split Column by Position" = Table.SplitColumn(#"Split Column by Delimiter", "Comments", Splitter.SplitTextByPositions({0, 1}, false), {"Comments.1", "Comments.2"}),
#"Filtered Rows" = Table.SelectRows(#"Split Column by Position", each ([Comments.1] = "#")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Comments.1"}),
#"Replaced Value1" = Table.ReplaceValue(#"Removed Columns",")","",Replacer.ReplaceText,{"Comments.2"}),
#"Replaced Value2" = Table.ReplaceValue(#"Replaced Value1","-"," ",Replacer.ReplaceText,{"Comments.2"}),
#"Replaced Value3" = Table.ReplaceValue(#"Replaced Value2","(","",Replacer.ReplaceText,{"Comments.2"}),
#"Replaced Value4" = Table.ReplaceValue(#"Replaced Value3","HT","",Replacer.ReplaceText,{"Comments.2"}),
#"Replaced Value5" = Table.ReplaceValue(#"Replaced Value4","JH","",Replacer.ReplaceText,{"Comments.2"}),
#"Replaced Value6" = Table.ReplaceValue(#"Replaced Value5","SK","",Replacer.ReplaceText,{"Comments.2"}),
#"Replaced Value7" = Table.ReplaceValue(#"Replaced Value6","//","",Replacer.ReplaceText,{"Comments.2"}),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Replaced Value7", "Comments.2", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Comments.2.1", "Comments.2.2"}),
#"Trimmed Text" = Table.TransformColumns(#"Split Column by Delimiter1",{{"Comments.2.2", Text.Trim, type text}}),
#"Cleaned Text" = Table.TransformColumns(#"Trimmed Text",{{"Comments.2.2", Text.Clean, type text}}),
#"Renamed Columns" = Table.RenameColumns(#"Cleaned Text",{{"Comments.2.1", "DS#"}, {"Comments.2.2", "Comments"}}),
#"Changed Type1" = Table.TransformColumnTypes(#"Renamed Columns",{{"Number Ref", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type1", "Match ID", each "DS#"&[#"DS#"]&"-"&[Number Ref])
in
#"Added Custom"
Table1 (2) - Number Ref
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Comments", type text}, {"Number Ref", Int64.Type}}),
#"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"Number Ref"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Removed Other Columns",{{"Number Ref", type text}}),
#"Transposed Table" = Table.Transpose(#"Changed Type1"),
#"Promoted Headers" = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true]),
#"Changed Type2" = Table.TransformColumnTypes(#"Promoted Headers",{{"388", type any}, {"1", type any}})
in
#"Changed Type2"
Table2
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"List", type text}}),
#"Appended Query" = Table.Combine({#"Changed Type", #"Table1 (2) - Number Ref"}),
#"Replaced Value" = Table.ReplaceValue(#"Appended Query",null,"1",Replacer.ReplaceValue,{"List", "388", "1"}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Replaced Value", {"List"}, "Attribute", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Unpivoted Other Columns",{"Value"}),
#"Added Custom" = Table.AddColumn(#"Removed Columns", "Match ID", each [List]&"-"&[Attribute]),
#"Merged Queries" = Table.NestedJoin(#"Added Custom", {"Match ID"}, Table1, {"Match ID"}, "Table1", JoinKind.LeftOuter),
#"Expanded Table1" = Table.ExpandTableColumn(#"Merged Queries", "Table1", {"Comments"}, {"Comments"}),
#"Split Column by Character Transition" = Table.SplitColumn(#"Expanded Table1", "List", Splitter.SplitTextByCharacterTransition((c) => not List.Contains({"0".."9"}, c), {"0".."9"}), {"List.1", "List.2"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Character Transition",{{"List.2", Int64.Type}}),
#"Sorted Rows" = Table.Sort(#"Changed Type1",{{"Attribute", Order.Ascending}, {"List.2", Order.Ascending}}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Sorted Rows", "Match ID", Splitter.SplitTextByDelimiter("-", QuoteStyle.Csv), {"Match ID.1", "Match ID.2"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"List.1", type text}, {"Match ID.1", type text}, {"Match ID.2", type text}}),
#"Removed Other Columns" = Table.SelectColumns(#"Changed Type2",{"Match ID.1", "Match ID.2", "Comments"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Other Columns",{{"Match ID.1", "DS#"}, {"Match ID.2", "Number Ref"}})
in
#"Renamed Columns"
Cheers :)
Based on the creative answers from this post, I was able to take some of these ideas and form a solution. There are two parts to the solution I found. The first is to replace the strings that I am trying to filter out with a random text that doesn't appear in any instance in my data. Since I had a range of data I needed to replace (DS11 through DS19), I used a VBA function to avoid a large nested function.
Once I had the strings I am trying to filter out replaced, I added an If(Isnumber(search()) function to display "" when the replaced text is returned.
Function REPLACETEXTS(strInput As String, rngFind As Range, rngReplace As Range) As String
Dim strTemp As String
Dim strFind As String
Dim strReplace As String
Dim cellFind As Range
Dim lngColFind As Long
Dim lngRowFind As Long
Dim lngRowReplace As Long
Dim lngColReplace As Long
lngColFind = rngFind.Columns.Count
lngRowFind = rngFind.Rows.Count
lngColReplace = rngFind.Columns.Count
lngRowReplace = rngFind.Rows.Count
strTemp = strInput
If Not ((lngColFind = lngColReplace) And (lngRowFind = lngRowReplace)) Then
REPLACETEXTS = CVErr(xlErrNA)
Exit Function
End If
For Each cellFind In rngFind
strFind = cellFind.Value
strReplace = rngReplace(cellFind.Row - rngFind.Row + 1, cellFind.Column - rngFind.Column + 1).Value
strTemp = Replace(strTemp, strFind, strReplace)
Next cellFind
REPLACETEXTS = strTemp
End Function