VBA change array data with If statement - excel

Mission: Calculate matching cable types and check how many reels I need, depending on the length, quantity and reel length.
Currently: My code checks the reel length on the appropriate line, but it checks it against all of the different cable types.
How can I: Fill the array with only the matching items?
I am checking whether C = J, D = K and E = L. This is because the left side of the sheet holds the individual lengths and the right side holds the total lengths.
rInpStk is the reel length total.
'Fill array with cable lengths
For i = 0 To UBound(CutArr, 1)
Dim x As Integer
For x = 7 To lastrow 'Commenting out this If will get me the same result
If Cells(x, 3).Value = rInpStk.Offset(, -5) And _
Cells(x, 4).Value = rInpStk.Offset(, -4) And _
Cells(x, 5).Value = rInpStk.Offset(, -3) Then 'This does nothing
CutArr(i, 0) = rInputCuts.Cells(i + 1, 2) 'I want these to only populate the same cable types
CutArr(i, 1) = rInputCuts.Cells(i + 1, 1) 'If there is not a match then dont add that rows data to the array
Exit For
End If
Next x
Next i
Do I need to use an AutoFilter? If so, how would I implement that?
After it's done I want it to continue to the next cable type.
Any help is much appreciated.

If your goal is to create the table you show under your TOTALS header, you can do this easily using Power Query which has been available since Excel 2010.
And if your data changes in your first table, a simple 'Refresh' will update the results table, and it will automatically adjust for changes in the amount of data, or types, in your first table.
Create a custom column of Total Length which is the product of your source data length*qty.
Then group by columns 3-7, with the SUM aggregate function.
Delete the unnecessary columns, and rearrange to get the desired column order.
This can all be done from the UI with the Custom Column formula of [LENGTH]*[QTY]
M Code
let
Source = Excel.CurrentWorkbook(){[Name="cableTbl"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"LENGTH", Int64.Type}, {"QTY", Int64.Type}, {"TYPE", type text}, {"SIZE", type text}, {"COL", type text}, {"CU ID", Int64.Type}, {"MAX ID", Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each [LENGTH]*[QTY]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"LENGTH", "QTY"}),
#"Grouped Rows" = Table.Group(#"Removed Columns", {"TYPE", "SIZE", "COL", "CU ID", "MAX ID"}, {{"LENGTH", each List.Sum([Custom]), type number}}),
#"Reordered Columns" = Table.ReorderColumns(#"Grouped Rows",{"LENGTH", "TYPE", "SIZE", "COL", "CU ID", "MAX ID"})
in
#"Reordered Columns"
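The grouping step can also be sketched outside Excel. The following is a minimal Python illustration of the same idea (multiply length by quantity, then sum per cable type), not a replacement for the M code above. The function names, the sample rows and the 305-unit reel length are assumptions made up for the example. The reel count is only a lower bound, because it ignores the fact that a single cut cannot be spliced across two reels.

```python
import math
from collections import defaultdict

def total_lengths(rows):
    """Sum length * qty per (type, size, colour) key.

    rows: iterable of (length, qty, cable_type, size, colour) tuples.
    """
    totals = defaultdict(int)
    for length, qty, cable_type, size, colour in rows:
        totals[(cable_type, size, colour)] += length * qty
    return dict(totals)

def reels_needed(total_length, reel_length):
    """Lower bound on reels: total length divided by reel length, rounded up."""
    return math.ceil(total_length / reel_length)

# Hypothetical sample data, not taken from the original sheet
rows = [
    (100, 2, "CAT6", "4P", "BLUE"),
    (50, 3, "CAT6", "4P", "BLUE"),
    (30, 1, "FIBER", "SM", "YEL"),
]
totals = total_lengths(rows)
reels = {key: reels_needed(length, 305) for key, length in totals.items()}
```

Only rows that share the same type, size and colour contribute to a given total, which is the "matching items only" behaviour the question asks for.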

Related

The Date Where There Is Enough Supply To Satisfy Dem

I am trying to come up with a solution to the following problem.
Problem:
In my dataset I have a certain quantity of an item in demand (need), and purchase orders that resupply that item (Supply). For each demand, I need to determine the first date on which we will have enough supply to fill it.
For example, our first demand requires 5 units. According to the cumulative Sum column, 18/12/23 is the first date on which enough quantity has been supplied to satisfy it. The problem appears when we have more than one demand for an item.
Staying with the same item, what I would like to do is update the cumulative Sum once the required quantity has been met (cumulative Sum = cumulative Sum - qty(demand), i.e. 6 (cumulative supply) - 5 (demand) = 1), so that the cumulative Sum for the next demand is 100 + 1 = 101 and not 100 + 6 = 106. That way we can simply rely on the updated cumulative Sum to retrieve the first date on which we will have enough supply to fill each demand.
I'm not sure if something like this is possible in Power Query, but any help is greatly appreciated.
Hopefully that all makes sense. Thanks.
Revised
In Power Query, try this as the code for the Demand query (it refers to a separate query named Supply):
let Source = Excel.CurrentWorkbook(){[Name="DemandDataRange"]}[Content],
#"SupplyGrouped Rows" = Table.Group(Supply, {"item"}, {{"data", each
let a = Table.AddIndexColumn( _ , "Index", 0, 1),
b=Table.AddColumn(a, "CumTotal", each List.Sum(List.FirstN(a[Qty],[Index]+1)))
in b, type table }}),
#"SupplyExpanded data" = Table.ExpandTableColumn(#"SupplyGrouped Rows", "data", { "Supply date", "CumTotal"}, {"Supply date", "CumTotal"}),
#"Grouped Rows" = Table.Group(Source, {"item"}, {{"data", each
let a= Table.AddIndexColumn(_, "Index", 0, 1),
b=Table.AddColumn(a, "CumTotal", each List.Sum(List.FirstN(a[Qty],[Index]+1)))
in b, type table }}),
#"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"Qty", "Date", "Index", "CumTotal"}, {"Qty", "Date", "Index", "CumTotal"}),
x=Table.AddColumn(#"Expanded data","MaxDate",(i)=>try Table.SelectRows( #"SupplyExpanded data", each [item]=i[item] and [CumTotal]>=i[CumTotal] )[Supply date]{0} otherwise null, type date ),
#"Removed Columns" = Table.RemoveColumns(x,{"Index", "CumTotal"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Date", type date}})
in #"Changed Type"
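The idea behind the query above is to compare cumulative demand with cumulative supply per item: a demand is covered on the first supply date whose running supply total reaches the running demand total. Here is a small Python sketch of that comparison for a single item. The function name and the sample data are assumptions for illustration; only the quantities 5, 6 and 100 and the first date come from the question, and the second supply date is invented.

```python
from bisect import bisect_left
from itertools import accumulate

def first_cover_dates(supply, demand):
    """Return, for each demand, the first supply date that covers it.

    supply: list of (date, qty) tuples for one item, sorted by date,
            with non-negative quantities.
    demand: list of demanded quantities for the same item, in order.
    A demand that can never be covered maps to None.
    """
    cum_supply = list(accumulate(qty for _, qty in supply))
    dates = []
    for cum_demand in accumulate(demand):
        # first position where cumulative supply >= cumulative demand
        pos = bisect_left(cum_supply, cum_demand)
        dates.append(supply[pos][0] if pos < len(supply) else None)
    return dates

# 6 units arrive first, then 100 (second date is a made-up placeholder)
supply = [("18/12/23", 6), ("25/12/23", 100)]
covered = first_cover_dates(supply, [5, 100, 10])
```

Comparing 106 cumulative supply against 105 cumulative demand is equivalent to the asker's "100 + 1 = 101 against 100" bookkeeping, so no running total has to be reset.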
Given my understanding of what you want for results, the following Power Query M code should return that.
If you just want to compare the total supply vs total demand, then only check the final entries instead of the first non-negative.
Read the code comments, statement names and explore the Applied Steps to understand the algorithm.
let
//Read in the data tables
//could have them in separate queries
Source = Excel.CurrentWorkbook(){[Name="Demand"]}[Content],
Demand = Table.TransformColumnTypes(Source,{{"item", type text}, {"Qty", Int64.Type}, {"Date", type date}}),
//make demand values negative
#"Transform Demand" = Table.TransformColumns(Demand,{"Qty", each _ * -1}),
Source2 = Excel.CurrentWorkbook(){[Name="Supply"]}[Content],
Supply = Table.TransformColumnTypes(Source2,{{"item", type text},{"Qty", Int64.Type},{"Supply date", type date}}),
#"Rename Supply Date Column" = Table.RenameColumns(Supply,{"Supply date","Date"}),
//Merge the tables and sort by Item and Date
Merge = Table.Combine({#"Rename Supply Date Column", #"Transform Demand"}),
#"Sorted Rows" = Table.Sort(Merge,{{"item", Order.Ascending}, {"Date", Order.Ascending}}),
//Group by Item
//Grouped running total to find first positive value
#"Grouped Rows" = Table.Group(#"Sorted Rows", {"item"}, {
{"First Date", (t)=> let
#"Running Total" = List.RemoveFirstN(List.Generate(
()=>[rt=t[Qty]{0}, idx=0],
each [idx]<Table.RowCount(t),
each [rt=[rt]+t[Qty]{[idx]+1}, idx=[idx]+1],
each [rt]),1),
#"First non-negative" = List.PositionOfAny(#"Running Total", List.Select(#"Running Total", each _ >=0), Occurrence.First)
in t[Date]{#"First non-negative"+1}, type date}})
in
#"Grouped Rows"
I did this with Excel formulas rather than Power Query. There will be a Power Query equivalent, but I'm not very fluent in M yet.
You need a helper column. You could do without it, but everything is much more readable if you have it.
In sheet Supply (2), cell E2, enter the formula:
=SUMIFS(Supply!B:B; Supply!C:C;"<=" & C2;Supply!A:A;A2)-SUMIFS(Dem!B:B;Dem!C:C;"<=" & C2;Dem!A:A;A2)
and copy downwards. This can be described as Total supply up to that date minus total demand up to that date. In some cases this will be negative (where there's more demand than supply).
Now you need to find the date of the first non-negative value for that.
First create a unique list of the items - I put it on the same sheet in the range G2:G6. Then in H2, the formula:
=MINIFS(C:C;A:A;G2;E:E;">=" & 0)
and copy downwards.

How can I pivot a column in Power Query and keep order?

Original data:
I want to transform them like this:
I tried to pivot it in Power Query. But the order is not correct. The column with empty value would fill up:
Since your Measurement IDs are numeric and sequential within each series, you can do the following:
Add a 1-based index column.
Then add a custom column with the formula
Formula = [Index]-[Measurement ID]
The formula returns the same value for every row in a series; whenever the ID sequence is broken (a new series starts), it returns a different value.
If the Measurement IDs in your actual data do not fit that pattern, it should be relatively easy to create an equivalent index that does match that pattern, and then use the same algorithm.
Now, when you pivot, you will get your desired outcome.
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"Measurement ID", Int64.Type}, {"Measurement Result", type number}}),
#"Added Index" = Table.AddIndexColumn(
#"Changed Type", "Index", 1, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom",
each [Index]-[Measurement ID]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index"}),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Removed Columns", {
{"Measurement ID", type text}}, "en-US"),
List.Distinct(Table.TransformColumnTypes(#"Removed Columns", {
{"Measurement ID", type text}}, "en-US")[#"Measurement ID"]), "Measurement ID", "Measurement Result"),
#"Removed Columns1" = Table.RemoveColumns(#"Pivoted Column",{"Custom"})
in
#"Removed Columns1"
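To see why the Index minus Measurement ID column keeps the rows apart, here is a short Python sketch of the same trick. It is an illustration only; the function name and the sample values are assumptions, not data from the question.

```python
def pivot_by_series(ids, results):
    """Pivot results into one row per series, one column per Measurement ID.

    The series key is (1-based row index - Measurement ID), which stays
    constant while the IDs run 1, 2, 3, ... and changes when they restart.
    Missing measurements are returned as None.
    """
    rows = {}
    for index, (measurement_id, result) in enumerate(zip(ids, results), start=1):
        rows.setdefault(index - measurement_id, {})[measurement_id] = result
    columns = sorted(set(ids))
    # dicts keep insertion order, so the series stay in their original order
    return [[row.get(col) for col in columns] for row in rows.values()]

# Hypothetical data: the second series has a 4th measurement, the others do not
ids = [1, 2, 3, 1, 2, 3, 4, 1, 2, 3]
results = list("abcdefghij")
table = pivot_by_series(ids, results)
```

Without the helper key, all values for ID 1 would be stacked in one column with no way to tell which series they came from, which is why the plain pivot fills up the empty cells.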
If your Measurement ID column is not in the designated pattern
I make the assumption that each Series starts with the first ID in the column.
To create our Custom series, we can then use (after inserting the Index column),
a formula that returns an Index number if the value in the ID column is the same as the first, otherwise return a null
Then 'Fill Down'
#"Added Custom" = Table.AddColumn(#"Added Index", "sequence",
each if [Measurement ID] = #"Added Index"[Measurement ID]{0} then [Index] else null),
#"Filled Down" = Table.FillDown(#"Added Custom",{"sequence"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index"}),
It looks like you expect Power Query to implicitly know that Measurement ID 4 belongs to a 2nd set of data?
It won't do that for you unless you specify whether each measurement belongs to a 1st, 2nd or 3rd set.
You could:
Write the set IDs in manually to a new column
Calculate them programmatically, e.g. a new column whose value increments by 1 whenever the current Measurement ID is less than the previous Measurement ID
Go back to the source data and check if you can have Measurement ID 4 = null in the 1st and 3rd sets.
For instance, with the third option your table would perhaps resemble:
Set  ID  Result
1    1   a
1    2   b
1    3   c
1    4   null
2    1   d
2    2   e
2    3   f
2    4   g
3    1   h
3    2   i
3    3   j
3    4   null
There isn't enough information about your data, therefore the details & the correct solution need to be left to you.

Calculate the time each person stays in the factory

The cells contain dates and the times of entering and leaving the factory. I want to calculate how many hours each person stayed on each day they came to the factory. For this, I wrote the macro below, identifying each person by sicil_no. Since there are multiple entries and exits at different times on the same date, I need to determine the first entry and last exit times for each day and subtract them. I couldn't figure out how to do that last part.
Sub macro()
Dim sicil_no As String
Dim i As Integer
Dim end_row As Long
Dim dates As Range
Dim gecis_yonu As String
Dim entry As String
Dim Exits As String
end_row = Cells(Rows.Count, 3).End(xlUp).Row
For i = 3 To end_row
sicil_no = Cells(i, 3).Value
dates = Cells(i, 1).Value
If Range("J", i).Value = "Exit" Then
Range("J", i).Value = exist
End If
If Range("J", i).Value = "Entry" Then
Range("J", i).Value = entry
End If
Next
For Each dates In Range("A", end_row)
Range("M", i).Value = exist - entry
Next
End Sub
One possible way is to use MAXIFS and MINIFS formula to get this result:
It can probably be done better, but if you select A:H, remove duplicates and uncheck column A, then I believe you get the result you are looking for.
This assumes the date in column A is a true date and not just text. If it's not a date, you will need to convert it to one.
This can be done using DATEVALUE together with RIGHT, LEFT and MID to turn the string into an accepted date format.
Then in column E, add this formula:
=TEXT(A2,"YYYY-MM-DD")
In F:
=MAXIFS(A:A,E:E,TEXT(A2,"YYYY-MM-DD"),B:B,B2)
In G:
=MINIFS(A:A,E:E,TEXT(A2,"YYYY-MM-DD"),B:B,B2)
And lastly in H:
=F2-G2
When all formulas are on the sheet, select everything and copy, paste as values, then use remove duplicates like this:
and the result is this:
EDIT:
For completeness, this is how you convert your date to an accepted date format.
In M2 (example):
=MID(A2,7,4)&"-"&MID(A2,4,2)&"-"&LEFT(A2,2)&" "&RIGHT(A2,8)
then we need to use DATEVALUE and TIMEVALUE on this cell
N2:
=DATEVALUE(M2)+TIMEVALUE(M2)
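The MID/LEFT/RIGHT formula above rearranges a day-first text stamp into year-month-day order before converting it. The same rearrangement can be sketched in Python as follows. The function name is hypothetical, and the input layout (two-digit day, two-digit month, four-digit year, then an 8-character time at the end) is an assumption inferred from the character positions used in the formula.

```python
from datetime import datetime

def text_to_datetime(text):
    """Convert 'DD?MM?YYYY HH:MM:SS' text to a datetime.

    Mirrors the worksheet formula: characters 7-10 are the year,
    4-5 the month, 1-2 the day, and the last 8 characters the time.
    """
    iso_text = f"{text[6:10]}-{text[3:5]}-{text[0:2]} {text[-8:]}"
    return datetime.fromisoformat(iso_text)

# Made-up example value
stamp = text_to_datetime("05.03.2021 08:15:00")
```

Because the pieces are taken by position, the separator between day, month and year does not matter, but the stamp must always use two-digit days and months.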
You can obtain your desired output using Power Query, available in Windows Excel 2010+ and Office 365 Excel
You did not show what you want for output, but you can add to what I have shown which is the bare minimum Sicil, Date and Time between earliest and latest times. (Assuming each pair of times is entry/exit, you could also sum the differences between each pair of times per day)
In the Query, you can sort the results depending on whether you want to show by date or by employee.
Select some cell in your original table
Data => Get&Transform => From Table/Range
When the PQ UI opens, navigate to Home => Advanced Editor
Make note of the Table Name in Line 2 of the code.
Replace the existing code with the M-Code below
Change the table name in line 2 of the pasted code to your "real" table name
Examine any comments, and also the Applied Steps window, to better understand the algorithm and steps
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//Add custom column with just the Date part for grouping
#"Added Custom" = Table.AddColumn(Source, "Date", each Date.From([Dates])),
//Group by Sicil No and Date
//Then extract the time in Factory as the last time less the first time
#"Grouped Rows" = Table.Group(#"Added Custom", {"Sicil No", "Date"}, {
{"Hrs in Factory", each List.Max([Dates]) - List.Min([Dates]), type duration}
}),
#"Changed Type" = Table.TransformColumnTypes(#"Grouped Rows",{{"Date", type date}})
in
#"Changed Type"
Edit
If you want to add up the actual time in the factory per day, taking into account the entry/exit times:
Assuming times are entered as pairs, where the first time is entry and the second is exit
Merely subtract one from the other to get each duration
Then group as above and add up the total durations per Sicil and Date
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//Add custom column with just the Date part for grouping
#"Added Custom" = Table.AddColumn(Source, "Date", each Date.From([Dates])),
//Add Index column to access previous row
#"Added Index" = Table.AddIndexColumn(#"Added Custom", "Index", 0, 1, Int64.Type),
//if the Index number is an Odd number,
// then subtract the previous row from the current row to get the Duration
#"Added Custom1" = Table.AddColumn(#"Added Index", "Duration", each
if Number.Mod([Index],2)=0
then null
else [Dates]- Table.Column(#"Added Index","Dates"){[Index]-1}),
//Group by Sicil and Date
// SUM the durations
#"Grouped Rows" = Table.Group(#"Added Custom1", {"Sicil No", "Date"}, {
{"Time in Factory", each List.Sum([Duration]), type nullable duration}}),
#"Changed Type" = Table.TransformColumnTypes(#"Grouped Rows",{{"Sicil No", Int64.Type}, {"Date", type date}})
in
#"Changed Type"
Further modification to account for the "real" list not being sorted as needed, and for data errors where entries and exits are mismatched. It also uses a different routine to refer to the previous row, for speed.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//Change type especially datetime to Turkish culture (since I am in US)
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"GECIS TARIHI", type datetime}, {"KART NUMARASI", type any}, {"SICIL NUMARASI", Int64.Type}, {"SOYADI", type text},
{"ADI", type text}, {"FİRMASI", type text}, {"GEÇİÇİ TAŞERON", type any}, {"BÖLÜM KODU", type any},
{"TERMINAL", type any}, {"GEÇİŞ YÖNÜ", type text}, {"GEÇİŞ DURUMU", type any}, {"ZONE", type any}}, "tr-TR"),
//Remove columns that will not appear in final report
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"KART NUMARASI", "SOYADI", "ADI", "FİRMASI", "GEÇİÇİ TAŞERON",
"BÖLÜM KODU", "TERMINAL", "GEÇİŞ DURUMU", "ZONE"}),
//Sort for proper processing
#"Sorted Rows" = Table.Sort(#"Removed Columns",{{"SICIL NUMARASI", Order.Ascending}, {"GECIS TARIHI", Order.Ascending}}),
//add shifted columns to reference previous rows for entry/exit and time
//much faster than using the Index column method
ShiftedList = {null} & List.RemoveLastN(Table.Column(#"Sorted Rows", "GEÇİŞ YÖNÜ"),1),
Custom1 = Table.ToColumns(#"Sorted Rows") & {ShiftedList},
Custom2 = Table.FromColumns(Custom1, Table.ColumnNames(#"Sorted Rows") & {"GEÇİŞ YÖNÜ" & " Prev Row"}),
ShiftedList1 = {null} & List.RemoveLastN(Table.Column(Custom2, "GECIS TARIHI"),1),
Custom3 = Table.ToColumns(Custom2) & {ShiftedList1},
Custom4 = Table.FromColumns(Custom3, Table.ColumnNames(Custom2) & {"GECIS TARIHI" & " Prev Row"}),
//Calculate duration on the appropriate rows
#"Added Custom" = Table.AddColumn(Custom4, "Time in Factory", each
if [GEÇİŞ YÖNÜ] = "Exit" and [GEÇİŞ YÖNÜ Prev Row] = "Entry"
then [GECIS TARIHI] - [GECIS TARIHI Prev Row]
else null),
//Filter out the unneeded rows
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Time in Factory] <> null)),
//Remove the offset columns
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows",{"GEÇİŞ YÖNÜ Prev Row", "GECIS TARIHI Prev Row"}),
//add Date column for grouping
#"Added Custom1" = Table.AddColumn(#"Removed Columns1", "Date", each DateTime.Date([GECIS TARIHI]),Date.Type),
//Group by Date and Sicil and SUM the Time in Factory
#"Grouped Rows" = Table.Group(#"Added Custom1", {"SICIL NUMARASI", "Date"}, {
{"Time in Factory", each List.Sum([Time in Factory]), type duration}
})
in
#"Grouped Rows"
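The pairing logic of that last query (sort, look at the previous row, keep only Exit rows that directly follow an Entry, then sum per person and day) can be sketched in Python like this. It is a rough illustration under assumptions: the function name, the tuple layout and the sample events are made up, and the sketch additionally requires the previous row to belong to the same person, which the query above does not check explicitly.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def time_in_factory(events):
    """Sum Entry -> Exit durations per (person, exit date).

    events: iterable of (sicil_no, timestamp, direction) tuples, where
    direction is "Entry" or "Exit". Unmatched entries or exits are ignored.
    """
    totals = defaultdict(timedelta)
    previous = None
    for sicil, stamp, direction in sorted(events, key=lambda e: (e[0], e[1])):
        if (previous is not None and previous[0] == sicil
                and previous[2] == "Entry" and direction == "Exit"):
            totals[(sicil, stamp.date())] += stamp - previous[1]
        previous = (sicil, stamp, direction)
    return dict(totals)

# Hypothetical badge events, deliberately unsorted and with stray rows
day = datetime(2021, 3, 5)
events = [
    (1, day.replace(hour=13), "Entry"),
    (1, day.replace(hour=8), "Entry"),
    (1, day.replace(hour=12), "Exit"),
    (1, day.replace(hour=17, minute=30), "Exit"),
    (1, day.replace(hour=18), "Exit"),   # exit without a matching entry
    (2, day.replace(hour=9), "Entry"),   # entry without a matching exit
]
totals = time_in_factory(events)
```

Sorting first is what makes the "previous row" comparison meaningful, which is the same reason the M code sorts by person and timestamp before shifting the columns.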

Searching and returning for "*1*" in a string returns instances containing "*11*" as well in excel

I am attempting to extract cells through a combination of index(match) and right(len)-find() functions from an array of data with text. In my formula, I am searching for instances of "* DS#1 "; Excel returns those but also returns instances with " DS#11 *". How do I get Excel to return only DS#1?
I have attempted to use an IF statement, without success: if(formula="* 11 *","",formula).
Below is a link to an example of the data. The first cell highlighted in yellow should not be returning that text, it should be "". The second cell highlighted in yellow is appropriate to return that data.
example data
=RIGHT(INDEX($V:$AC,MATCH("DS#1",$AC:$AC,0),1),LEN(INDEX($V:$AC,MATCH(FW$1,$AC:$AC,0),1))-FIND($AG2,INDEX($V:$AC,MATCH(FW$1,$AC:$AC,0),1))+1)
Here is an example of how to find a value and check the character that follows it.
Formula in D2:
=INDEX(A2:A6,MATCH(1,INDEX((ISNUMBER(SEARCH("DS#1",B2:B6)))*(NOT(ISNUMBER(MID(B2:B6,SEARCH(C2,B2:B6)+LEN(C2),1)*1))),0),0))
Here is a formula you can adapt to your ranges which will return a list from the range rngDS that contain findDS. I used named ranges, but you can adapt them to your own ranges.
Not sure if this is what you want since you chose to not post examples of your data or desired results.
The routine finds the findDS string and then checks to be sure that the following character is non-numeric.
C1: =IFERROR(INDEX(rngDS,AGGREGATE(15,6,1/(NOT(ISNUMBER(-MID(rngDS,SEARCH(findDS,rngDS)+LEN(findDS),1))+ISERROR(MID(rngDS,SEARCH(findDS,rngDS)+LEN(findDS),1))))*ROW(rngDS),ROWS($1:1))),"")
and fill down
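Both formulas rest on the same test: the search text must be present and the character right after it must not be a digit. As a hedged illustration of that test outside Excel, here is a small Python sketch using a regular expression. The function name and sample strings are assumptions. Unlike SEARCH, which only looks at the first occurrence, this version accepts the text if any occurrence passes the test.

```python
import re

def has_exact_code(text, code):
    """True if `code` occurs in `text` and is not followed by a digit.

    The match is case-insensitive, like Excel's SEARCH.
    """
    pattern = re.escape(code) + r"(?!\d)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# Made-up comment strings
matches = [
    has_exact_code("HT DS#1 check valve", "DS#1"),
    has_exact_code("HT DS#11 check valve", "DS#1"),
    has_exact_code("DS#11 first, then DS#1", "DS#1"),
]
```

The negative lookahead `(?!\d)` plays the role of the NOT(ISNUMBER(MID(...))) part of the worksheet formulas.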
It would be very difficult to come up with a formula-based solution, especially when you first need to differentiate DS1, DS#1, DS#11, DS#11X etc. and then look for the text string after each DS code, not to mention that these confusing codes may (or may not) be positioned in random order in the text string.
A better approach would be to use Power Query, which is available in Excel 2010 and later versions. My solution uses Excel 2016.
Presume you have the following two tables:
You can use From Table function in the Data tab to add both tables to the Power Query Editor.
Once added, make a duplicate copy of Table 1. I have renamed the duplicate as Table1 (2) - Number Ref. Then you should have three un-edited queries:
If your source data is a larger table containing some other information, you can google how to add a worksheet to the editor and how to remove unnecessary columns and remove duplicated values.
Firstly, let's start working with Table1.
Here are the steps:
Use Replace Values function to remove all # from the text string, and then replace all DS with DS# in the text string, so all DS codes are in the format of DS#XXX. Eg. DS8 will be changed to DS#8. This step may not be necessary if DS8 is a valid code as well as DS#8;
Use Split Column function to split the text strings by the word DS, and put each sub text string into a new row, then you should have the following:
Use Split Column function again to split the text strings by 1 Character from the left and you should have the following:
Filter the first column to show hash tag # only and then remove the first column, then you should have the following:
use Replace Values function repeatedly to remove the following characters/symbols from the text strings: (, ), HT, JH, SK, //, and replace dash - with space . I presume these are irrelevant in the comment but you can leave them if needed. Then you should have:
use Split Column function again to split the text string by the first space on the left, then you should have:
Then you can Trim and Clean the second column to further tidy up the comments, rename the columns as DS#, Comments, and Number Ref consecutively, and change the format of the third column to Text. Then you should have:
The last step is to add a custom column called Match ID to combine the value from first and third column into one text string as shown below:
Secondly, let's work on Table1 (2) - Number Ref
Here are the steps:
Remove the first column so leave the Number Ref column as the single column;
Transpose the column, and Promote the first row as header. Then you should have:
The purpose of this query is to transform all Number Reference into column headers, then append this query with the next query (Table2) to achieve the desired result which I will explain in the next section.
Lastly, let's work on the third query Table2.
Here are the steps:
Append this table with the Number Ref table from previous step;
highlight the whole table and use Replace Values function to replace all null with number 1. Then highlight the first column and use Unpivot Other Columns function to transform the table as below:
then Remove the last column (Value), and add a new custom column called Match ID to combine the DS code with the Number Reference. Then you should have:
Merge the table with Table1 using Match ID as shown below:
Expand the newly merged column Table1 to show Comments;
Use Split Column function to split the first column by Non-digit to digit, change the format of the digit column to whole number, and then sort Attribute column and the digit column ascending consecutively, then you should have:
Use Split Column function again to split the Match ID column by dash sign -, and remove the first three columns, rename the remaining three columns as DS#, Number Ref and Comments consecutively, then you should have:
Close & Load this table to a new worksheet as desired, which may look like this:
In conclusion, it is entirely up to you how you would like to structure the table in Power Query. You can pre-filter the Number Reference in the editor and load only relevant results to a worksheet, you can load the full table to a worksheet and use VLOOKUP or INDEX to retrieve the data as desired, or you can load the third query to the data model, from where you can create pivot tables to play around with.
Here is the code behind the scenes, for reference only. All steps use built-in functions of the editor, without any advanced manual coding.
Table1
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Comments", type text}}),
#"Replaced Value8" = Table.ReplaceValue(#"Changed Type","#","",Replacer.ReplaceText,{"Comments"}),
#"Replaced Value9" = Table.ReplaceValue(#"Replaced Value8","DS","DS#",Replacer.ReplaceText,{"Comments"}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value9", {{"Comments", Splitter.SplitTextByDelimiter("DS", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Comments"),
#"Split Column by Position" = Table.SplitColumn(#"Split Column by Delimiter", "Comments", Splitter.SplitTextByPositions({0, 1}, false), {"Comments.1", "Comments.2"}),
#"Filtered Rows" = Table.SelectRows(#"Split Column by Position", each ([Comments.1] = "#")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Comments.1"}),
#"Replaced Value1" = Table.ReplaceValue(#"Removed Columns",")","",Replacer.ReplaceText,{"Comments.2"}),
#"Replaced Value2" = Table.ReplaceValue(#"Replaced Value1","-"," ",Replacer.ReplaceText,{"Comments.2"}),
#"Replaced Value3" = Table.ReplaceValue(#"Replaced Value2","(","",Replacer.ReplaceText,{"Comments.2"}),
#"Replaced Value4" = Table.ReplaceValue(#"Replaced Value3","HT","",Replacer.ReplaceText,{"Comments.2"}),
#"Replaced Value5" = Table.ReplaceValue(#"Replaced Value4","JH","",Replacer.ReplaceText,{"Comments.2"}),
#"Replaced Value6" = Table.ReplaceValue(#"Replaced Value5","SK","",Replacer.ReplaceText,{"Comments.2"}),
#"Replaced Value7" = Table.ReplaceValue(#"Replaced Value6","//","",Replacer.ReplaceText,{"Comments.2"}),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Replaced Value7", "Comments.2", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Comments.2.1", "Comments.2.2"}),
#"Trimmed Text" = Table.TransformColumns(#"Split Column by Delimiter1",{{"Comments.2.2", Text.Trim, type text}}),
#"Cleaned Text" = Table.TransformColumns(#"Trimmed Text",{{"Comments.2.2", Text.Clean, type text}}),
#"Renamed Columns" = Table.RenameColumns(#"Cleaned Text",{{"Comments.2.1", "DS#"}, {"Comments.2.2", "Comments"}}),
#"Changed Type1" = Table.TransformColumnTypes(#"Renamed Columns",{{"Number Ref", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type1", "Match ID", each "DS#"&[#"DS#"]&"-"&[Number Ref])
in
#"Added Custom"
Table1 (2) - Number Ref
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Comments", type text}, {"Number Ref", Int64.Type}}),
#"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"Number Ref"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Removed Other Columns",{{"Number Ref", type text}}),
#"Transposed Table" = Table.Transpose(#"Changed Type1"),
#"Promoted Headers" = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true]),
#"Changed Type2" = Table.TransformColumnTypes(#"Promoted Headers",{{"388", type any}, {"1", type any}})
in
#"Changed Type2"
Table2
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"List", type text}}),
#"Appended Query" = Table.Combine({#"Changed Type", #"Table1 (2) - Number Ref"}),
#"Replaced Value" = Table.ReplaceValue(#"Appended Query",null,"1",Replacer.ReplaceValue,{"List", "388", "1"}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Replaced Value", {"List"}, "Attribute", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Unpivoted Other Columns",{"Value"}),
#"Added Custom" = Table.AddColumn(#"Removed Columns", "Match ID", each [List]&"-"&[Attribute]),
#"Merged Queries" = Table.NestedJoin(#"Added Custom", {"Match ID"}, Table1, {"Match ID"}, "Table1", JoinKind.LeftOuter),
#"Expanded Table1" = Table.ExpandTableColumn(#"Merged Queries", "Table1", {"Comments"}, {"Comments"}),
#"Split Column by Character Transition" = Table.SplitColumn(#"Expanded Table1", "List", Splitter.SplitTextByCharacterTransition((c) => not List.Contains({"0".."9"}, c), {"0".."9"}), {"List.1", "List.2"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Character Transition",{{"List.2", Int64.Type}}),
#"Sorted Rows" = Table.Sort(#"Changed Type1",{{"Attribute", Order.Ascending}, {"List.2", Order.Ascending}}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Sorted Rows", "Match ID", Splitter.SplitTextByDelimiter("-", QuoteStyle.Csv), {"Match ID.1", "Match ID.2"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"List.1", type text}, {"Match ID.1", type text}, {"Match ID.2", type text}}),
#"Removed Other Columns" = Table.SelectColumns(#"Changed Type2",{"Match ID.1", "Match ID.2", "Comments"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Other Columns",{{"Match ID.1", "DS#"}, {"Match ID.2", "Number Ref"}})
in
#"Renamed Columns"
Cheers :)
Based on the creative answers from this post, I was able to take some of these ideas and form a solution. There are two parts to the solution I found. The first is to replace the strings that I am trying to filter out with a random text that doesn't appear in any instance in my data. Since I had a range of data I needed to replace (DS11 through DS19), I used a VBA function to avoid a large nested function.
Once I had the strings I am trying to filter out replaced, I added an If(Isnumber(search()) function to display "" when the replaced text is returned.
'Returns a Variant so that the function can return the #N/A error value as well as a string
Function REPLACETEXTS(strInput As String, rngFind As Range, rngReplace As Range) As Variant
Dim strTemp As String
Dim strFind As String
Dim strReplace As String
Dim cellFind As Range
Dim lngColFind As Long
Dim lngRowFind As Long
Dim lngRowReplace As Long
Dim lngColReplace As Long
lngColFind = rngFind.Columns.Count
lngRowFind = rngFind.Rows.Count
lngColReplace = rngReplace.Columns.Count
lngRowReplace = rngReplace.Rows.Count
strTemp = strInput
If Not ((lngColFind = lngColReplace) And (lngRowFind = lngRowReplace)) Then
REPLACETEXTS = CVErr(xlErrNA)
Exit Function
End If
For Each cellFind In rngFind
strFind = cellFind.Value
strReplace = rngReplace(cellFind.Row - rngFind.Row + 1, cellFind.Column - rngFind.Column + 1).Value
strTemp = Replace(strTemp, strFind, strReplace)
Next cellFind
REPLACETEXTS = strTemp
End Function

Row totals based on column name in PowerQuery

I have a data file with around 400 columns in it. I need to import this data into PowerPivot. In order to reduce my file size, I would like to use PowerQuery to create 2 different row totals, and then delete all my unneeded columns upon load.
While my first row total column (RowTotal1) would sum all 400 columns, I would also like a second row total (RowTotal2) that subtracts from RowTotal1 any column whose name contains the text "click".
Secondly, I would like to use the value in my Country column as a variable, to also subtract any column whose name contains this value, e.g.
Site  Country  Col1  Col2  ClickCol1  Col3  Germany  RowTotal1  RowTotal2
1a    USA      2     4     8          16    24       54         46
2a    Germany  2     4     8          16    24       54         22
RowTotal1 = 2 + 4 + 8 + 16 + 24
RowTotal2 (first row) = 54 - 8 (ClickCol1)
RowTotal2 (second row) = 54 - 24 (Germany) - 8 (ClickCol1)
Is this possible? (EDIT: Yes. See answer below)
REVISED QUESTION: Is there a more memory-efficient way to do this than trying to group 300+ million rows at once?
Code would look something like this:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Site", type text}, {"Country", type text}, {"Col1", Int64.Type}, {"Col2", Int64.Type}, {"ClickCol1", Int64.Type}, {"Col3", Int64.Type}, {"Germany", Int64.Type}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Country", "Site"}, "Attribute", "Value"),
#"Added Conditional Column" = Table.AddColumn(#"Unpivoted Other Columns", "Value2", each if [Country] = [Attribute] or [Attribute] = "ClickCol1" then 0 else [Value] ),
#"Grouped Rows" = Table.Group(#"Added Conditional Column", {"Site", "Country"}, {{"RowTotal1", each List.Sum([Value]), type number},{"RowTotal2", each List.Sum([Value2]), type number}})
in
#"Grouped Rows"
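For readers who want to check the arithmetic, the two totals can be sketched per row in Python. This is only an illustration of the rule and not a substitute for the query; the function name is an assumption. It treats any column whose name contains "click" (case-insensitive) as excluded, as the question describes, whereas the M code above tests for the single name "ClickCol1". The country match uses exact equality, as the M code does.

```python
def row_totals(row, key_columns=("Site", "Country")):
    """Return (RowTotal1, RowTotal2) for one row given as a dict.

    RowTotal1 sums every non-key column. RowTotal2 leaves out columns whose
    name contains "click" and the column named after the row's Country.
    """
    values = {name: v for name, v in row.items() if name not in key_columns}
    total1 = sum(values.values())
    excluded = sum(
        v for name, v in values.items()
        if "click" in name.lower() or name == row["Country"]
    )
    return total1, total1 - excluded

# The two example rows from the question
usa = {"Site": "1a", "Country": "USA", "Col1": 2, "Col2": 4,
       "ClickCol1": 8, "Col3": 16, "Germany": 24}
germany = dict(usa, Site="2a", Country="Germany")
usa_totals = row_totals(usa)
germany_totals = row_totals(germany)
```

Working row by row like this avoids holding an unpivoted copy of the whole table in memory, which may matter for the 300+ million row case raised in the revised question.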
But since you have a lot of columns, I should explain the steps:
(Assuming you have these in Excel file) Import them to Power Query
Select "Site" and "Country" columns (with Ctrl), right click > Unpivot Other Columns
Add Column with this formula (you might need to use Advanced Editor): Table.AddColumn(#"Unpivoted Other Columns", "Value2", each if [Country] = [Attribute] or [Attribute] = "ClickCol1" then 0 else [Value])
Select Site and Country columns, Right Click > Group By
Make it look like this:
