Sum multiple rows based on duplicate column data without formula

Sum multiple rows based on duplicate column data without formula - excel

Based on data available in columns A to D (can be any 100's of columns), I want to sum up all the rows for column E to K (can be any 100's of columns)
The rows should sum up based on duplicate data from rows A to D, the result required as below
This is easily possible to do, with sumif, but would like to know if possible natively in excel or power query without creating unique id for each column or using sumif function or formula of any sort

In powerquery .. unpivot, group, pivot, done.
More detail:
Click select first 4 columns, right click, unpivot other columns
Click select first 4 columns and the new Attribute column, right click, group by
Use Operation:Sum on Column:Value name:count and hit OK
Click select Attribute column and transform .. pivot column... , for value column choose count
File Close and load
Full sample code:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"Code1", "Code2", "Code3", "Code4"}, "Attribute", "Value"),
#"Grouped Rows" = Table.Group(#"Unpivoted Other Columns", {"Code1", "Code2", "Code3", "Code4", "Attribute"}, {{"Count", each List.Sum([Value]), type number}}),
#"Pivoted Column" = Table.Pivot(#"Grouped Rows", List.Distinct(#"Grouped Rows"[Attribute]), "Attribute", "Count", List.Sum)
in #"Pivoted Column"

To solve a problem like this, I first do a concrete example and then generalize it. I made a small table in Excel like so:
Code1
Code2
2-Jul-20
3-Jul-20
4-Jul-20
5-Jul-20
6-Jul-20
ERT
EXC
10
6
15
2
ERT
EXC
2
3
23
1
CON
HOR
3
CON
HOR
6
2
356
3
Then I clicked within the table and created a Power Query referencing it. After opening the Power Query Editor, there is a Group By function on the Home tab. It's pretty straightforward to choose the columns you want and the Sum function in a toy example like this.
Then, I opened the Advanced Editor to see what code was auto-generated. It looked something like this:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Grouped Rows orig" = Table.Group(Source, {"Code1", "Code2"}, {{"2-Jul-20", each List.Sum([#"2-Jul-20"]), type nullable number}, {"3-Jul-20", each List.Sum([#"3-Jul-20"]), type nullable number}, {"4-Jul-20", each List.Sum([#"4-Jul-20"]), type nullable number}, {"5-Jul-20", each List.Sum([#"5-Jul-20"]), type nullable number}, {"6-Jul-20", each List.Sum([#"6-Jul-20"]), type nullable number}})
in
#"Grouped Rows orig"
Typically, a Power Query expression is a series of transformations applied to a table, where each one operates on the table as returned from the previous. Here, we start with the original table as "Source" and then do the grouping. The parameters are a little messy, but what we have is: (1) the input table, (2) a list of the column names to group by, and (3) a list of 3-item lists, each of which describe an aggregated column. The sublists have the output column name, the function that does the aggregation, and the data type.
In Power Query, "each" is syntactic sugar for a single parameter function whose parameter is just an underscore. But also, when you have a record or row, you can just use [column] instead of _[column].
So how to generalize the operation you want to do? My first thought is that a convenient grouping function should have two parameters, based on your description. The first is the table to group, and the second is the number of columns starting from the left to group by. If you don't have them arranged contiguously, of course, you could do something else.
sumFromColumn = (t, n) => let
cList = Table.ColumnNames(t),
toGroup = List.FirstN(cList, n),
toSum = List.RemoveFirstN(cList, n),
sumFunc = (cName) => {cName, each List.Sum(Record.Field(_, cName)), type nullable number}
in Table.Group(t, toGroup, List.Transform(toSum, each sumFunc(_))),
#"Grouped Rows" = sumFromColumn(Source, 2), // Group by the first 2 columns and sum the rest
Here is the generalized function I made, which appears to match the original Table.Group operation that was generated by the interface.
The let statement arranges things for readability but does not imply a particular sequence that they happen in. Power Query figures out the dependencies and executes the statements in whatever order is needed.
The list of column names of the table is defined as cList, and split into toGroup and toSum. Then, sumFunc is defined as a function taking a column name and returning the 3-item list needed to define an aggregation operation. In Power Query, functions can return other functions any which way. So here we are defining a function that returns a list, with a function in it. Then we can use List.Transform to take the list of aggregated columns and turn it into the appropriate parameters for Table.Group.
Finally, the actual group by is done with a call like sumFromColumn(Source, 2), which is equivalent to the original statement that hard-codes the column names.
Code1
Code2
2-Jul-20
3-Jul-20
4-Jul-20
5-Jul-20
6-Jul-20
ERT
EXC
12
3
6
38
3
CON
HOR
6
5
356
3
This can easily be changed to sumFromColumn(Source, 1), in which case it will reduce to two rows, but then the second column being non-numeric, will become error values.
Or, you can use sumFromColumn(Source, 3), which will not add things up because the group by columns taken together are distinct.
This way you can easily aggregate any number of columns without caring about their names. I recommend both the Power Query M documentation on microsoft.com and reading about functional programming in general.

Related

Excel: divide text field in to separate columns

I have the following input table:
Sales Order
Asset Serial Number
Asset Model
Licence Class
License Type
License Name
Account Name
10000
1234, 5643, 3463
test-pro
A123
software
LIC-0002, LIC-0188, LIC-0188, LIC-0013
ABC
2000
5678, 9846, 5639
test-pro
A123
software
LIC-00107, LIC-08608, LIC-009, LIC-0610
ABC
Here the screenshot
I need it transformed into form:
.
I tried it first with the Replace function & transponate it but I didn't find a way to add the other empty columns other than do it manually.
My second thought was the text-to-column function, didn't work either.

Here two solutions one using Excel formulas and the other one using Power Query. See Explanation section for more information about each approach:
Excel
It is possible with excel without using Power Query, but several manipulations are required. On cell I2 put the following formula:
=LET(counts, BYROW(F2:F3, LAMBDA(a, LEN(a) - LEN(SUBSTITUTE(a, ",", "")))), del, "|",
emptyRowsSet, MAP(A2:A3, B2:B3, C2:C3, D2:D3, E2:E3, F2:F3, G2:G3, counts,
LAMBDA(a,b,c,d,e,f,g,cnts, LET(rep, REPT(";",cnts),a&rep &del& b&rep &del& c&rep &del&
d&rep &del& e&rep &del& SUBSTITUTE(f,", ",";") &del& g&rep ))),
emptyRowsSetByCol, TEXTSPLIT(TEXTJOIN("&",,emptyRowsSet), del, "&"),
byColResult, BYCOL(emptyRowsSetByCol, LAMBDA(a, TEXTJOIN(";",,a))),
singleLine, TEXTJOIN(del,,byColResult),
TRANSPOSE(TEXTSPLIT(singleLine,";",del))
)
Here is the output:
Update
A simplified version of previous formula is the following one:
=LET(counts, BYROW(F2:F3, LAMBDA(a, LEN(a) - LEN(SUBSTITUTE(a, ",", "")))), del, "|",
reps, MAKEARRAY(ROWS(A2:G3),COLUMNS(A2:G3), LAMBDA(a,b, INDEX(counts, a,1))),
emptyRowsSetByCol, MAP(A2:G3, reps, LAMBDA(a,b, IF(COLUMN(a)=6,
SUBSTITUTE(a,", ",";"), a&REPT(";",b)))),
byColResult, BYCOL(emptyRowsSetByCol, LAMBDA(a, TEXTJOIN(";",,a))),
singleLine, TEXTJOIN(del,,byColResult),
TRANSPOSE(TEXTSPLIT(singleLine,";",del))
)
Power Query
The following M Code provides the expected result:
let
Source = Excel.CurrentWorkbook(){[Name="TB_Sales"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Sales Order", type text}}),
#"Split License Name" = Table.ExpandListColumn(Table.TransformColumns(#"Changed Type", {{"License Name",
Splitter.SplitTextByDelimiter(", ", QuoteStyle.Csv),
let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "License Name"),
ListOfColumns = List.Difference(Table.ColumnNames(#"Split License Name"), {"License Name"}),
RemainingColumns = List.Difference(Table.ColumnNames(#"Changed Type"), ListOfColumns),
RemoveDups = (lst as list) =>
let
concatList = (left as list, right as list) => List.Transform(List.Positions(left), each left{_}&"_"& right{_}),
prefixList = Table.Column(#"Split License Name", "Sales Order"),
tmp = concatList(prefixList, lst),
output = List.Accumulate(tmp, {}, (x, y) => x & {if List.Contains(x, y) then null else y})
in
output,
replaceValues = List.Transform(ListOfColumns, each RemoveDups(Table.Column(#"Split License Name", _))),
#"Added Empty Rows" = Table.FromColumns(
replaceValues & Table.ToColumns(Table.SelectColumns(#"Split License Name", RemainingColumns)),
ListOfColumns & RemainingColumns),
#"Extracted Text After Delimiter" = Table.TransformColumns(#"Added Empty Rows", {{"Sales Order",
each Text.AfterDelimiter(_, "_"), type text}, {"Asset Serial Number", each Text.AfterDelimiter(_, "_"), type text},
{"Asset Model", each Text.AfterDelimiter(_, "_"), type text}, {"Licence Class",
each Text.AfterDelimiter(_, "_"), type text}, {"License Type", each Text.AfterDelimiter(_, "_"), type text},
{"Account Name", each Text.AfterDelimiter(_, "_"), type text}}),
#"Reordered Columns" = Table.ReorderColumns(#"Extracted Text After Delimiter",{"Sales Order", "Asset Serial Number", "Asset Model",
"Licence Class", "License Type", "License Name", "Account Name"})
in
#"Reordered Columns"
And here is the output:
And the corresponding Excel Output:
Explanation
Here we provide the explanation for each approach: Excel formula and Power Query.
Excel Formula
We need to calculate how many empty rows we need to add based on License Name column values. We achieve that via counts name from LET:
BYROW(F2:F3, LAMBDA(a, LEN(a) - LEN(SUBSTITUTE(a, ",", ""))))
The output for this case is: {3;3}, i.e 2x1 array, which represents how many empty rows we need to add for each input row.
Next we need to build the set that includes empty rows. We name it emptyRowsSet and the calculation is as follow:
MAP(A2:A3, B2:B3, C2:C3, D2:D3, E2:E3, F2:F3, G2:G3, counts,
LAMBDA(a,b,c,d,e,f,g,cnts,
LET(rep, REPT(";",cnts),a&rep &del& b&rep &del& c&rep &del&
d&rep &del& e&rep &del& SUBSTITUTE(f,", ",";") &del& g&rep)))
We use inside MAP an additional LET function to avoid repetition of rep value. Because we want to consider the content of License Name as additional rows we replace the , by ; (we are going to consider this token as a row delimiter). We use del (|) as a delimiter that will serve as a column delimiter.
Here would be the intermediate result of emptyRowsSet:
10000;;;|1234, 5643, 3463;;;|test-pro;;;|A123;;;|software;;;|LIC-0002;LIC-0188;LIC-0188;LIC-0013|ABC;;;
2000;;;|5678, 9846, 5639;;;|test-pro;;;|A123;;;|software;;;|LIC-00107;LIC-08608;LIC-009;LIC-0610|ABC;;;
As you can see additional ; where added per number of items we have in License Name column per row. In the sample data the number of empty rows to add is the same per row, but it could be different.
The rest is how to accommodate the content of emptyRowsSet in the way we want. Because we cannot invoke TEXTSPLIT and BYROW together because we get #CALC! (Nested Array error). We need to try to circumvent this.
For example the following produces an error (#CALC!):
=BYROW(A1:A2,LAMBDA(a, TEXTSPLIT(a,"|")))
where the range A1:A2 has the following: ={"a|b";"c|d"}. We don't get the desired output: ={"a","b";"c","d"}. In short the output of BYROW should be a single column so any LAMBDA function that expands the columns will not work.
In order to do circumvent that we can do the following:
Convert the input into a single string joining each row by ; for example. Now we have column delimiter (|) and row delimiter (;)
Use TEXTSPLIT to generate the array (2x2 in this case), identifying the columns and the row via both delimiters.
We can do it as follow (showing the output of each step on the right)
=TEXTSPLIT(TEXTJOIN(";",,A1:A2),"|",";") -> 1) "a|b;c|d" -> 2) ={"a","b";"c","d"}
We are using the same idea here (but using & for joining each row). The name emptyRowsSetByCol:
TEXTSPLIT(TEXTJOIN("&",,emptyRowsSet), del, "&")
Would produce the following intermediate result, now organized by columns (Table 1):
Sales Order
Asset Serial Number
Asset Model
License Class
License Type
License Name
Account Name
10000;;;
1234, 5643, 3463;;;
test-pro;;;
A123;;;
software;;;
LIC-0002;LIC-0188;LIC-0188;LIC-0013
ABC;;;
2000;;;
5678, 9846, 5639;;;
test-pro;;;
A123;;;
software;;;
LIC-00107;LIC-08608;LIC-009;LIC-0610
ABC;;;
Note: The header are just for illustrative purpose, but it is not part of the output.
Now we need to concatenate the information per column and for that we can use BYCOL function. We name the result: byColResult of the following formula:
BYCOL(emptyRowsSetByCol, LAMBDA(a, TEXTJOIN(";",,a)))
The intermediate result would be:
Sales Order
Asset Serial Number
Asset Model
License Class
License Type
License Name
Account Name
10000;;;;2000;;;
1234, 5643, 3463;;;;5678, 9846, 5639;;;
test-pro;;;;test-pro;;;
A123;;;;A123;;;
software;;;;software;;;
LIC-0002;LIC-0188;LIC-0188;LIC-0013;LIC-00107;LIC-08608;LIC-009;LIC-0610
ABC;;;;ABC;;;
1x7 array and on each column the content already delimited by ; (ready for the final split).
Now we need to apply the same idea as before i.e. convert everything to a single string and then split it again.
First we convert everything to a single string and name the result: singleLine:
TEXTJOIN(del,,byColResult)
Next we need to do the final split:
TRANSPOSE(TEXTSPLIT(singleLine,";",del))
We need to transpose the result because SPLIT processes the information row by row.
Update
I provided a simplified version of the initial approach which requires less steps, because we can obtain the result of the MAP function directly by columns.
The main idea is to treat the input range A2:G3 all at once. In order to do that we need to have all the MAP input arrays of the same shape. Because we need to take into account the number of empty rows to add (;), we need to build this second array of the same shape. The name reps, is intended to create this second array as follow:
MAKEARRAY(ROWS(A2:G3),COLUMNS(A2:G3),
LAMBDA(a,b, INDEX(counts, a,1)))
The intermediate output will be:
3|3|3|3|3|3|3
3|3|3|3|3|3|3
which represents a 2x7 array, where on each row we have the number of empty rows to add.
Now the name emptyRowsSetByCol:
MAP(A2:G3, reps,
LAMBDA(a,b, IF(COLUMN(a)=6, SUBSTITUTE(a,", ",";"),
a&REPT(";",b))))
Produces the same intermediate result as in above Table 1. We treat different the information from column 6 (License Name) replacing the , with ;. For other columns just add as many ; as empty rows we need to add for each input row. The rest of the formula is just similar to the first approach.
Power Query
#"Split License Name" is a standard Power Query (PQ) UI function: Split Column by Delimiter.
To generate empty rows we do it by removing duplicates elements on each column that requires this transformation, i.e. all columns except License Name. We do it all at once identifying the columns that require such transformation. In order to do that we define two lists:
ListOfColumns: Identifies the columns we are going to do the transformation, because we need to do it in all columns except for License Name. We do it by difference via the PQ function: List.Difference().
RemainingColumns: To build back again the table, we need to identify the columns don't require such transformation. We use same idea via List.Difference(), based on ListOfColumns list.
The user defined function RemoveDups(lst as list) does the magic of this transformation.
Because we need to remove duplicates, but having unique elements based on each initial row, we use the first column Sales Order as a prefix, so we can "clean" the column within each partition.
In order to do that we define inside of RemoveDups() function a new user defined function concatList() to add the first column as prefix.
concatList = (left as list, right as list) =>
List.Transform(List.Positions(left), each left{_}&"-"& right{_}),
we concatenate each element of the lists (row by row) using a underscore delimiter (_). Later we are going to use this delimiter to remove the first column as prefix added at this point.
To remove duplicates and replace them with null we use the following logic:
output = List.Accumulate(tmp, {}, (x, y) =>
x & {if List.Contains(x, y) then null else y})
where tmp is a modified list (lst) with the first column as prefix.
Now we invoke the List.Transform() function for all the columns that require the transformation using as transform (second input argument) the function we just defined previously:
replaceValues = List.Transform(ListOfColumns, each
RemoveDups(Table.Column(#"Split License Name", _))),
#"Added Empty Rows" represents the step of this calculation and the output will be the following table:
The step #"Extracted Text After Delimiter" is just to remove the prefix we added and for that we use standard PQ UI Transform->Extract->Text After Delimiter.
Finally we need to reorder the column to put in a way it is expected via the step: #"Reordered Columns" using PQ UI functionality.

How to convert categorical values into columns in Excel?

I am working with a dataset that is structured like the one below. As you can see, the indicator column contains binary categorical data.
country_code indicator cumulative_count
AFG cases 52909
AFG deaths 2230
... ... ...
I would like to turn the indicator column into two separate columns (corresponding with the values of indicator: cases and deaths). I.e. I'm expecting the final result to be like this:
country_code cases deaths
AFG 52909 2230
... ... ...
Notes:
The original dataset is publically accessible from ECDC website.
I am only interested in the cumulative_count of one specific year_week (2020-53).
Here is a screenshot of the dataset:

This can also be accomplished using Power Query, available in Windows Excel 2010+ and Excel 365 (Windows or Mac)
To use Power Query
Load your data table into Excel
Select some cell in your Data Table
Data => Get&Transform => from Table/Range or from within sheet
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to understand the algorithm
let
//Read in the table
//Change table name in next line to your actual table name
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//Remove the unneeded columns
#"Removed Other Columns" = Table.SelectColumns(Source,{"country_code", "indicator", "year_week", "cumulative_count"}),
//Set the data types for those columns
#"Set Data Type" = Table.TransformColumnTypes(#"Removed Other Columns",{
{"country_code", type text}, {"indicator", type text},{"year_week", type text},{"cumulative_count", Int64.Type}
}),
//Pivot the Indicator column and aggregate by Sum
#"Pivoted Column" = Table.Pivot(#"Set Data Type",
List.Distinct(#"Removed Other Columns"[indicator]), "indicator", "cumulative_count", List.Sum),
//Filter to show only the relevant year-week for rows where thiere is a country_code
// (the others refer to continents)
#"Filtered Rows" = Table.SelectRows(#"Pivoted Column", each ([country_code] <> null) and ([year_week] = "2020-53"))
in
#"Filtered Rows"
filtered to show just 2020-53

If I'm understanding your question correctly. one way:
Add new column F
Formula in $F$2: sumifs($D2:$D$9999, $B2:$B$9999, $B2, $E2:$E$9999, "deaths")
copy formula down through end record
filter column E for "cases"
if you then insert rows above the header row, you can use Subtotal(109, ...) to view cumulative counts for a specific year, or alternatively add another column with Sumif as shown above

How to combine multiple columns from a table

My issue is the following: I have a table where I have multiple columns that have date and values but represent different things. Here is an example for my headers:
I Customer name I Type of Service I Payment 1 date I Payment 1 amount I Payment 2 date I Payment 2 amount I Payment 3 date I Payment 3 amount I Payment 4 date I Payment 4 amount I
What I want to do is sumifs the table based on multiple criteria. For example:
I Type of Service I Month 1 I Month 2 I Month 3 I Month 4
Service 1
Service 2
Service 3
The thing is that I do not want to write 4 sumifs (in this case, but in fact I have more that 4 sets of date:value columns).
I was thinking of creating a new table where I could put all the columns below each other (in one table with 4 columns - Customer name, Type of Service, Date and Payment) but the table should be dynamically created, meaning that it should be expanded dynamically with the new entries in the original table (i.e. if the original table has 200 entries, this would make the new table with 4x200=800 entries, if the original table has one more record then the new table should have 4x201=804 records).
I also checked the PowerQuery option but could not get my head around it.
So any help on the matter will be highly appreciated.
Thank you.

You can certainly create your four column table using Power Query. However, I suspect you may be able to also generate your final report using PQ, so you could add that to this code, if you wish.
And it will update but would require a "Refresh" to do the updating.
The "Refresh" could be triggered by
User selecting the Data/Refresh option
A button on the worksheet which user would have to press.
A VBA event-triggered macro
In any event, in order to make the query adaptable to different numbers of columns requires more M-Code than can be generated from the UI, a well as a custom function.
The algorithm below depends on the data being in this format:
Columns 1 and 2 would be Customer | Type of Service
Remaining columns would alternate between Date | Amount and be Labelled: Payment N Date | Payment N Amount where N is some number
If the real data is not in that format, some changes to the code may be necessary.
To use Power Query:
Select some cell in your Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to understand the algorithm
To enter the Custom Function, while in the PQ Editord
Right click in the Queries Pane
Add New Query from Blank Query
Paste the custom function code into the Advanced Editor
rename the Query fnPivotAll
M Code
let
//Change Table name in next line to be the Actual table name in your workbook
Source = Excel.CurrentWorkbook(){[Name="Table8"]}[Content],
/*set datatypes dynamically with
first two columns as Text
and subsequent columns alternating as Date and Currency*/
textType = List.Transform(List.FirstN(Table.ColumnNames(Source),2), each {_,Text.Type}),
otherType = List.RemoveFirstN(Table.ColumnNames(Source),2),
dateType = List.Transform(
List.Alternate(otherType,1,1,1), each {_, Date.Type}),
currType = List.Transform(
List.Alternate(otherType,1,1,0), each {_, Currency.Type}),
colTypes = List.Combine({textType, dateType, currType}),
typeIt = Table.TransformColumnTypes(Source,colTypes),
//Unpivot all except first two columns
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(typeIt, List.FirstN(Table.ColumnNames(Source),2), "Attribute", "Value"),
//Remove "Payment n " from attribute column
remPmtN = Table.TransformColumns(#"Unpivoted Other Columns",{{"Attribute", each Text.Split(_," "){2}, Text.Type}}),
//Pivot on the Attribute column without aggregation using Custom Function
pivotAll = fnPivotAll(remPmtN,"Attribute","Value"),
typeIt2 = Table.TransformColumnTypes(pivotAll,{{"date", Date.Type},{"amount", Currency.Type}})
in
typeIt2
Custom Function: fnPivotAll
//credit: Cam Wallace https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/
(Source as table,
ColToPivot as text,
ColForValues as text)=>
let
PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
#"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
TableFromRecordOfLists = (rec as record, fieldnames as list) =>
let
PartialRecord = Record.SelectFields(rec,fieldnames),
RecordToList = Record.ToList(PartialRecord),
Table = Table.FromColumns(RecordToList,fieldnames)
in
Table,
#"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
#"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
#"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
#"Expanded Values"
Sample Data
Output
If this does not give you what you require, or if you have issues going further with it to generate your desired reports, post back.

How to use Countifs,Or and Sumproduct efficiently

I have a list of accounts with 2 digit modifiers. Some accounts will have more then one modifier. I am looking for accounts with a certain combinations of modifiers.
So I have a list of accounts in the B column.
I have the modifiers in C Column
Example
Act # Modifier
111 80
111 56
111
222 55
222
333 51
333 50
333
I have some working code that works great until I get to many rows.
In this sample formula I have 8 Modifier groups.
50,22,51,62
51,22,62
54,50,51
55,50,51
56,50,51
80,50,51
"AS",50,51
59,50
=IF(OR(SUMPRODUCT(COUNTIFS(B:B,B3,C:C{50,22,51,62}))>=2,SUMPRODUCT(COUNTIFS(B:B,B3,C:C,{51,22,62}))>=2,SUMPRODUCT(COUNTIFS(B:{54,50,51}))>=2,SUMPRODUCT(COUNTIFS(B:B,B3,C:C,{55,50,51}))>=2,SUMPRODUCT(COUNTIFS(B:B,B3,C:C,{56,50,51}))>=2,SUMPRODUCT(COUNTIFS(B:B,B3,C:C,{80,50,51}))>=2,SUMPRODUCT(COUNTIFS(B:B,B3,C:C,{"AS",50,51}))>=2,SUMPRODUCT(COUNTIFS(B:B,B3,C:C,{59,50}))>=2),"Check","")
This code will put check by any account that has 2 or more of the modifiers from any of the 8 groups. It has to be 2 modifiers from the same group though.
I was just wondering if there is a better way to write this? Instead of doing all these or can I just do OR for the different modifier criteria I am looking for?
Something like
=COUNTIFS(H:H,H5,I:I,OR({59,50},{"AS",50,51}))

As requested by #SkysLastChance, I will post my solution using Power Query (PQ) even though the question was tagged to Excel-Formula.
Please note you MUST use Excel 2010 or later versions otherwise you will not be able to use Power Query. My answer might not be robust enough for people who has not used PQ before. So feel free to leave a question if you are unclear with any particular step.
Step 1
Convert the Account List and Modifier Group in the example into Table in your excel worksheet. One way of doing that is to highlight the data including headers and press Ctrl+T. Then you should get two tables as shown below. I have named the first table as Tbl_ActList, and named the second one as Tbl_MoGrp.
Please note I have added some data to the Account List table for result testing purpose.
Step 2
Select any cell within a table, go to the Data tab on top of your excel (mine is Excel 2016), click From Table in the Get & Transform section. It will load and add the table to the built-in PQ Editor. You can exit the editor (and keep the changes), and repeat this step to add the second table to the PQ Editor. Alternatively you can add a new query in the PQ Editor and find the second table from your excel worksheet. I will not demonstrate this process as you can google the know-how later on.
Step 3
Once you have added both tables to the editor, you can start editing/transforming data in each table/query using built-in functions and/or advanced coding. In this case I only used built-in functions.
For the Modifier Group table, I want to transform the original data into a 2-Column list with one column showing which Group the modifier belongs to, and the other column showing a single modifier.
Firstly, use the Split Column function in the Transform tab to split the original modifier groups into single value by using , (comma) as the delimiter.
The new table is in matrix structure which is no ideal for look up purpose, so I used Unpivot Columns function in the Transform tab to convert it into list structure. What I actually did is to highlight the Grp column and select Unpivot Other Columns to get the list. Alternatively you can highlight the first four columns and use Unpivot Columns to get the same list.
Then I renamed Value column as Modifier, and removed the Attribute column to end up a 2-Column table.
Please note all data in each table/query in this example have been set to 'Text' format (aka data type). Data type is very sensitive and specific in PQ, and incorrect data type may lead to error.
Here is the full code behind the scene. All steps are performed using the built-in functions without any advanced coding:
let
Source = Excel.CurrentWorkbook(){[Name="Tbl_MoGrp"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Modifier", type text}, {"Grp", type text}}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Modifier", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Modifier.1", "Modifier.2", "Modifier.3", "Modifier.4"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Modifier.1", type text}, {"Modifier.2", type text}, {"Modifier.3", type text}, {"Modifier.4", type text}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type1", {"Grp"}, "Attribute", "Value"),
#"Renamed Columns" = Table.RenameColumns(#"Unpivoted Other Columns",{{"Value", "Modifier"}}),
#"Removed Columns" = Table.RemoveColumns(#"Renamed Columns",{"Attribute"})
in
#"Removed Columns"
Step 4
With the Modifier Group list ready, we can look up the modifier group in the Account List table for each modifier using Merge Queries function in the Home tab. The logic is to find the link between two tables to conduct a look up.
Firstly, select/highlight the column (Modifier) that contains the look up value in the origin table (Tbl_ActList), then select the table (Tbl_MoGrp) that you want to look up from, then select/highlight the corresponding column (Modifier) in the second table, and then click OK to continue.
Please note before merging I have filtered the Modifier column in the Account List table to get rid of cells showing null (blank) as they are not useful for the look up.
After merging the queries there is a new column added to the Account List table. It may look like a column but it contains all data from the Modifier Group table stored in Grp column and Modifier column. As we want to look up the modifier group only, we can Expand the column to show the Grp column only.
Click on the little square box on the right hand side of the header of the last column to trigger the Expand function, then select the Grp column only and click OK to continue.
Now we have a table showing account number, modifier, and modifier group. We can then use the Group By function in the Home tab to find out for each account number how many modifiers have appeared in each applicable modifier group.
Please See below screenshot for the settings for the Group By function.
Then I sorted the table ascending by Acc # column, and filtered the Count column to show values greater than or equal to 2, i.e. at least 2 modifies linked to that account number have appeared in a modifier group.
Here is the full code behind the scene:
let
Source = Excel.CurrentWorkbook(){[Name="Tbl_ActList"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Act #", type text}, {"Modifier", type text}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Modifier] <> null)),
#"Merged Queries" = Table.NestedJoin(#"Filtered Rows", {"Modifier"}, Tbl_MoGrp, {"Modifier"}, "Tbl_Grp", JoinKind.LeftOuter),
#"Expanded Tbl_Grp" = Table.ExpandTableColumn(#"Merged Queries", "Tbl_Grp", {"Grp"}, {"Grp"}),
#"Grouped Rows" = Table.Group(#"Expanded Tbl_Grp", {"Act #", "Grp"}, {{"Count", each Table.RowCount(_), type number}}),
#"Sorted Rows" = Table.Sort(#"Grouped Rows",{{"Act #", Order.Ascending}}),
#"Filtered Rows1" = Table.SelectRows(#"Sorted Rows", each [Count] >= 2)
in
#"Filtered Rows1"
Step 5
The answer could stop at Step 4 as the table has shown the account number that we are looking for. However if there are thousands of account numbers, then it is better to Remove Other Columns except the Act # column, and Remove Duplicates within the column, and then Close & Load the result to a new worksheet. The final result may look like this:
A tip here, before Close & Load any query for the first time, it is better to set the following in your Query Options. It will prevent PQ Editor to load each of your queries to a separate worksheet by default. Just imaging how long it will take if you have 20 queries in your PQ Editor and each of them have more than a thousand lines of data.
Once you change the default option, PQ Editor will only create connections for your queries after you click Close & Load, and you can manually load a specific query result to a worksheet as shown below:
Conclusion
I believe if this question was tagged as a PowerQuery, there may be more concise or 'fancier' answers than mine. Regardless, the things that I like PQ the most are it is a built-in function of excel (2010 and later versions), it is scalable, replicable and more powerful when it comes to data cleansing and transforming.
Cheers :)

How can I have the informations from 2 differents columns represented as 1 in a pivot table?

In my data, I have 2 columns who represent a country visited before and a country visited after the cities that I am studying.
Here's a picture of my data sample: https://i.imgur.com/kS4K9uK.png
I'd like to represent in my pivot table all the countries linked to each city (so before and after the city). I'd like to have the cities in my line and all the countries who can possibly be visited before and after as my columns and the count of those in my values.
Here is a picture of what I'd like to achieve, but I can only do it for one of the columns (country after in that case). I'd like the same format but having the data of both before and after (but it's important to know that it's not necessarily the same countries in the 2 columns so I can't just have one of the country columns as the head and both as the values): https://i.imgur.com/PUjhSmB.png
When I place the cities in the line and the 2 country columns in value and columns, it is so difficult to read the table as the before and after are all separate and might even be counted as a pair. and if they are not in the pivot table column they only give me the count of countries before and after but not by the countries, which is not what I'm looking for.
Here is a picture of the result of the pivot table: https://i.imgur.com/3j4BD3k.png
I also tried to create a new field by doing «Country before» + «Country after» but it doesn't seem to work as the data is in text.

Ok I think understand the output now. You essentially want a count of the number of occurrences of each country in columns B+C, grouped by the city. I'll provide a few ways so you can select what suits you best.
Simplest method
The easiest way I can think of is simply paste the second column under the first column and then pivot on this new table.
COUNTIF
A more repeatable way would be to essentially make your own pivot table and use the COUNTIF function to count the instances of each country.
=COUNTIFS($A$1:$A$6,$F2,$B$1:$B$6,G$1)+COUNTIFS($A$1:$A$6,$F2,$C$1:$C$6,G$1)
Power Query
The most repeatable way is to use PowerQuery. This will enable you to refresh the data at the click of a button. To do this (assuming you have excel 2016) go to the Data tab and, with you data selected click "From Table/Range". The Power Query window will open. On the top left of the screen will be a button with advanced editor. When you open it you'll see the following code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"City", type text}, {"Country Before", type text}, {"Country After", type text}})
in
#"Changed Type"
Replace the code with the following code. Note that your table may be called something different. You can see what it's called on the second line of the code. The code below uses "Table1" - you can replace this with the name of your table.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"City", type text}, {"Country Before", type text}, {"Country After", type text}}),
#"Before" = Table.RemoveColumns(#"Changed Type",{"Country Before"}),
#"After" = Table.RemoveColumns(#"Changed Type",{"Country After"}),
#"Append" = Table.Combine({#"Before",#"After"}),
#"Inserted Merged Column" = Table.AddColumn(Append, "Country", each Text.Combine({[Country After], [Country Before]}, ""), type text),
#"Removed Columns" = Table.RemoveColumns(#"Inserted Merged Column",{"Country After", "Country Before"})
in
#"Removed Columns"
Hope that helps.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Sum multiple rows based on duplicate column data without formula - excel

Related

Excel: divide text field in to separate columns

How to convert categorical values into columns in Excel?

How to combine multiple columns from a table

How to use Countifs,Or and Sumproduct efficiently

How can I have the informations from 2 differents columns represented as 1 in a pivot table?

Categories

Resources