I have the following:
Each table represents a game (in this case of CS:GO).
What I want to do is get the sum of all kills, by all players, for each map, like:
Train: 208
Mirage: 103
I'm having trouble separating the totals by map. I can do this in either Google Sheets or Excel.
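To pin down the calculation being asked for, here is a minimal Python sketch of the same group-and-sum. The rows are made-up sample data (the real tables are only in the screenshot), and `kills_per_map` is a hypothetical helper name, not part of any spreadsheet API.

```python
from collections import defaultdict

def kills_per_map(rows):
    """Sum kills over all players, grouped by map name."""
    totals = defaultdict(int)
    for map_name, player_kills in rows:
        totals[map_name] += sum(player_kills)
    return dict(totals)

# Hypothetical sample: one entry per game, five players each.
games = [
    ("Train", [25, 22, 20, 19, 18]),
    ("Mirage", [24, 23, 21, 18, 17]),
    ("Train", [26, 23, 21, 18, 16]),
]
print(kills_per_map(games))  # {'Train': 208, 'Mirage': 103}
```

Any spreadsheet formula for this has to do the same two things: attach a map name to every row of kills, then sum per map name.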
=QUERY(QUERY(ARRAYFORMULA(SPLIT(TRANSPOSE(SPLIT(SUBSTITUTE(TEXTJOIN(" ", 1, B:B),
"Map", "♦"), "♦")), " ")),
"select Col1, Col3+Col4+Col5+Col6+Col7"),
"select Col1, sum(Col2) group by Col1 label sum(Col2)''")
Related
In the example below I want to search each unique order and then the items in that order. From that I would like to extract the most common items that are ordered together and how many times they occur together. This is just a sample. I am doing this with a file with 20,000 rows.
Sorry, I haven't earned enough points to embed the photo. It's in the link below.
Screenshot of the example
Use this to get the occurrences with a single formula in one cell.
=ArrayFormula({ "Occurrences",$B$1:$F$1;
QUERY({COUNTIF(
B2:B&C2:C&D2:D&E2:E&F2:F,
"="&QUERY({ARRAY_CONSTRAIN(UNIQUE($B$2:$F),ROWS(UNIQUE($B$2:$F))-1,COLUMNS(UNIQUE($B$2:$F)))}, " Select Col1 ")&
QUERY({ARRAY_CONSTRAIN(UNIQUE($B$2:$F),ROWS(UNIQUE($B$2:$F))-1,COLUMNS(UNIQUE($B$2:$F)))}, " Select Col2 ")&
QUERY({ARRAY_CONSTRAIN(UNIQUE($B$2:$F),ROWS(UNIQUE($B$2:$F))-1,COLUMNS(UNIQUE($B$2:$F)))}, " Select Col3 ")&
QUERY({ARRAY_CONSTRAIN(UNIQUE($B$2:$F),ROWS(UNIQUE($B$2:$F))-1,COLUMNS(UNIQUE($B$2:$F)))}, " Select Col4 ")&
QUERY({ARRAY_CONSTRAIN(UNIQUE($B$2:$F),ROWS(UNIQUE($B$2:$F))-1,COLUMNS(UNIQUE($B$2:$F)))}, " Select Col5 "))
}, "Select Col1 where Col1 <> 0 "),
ARRAY_CONSTRAIN(UNIQUE($B$2:$F),ROWS(UNIQUE($B$2:$F))-1,COLUMNS(UNIQUE($B$2:$F))) })
Option 02
=ArrayFormula({ "Occurrences",$B$1:$F$1;
QUERY({ARRAY_CONSTRAIN(COUNTIF(
FLATTEN(QUERY(TRANSPOSE(B2:F), "",9^9 )),
"="&FLATTEN(QUERY(TRANSPOSE(ARRAY_CONSTRAIN(UNIQUE($B$2:$F),ROWS(UNIQUE($B$2:$F))-1,COLUMNS(UNIQUE($B$2:$F)))), "",9^9 ))),
COUNTA(FLATTEN(QUERY(TRANSPOSE(ARRAY_CONSTRAIN(UNIQUE($B$2:$F),ROWS(UNIQUE($B$2:$F))-1,COLUMNS(UNIQUE($B$2:$F)))), "",9^9 ))),1)
}, "Select Col1 where Col1 <> 0 "),
ARRAY_CONSTRAIN(UNIQUE($B$2:$F),ROWS(UNIQUE($B$2:$F))-1,COLUMNS(UNIQUE($B$2:$F))) })
I hope that helped ^_^
Alternate Solution (with Helper Columns):
Although the other posted solution works, it does not treat two orders as the same combination when their items appear in a different order. For example:
These would each be counted once, even though they are the same combination.
So here's another solution if you don't mind using helper columns:
1.) Use this formula in 1 column to combine all items in the order:
=TEXTJOIN(", ", TRUE, SORT(TRANSPOSE(E2:I2), 1, TRUE))
Drag the formula down the column.
This uses SORT() function to first sort the items alphabetically before using TEXTJOIN() function to concatenate the items into one cell. This is so that it will not matter even if the items are interchanged.
2.) Use the UNIQUE() function to remove the duplicates.
=UNIQUE(K2:K15)
3.) Use COUNTIF() to count the number of occurrences, wrap it in IF() so it only applies to rows that are not blank, and wrap that in ArrayFormula() so you only need to enter the formula in the first row instead of dragging it down the column.
=ARRAYFORMULA(IF(L2:L<>"",COUNTIF(K2:K,L2:L),""))
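If it helps to see the logic outside the sheet, the three steps above (sort and join, de-duplicate, count) can be sketched in Python. The orders below are made-up sample data and `count_combinations` is a hypothetical name; note that Python's `sorted` is case-sensitive, which may differ from how the sheet sorts.

```python
from collections import Counter

def count_combinations(orders):
    """Count orders that contain exactly the same set of items."""
    counts = Counter()
    for items in orders:
        # Drop blanks, then sort so the item order does not matter.
        key = ", ".join(sorted(item for item in items if item))
        if key:
            counts[key] += 1
    return counts

# Hypothetical sample orders; empty strings stand for blank cells.
orders = [
    ["Apple", "Banana", ""],
    ["Banana", "Apple", ""],
    ["Apple", "Cherry", "Banana"],
]
for combo, n in count_combinations(orders).most_common():
    print(n, combo)
```

The `Counter` plays the role of both the UNIQUE() column and the COUNTIF() column.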
Final Result:
Limitation:
This can't count two orders as the same combination unless their full item lists match. For example:
They will be counted as 1 each.
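One possible way around this limitation, sketched in Python rather than as a sheet formula, is to count pairs of items instead of whole orders, so a pair is counted even when the rest of the order differs. This is a swapped-in technique, not the method above; the sample data and the name `count_pairs` are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

def count_pairs(orders):
    """Count how often each unordered pair of items shares an order."""
    pairs = Counter()
    for items in orders:
        # De-duplicate and sort so ("Apple", "Banana") has one spelling.
        unique_items = sorted({item for item in items if item})
        pairs.update(combinations(unique_items, 2))
    return pairs

# Hypothetical sample orders.
orders = [
    ["Apple", "Banana"],
    ["Banana", "Apple", "Cherry"],
    ["Cherry", "Apple"],
]
for pair, n in count_pairs(orders).most_common():
    print(n, pair)
```

With 20,000 rows and only a handful of items per order this stays cheap, since each order contributes at most a few pairs.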
References:
Concatenate and Alphabetize
If Not Empty
Try this and notice the blue cells:
=ARRAYFORMULA(TRIM(SPLIT(FLATTEN(QUERY(TRANSPOSE(QUERY(QUERY(QUERY(TRIM(SPLIT(FLATTEN(
QUERY(QUERY(IFERROR(SPLIT(FLATTEN(IF(E2:I="",,ROW(E2:I)&"♠♦"&PROPER(E2:I)&"♥")), "♦")),
"select max(Col2) where Col1 <> '♠' group by Col2 pivot Col1"),,9^9)), "♠")),
"select count(Col2),Col2 where Col2 is not null group by Col2 order by count(Col2) desc"),
"select Col1,'♥',Col2"), "offset 1", )),,9^9)), "♥")))
Solution with PowerQuery
You can add as many item columns as you want (the column name must contain the word "Item", e.g. "Item 6", "Item 7", "Last Item", "My Item", "Special Item" ...)
You do not have to adjust any range in a cell formula
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Inserted Merged Column" = Table.AddColumn(
Source,
"Combination",
each Text.Combine(
List.Sort(
List.Transform(
List.Select(
Table.ColumnNames(Source),
each Text.Contains(_,"Item")
)
, (col)=> Record.Field(_, col)
)
),
"; "), type text
),
#"Grouped Rows" = Table.Group(#"Inserted Merged Column", {"Combination"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
#"Sorted Rows" = Table.Sort(#"Grouped Rows",{{"Count", Order.Descending}})
in
#"Sorted Rows"
It's a tough one, since there are multiple combinations and the order sequence matters. A partial answer covering only the first two items would be:
Formulas Layout
In Cell K2 =E2&" "&F2
In Cell M2 =COUNTIF($E:$I,L2)
In Cell O2 =COUNTIFS($K:$K,$L2&" "&O$1)
That only counts the first two items of each order, in a matrix-style layout. I also added conditional formatting to highlight the higher numbers in the matrix.
I am trying to run a QUERY function in Google Sheets for every column separately, but it automatically sorts the output alphabetically, as shown in the image below, which I don't want. Also, it does not work for numeric entries, as in columns H and I (possibly because of the ISNUMBER function). Please help.
Image
My function (applied separately to every column) is:
=ARRAYFORMULA(INDEX(QUERY({INDEX(QUERY(A2:B,
"select A, count(A) where A is not null group by A pivot B", 0), , 1),
REGEXREPLACE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(IF(ISNUMBER(QUERY(A2:B,
"select count(A) where A is not null group by A pivot B", 0)), INDEX(QUERY({A2:A,B2:B&";"},
"select count(Col1) where Col1 is not null group by Col1 pivot Col2 offset 1", 0), 1,), ))
, , 999^99))), ";$", )}, "offset 1", 0), , 1))
Try
={A1:D1;
ARRAY_CONSTRAIN(transpose({
transpose(unique(A2:A));
arrayformula(regexreplace(trim(query(arrayformula(if(A2:A=transpose(unique(A2:A)),B2:B&",",)),,9^9)),"[,\s]+$",""))
}),counta(unique(A2:A)),2),
ARRAY_CONSTRAIN(transpose({
arrayformula(regexreplace(trim(query(arrayformula(if(A2:A=transpose(unique(A2:A)),C2:C&",",)),,9^9)),"[,\s]+$",""))
}),counta(unique(A2:A)),2),
ARRAY_CONSTRAIN(transpose({
arrayformula(regexreplace(trim(query(arrayformula(if(A2:A=transpose(unique(A2:A)),D2:D&",",)),,9^9)),"[,\s]+$",""))
}),counta(unique(A2:A)),2)}
I have a table with 4 columns. Variable1, Variable2, Kpi1 and Kpi2.
Variable1 is one level above Variable2 (i.e., Variable1 is the parent of Variable2).
Kpi1 is an integer and Kpi2 is a float [ range (0,1) ].
When making a pivot table, the Variable2 rows look fine, but the total for Variable1 doesn't. Kpi2 can't be aggregated as a simple sum or a simple average of its Variable2 values; it needs to be an average weighted by Kpi1.
To make it clearer, here is an example I made in Excel.
Is there any way I can achieve this?
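You will need to add a helper column to the source table and a calculated field to the pivot table to do this.
New table column: Product = Kpi1 * Kpi2
Calculated field: Weight = Product / Kpi1
Because a calculated field works on the summed values of each field, Weight evaluates to sum(Product) / sum(Kpi1) at every level, which is the weighted average.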
You can add or remove fields from the pivot table once completed
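As a quick sanity check of the arithmetic, here is a small Python sketch of the weighted average, sum(Kpi1 * Kpi2) / sum(Kpi1). The numbers are made up and `weighted_kpi2` is a hypothetical name.

```python
def weighted_kpi2(rows):
    """Return sum(kpi1 * kpi2) / sum(kpi1) for (kpi1, kpi2) rows."""
    total_weight = sum(kpi1 for kpi1, _ in rows)
    if total_weight == 0:
        return None  # no weight, so the average is undefined
    return sum(kpi1 * kpi2 for kpi1, kpi2 in rows) / total_weight

# Hypothetical Variable2 rows under one Variable1 parent: (Kpi1, Kpi2).
rows = [(100, 0.5), (300, 0.9)]
print(weighted_kpi2(rows))
```

For these two rows the simple average of Kpi2 would be 0.7, while the Kpi1-weighted value is about 0.8, which is the kind of gap the parent total shows.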
I have a table with array columns all_available_tags and used_tags.
example row1:
all_available_tags:A,B,C,D
used_tags:A,B
example row2:
all_available_tags:B,C,D,E,F
used_tags:F
I want to get the distinct set of all_available_tags across all rows and subtract the set of all used_tags across all rows. In the example above, all_available_tags across all rows would be A,B,C,D,E,F and all used_tags would be A,B,F, so the end result I am looking for is C,D,E.
I think I need to pivot the table somehow, but there could be hundreds of different tags, so it is not practical to list every one of them. Is there a good way to do this?
You can try:
with tags(at, ut) as
(
select "A,B,C,D", "A,B"
union all
select "B,C,D,E,F", "F"
)
select splitat
from tags
cross join unnest(split(at, ",")) as t1 splitat
except
select splitut
from tags
cross join unnest(split(ut, ",")) as t2 splitut
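The same split, flatten, and except logic can be sketched in Python to check the expected result against the example rows from the question. The function name `unused_tags` is hypothetical, and the tags are assumed to be stored as comma-separated strings, as in the query above.

```python
def unused_tags(rows):
    """Return tags that are available in some row but used in none."""
    available, used = set(), set()
    for all_available_tags, used_tags in rows:
        # Split each comma-separated string and skip empty pieces.
        available.update(t for t in all_available_tags.split(",") if t)
        used.update(t for t in used_tags.split(",") if t)
    return sorted(available - used)

rows = [("A,B,C,D", "A,B"), ("B,C,D,E,F", "F")]
print(unused_tags(rows))  # ['C', 'D', 'E']
```

No pivot is needed: both columns are flattened into sets first, so the number of distinct tags does not matter.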
Are there efficient ways to process data column-wise (vs. row-wise) in Spark?
I'd like to do some whole-database analysis of each column. I'd like to iterate through each column in a database and compare it to another column with a significance test.
colA = "select id, colA from table1"
foreach table, t:
foreach id,colB in t: # "select id, colB from table2"
# align colA,colB by ID
ab = join(colA,colB)
yield comparefunc(ab)
I have ~1M rows but ~10k columns.
Issuing ~10k selects is very slow, but shouldn't I be able to do a select * and broadcast each column to a different node for processing?