How to "Group By" by result and count in Azure App Insights

I'm trying to group some results I have in App Insights and am struggling.
If I were to tabulate my results, they would look like this:
Product Version
A 1
B 2
A 2
A 1
B 3
B 3
As you can see, I have 2 products (A and B), and each has a version number.
I am trying to group these and provide a count, so my end result is
Product Version Count
A 1 2
A 2 1
B 2 1
B 3 2
At the moment, my approach is a mess because I am doing this manually with
customEvents
| summarise A1 = count(customEvents.['payload.prod'] == "A" and myEvents.['payload.vers'] == "1"),
| summarise A2 = count(customEvents.['payload.prod'] == "A" and myEvents.['payload.vers'] == "2")
I have no idea how to aggregate these so that they are grouped by product and version, with a count of the occurrences of each.

I think you are looking for:
customEvents
| extend Product = tostring(customDimensions.prod)
| extend MajorVersion = tostring(split(tostring(customDimensions.vers), ".")[0])
| summarize Count = count() by Product, MajorVersion
I wrote this off the top of my head, so there might be some syntax issues. I assumed prod and vers are in customDimensions; let me know if it is otherwise.
You can summarize by multiple fields as you can see.
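To check the expected output outside App Insights, the same "summarize count() by two fields" logic can be sketched in Python. This is only an illustration: the `events` list and the `count_by_product_version` helper are hypothetical names, and the rows simply mirror the sample table from the question.

```python
from collections import Counter

def count_by_product_version(events):
    """Count how often each (product, version) pair occurs."""
    return Counter((e["prod"], e["vers"]) for e in events)

# Hypothetical in-memory stand-in for the customEvents rows in the question.
events = [
    {"prod": "A", "vers": "1"},
    {"prod": "B", "vers": "2"},
    {"prod": "A", "vers": "2"},
    {"prod": "A", "vers": "1"},
    {"prod": "B", "vers": "3"},
    {"prod": "B", "vers": "3"},
]

# Print one line per group, sorted by product and then version.
for (product, version), count in sorted(count_by_product_version(events).items()):
    print(product, version, count)
```

Grouping on a tuple key plays the same role as listing several fields after `by` in the query.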

Related

VLOOKUP with criterion of max

Let's say I have a Table1 as follows:
ID | Value
________________
1 | 0
2 | 0
1 | 1
3 | 1
1 | 0
2 | 0
1 | 0
2 | 0
3 | 0
4 | 1
1 | 0
5 | 0
and I have a second table that contains unique IDs from Table1.
In Table1 ID may repeat, but each ID can have at most one 1 in Value column, the rest is 0.
How can I write a VLOOKUP-like formula that will tell me if a given ID has a 1 in any occurrence?
I would like to get something like
ID | Value
________________
1 | 1
2 | 0
3 | 1
4 | 1
5 | 0
With SQL I would write something like SELECT ID, max(Value) from Table1 group by ID, or even use sum instead of max.
Also to mention: Table1 will be in separate file from my output table and the Value will be just one of many columns, therefore I cannot use Pivot Tables
I think the solution is easier than you might think:
=SUMIFS(B$2:B$13,A$2:A$13,1)
What are you doing? You are summing everything? I just want to know where the 1 is, no need to sum it?
Well: you seem to have two possible values: either all 0's, either all 0's and just one 1: if you search for that 1, or if you take the sum, the result is the same :-)
Ok, that's a neat trick, but what if I decide there might be more than one 1?
Well: just translate a number, larger than 1, to 1, which you can do with this formula:
=IF(E2,1,0)
There are several ways to go about it, and I'm assuming that your values are more complicated than your example, so here is one way:
=MAX(IF(A$2:A$13=E3,B$2:B$13))
Where A2:A13 is your IDs, B2:B13 is the value, and E3 is the start of your reference table. This is an array formula and needs to be confirmed with CTRL+SHIFT+ENTER
If it's as simple as 1 or 0, you should use the answer that @dominique gave.
Try the following formula:
=HSTACK(UNIQUE(A2:A13),MAXIFS(B2:B13,A2:A13,UNIQUE(A2:A13)))
This will work like SQL. It will also work if you have more values than one.
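As a cross-check on what these formulas should return, the "max per ID" logic from the SQL in the question can be sketched in Python. The `table1` list and the `max_value_per_id` helper are hypothetical names used only for illustration; the data is the sample from the question.

```python
def max_value_per_id(rows):
    """Return {id: max(value)}, like SELECT ID, max(Value) ... GROUP BY ID."""
    result = {}
    for row_id, value in rows:
        # Keep the largest value seen so far for this ID.
        result[row_id] = max(value, result.get(row_id, value))
    return result

# (ID, Value) pairs copied from Table1 in the question.
table1 = [
    (1, 0), (2, 0), (1, 1), (3, 1), (1, 0), (2, 0),
    (1, 0), (2, 0), (3, 0), (4, 1), (1, 0), (5, 0),
]

for row_id, value in sorted(max_value_per_id(table1).items()):
    print(row_id, value)
```

Because it takes the maximum rather than the sum, this also covers the case where an ID has more than one 1.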

How to sum values in excel only if they match a category id in a lookup

Say I have a product category lookup like so:
Sheet 1
Product Name | Product Category
--------------------------------------
product 1 | A
product 2 | A
product 3 | B
product 4 | A
product 5 | B
product 6 | C
and I also have a list of purchases which only use Product Name like this:
Sheet 2
Product Name | Purchase Quantity
---------------------------------------
product 1 | 35
product 4 | 10
product 5 | 5
I would like to produce a rollup like this:
Product Category | Purchase Quantity
------------------------------------------
A | 45
B | 5
C | 0
I've tried a variety of ways to solve this like:
SUMIF(LOOKUP('Sheet 2'!A2:A6,'Sheet 1'!A:A,'Sheet 2'!B:B), "=A", 'Sheet 2'!B2:B6)
SUMPRODUCT(LOOKUP('Sheet 2'!A2:A6, 'Sheet 1'!A:A, 'Sheet 2'!B:B)="A"*'Sheet 2'!B2:B6)
Excel doesn't like the first one. It says the formula is incorrect, but I'm not seeing why. The second one yields #VALUE. Any help on this would be much appreciated. Thanks in advance!
With A in D2, use this as an array formula.
=SUM(SUMIFS(Sheet2!B$2:B$4, Sheet2!A$2:A$4, IF(Sheet1!B$2:B$7=D2, Sheet1!A$2:A$7)))
Array formulas need to be finished with Ctrl+Shift+Enter, not just Enter.
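The rollup that the array formula computes can be sketched in Python to make the logic explicit: look up each purchase's category, then add its quantity to that category's total. The `categories` and `purchases` structures and the `rollup_by_category` helper are hypothetical names; the data is taken from the two sheets in the question.

```python
def rollup_by_category(categories, purchases):
    """Sum purchase quantities per product category."""
    # Start every known category at 0 so categories without purchases still appear.
    totals = {category: 0 for category in categories.values()}
    for product, quantity in purchases:
        totals[categories[product]] += quantity
    return totals

# Sheet 1: product name -> product category.
categories = {
    "product 1": "A", "product 2": "A", "product 3": "B",
    "product 4": "A", "product 5": "B", "product 6": "C",
}
# Sheet 2: (product name, purchase quantity).
purchases = [("product 1", 35), ("product 4", 10), ("product 5", 5)]

for category, total in sorted(rollup_by_category(categories, purchases).items()):
    print(category, total)
```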
In my opinion there is no need to use complex formulas for such an easy question. Just add another column next to Purchase Quantity on Sheet 2 to get the Product Category, and then simply use =SUMIF. I have prepared a solution to illustrate my thoughts:
Formula for VLOOKUP:
=VLOOKUP(D2,$A$2:$B$7,2,FALSE)
Formula for SUMIF:
=SUMIF($F$2:$F$4,"=" & A10,$E$2:$E$4)

Counting distinct elements from strings PostgreSQL

I am trying to count the elements contained in a string in the following way:
row features
1 'a | b | c'
2 'a | c'
3 'b | c | d'
4 'a'
Result:
feature count
a 3
b 2
c 3
d 1
I have already found a solution by finding the highest number of features and separating each feature into its own column, so feature1 = content of feature 1 in the string, but then I have to manually aggregate the data. There must be a smarter way to get the result shown in my example.
By normalizing the data by using unnest() this turns into a simple group by
select trim(f.feature), count(*)
from the_table t
cross join lateral unnest(string_to_array(t.features, '|')) as f(feature)
group by trim(f.feature)
order by 1;
I used regexp_split_to_table:
SELECT regexp_split_to_table(feature, E' \\| ') AS k, count(*)
FROM tab1
GROUP BY k
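Both queries follow the same idea: split each string into one row per feature, trim the whitespace, then group and count. A minimal Python sketch of that idea, using the sample rows from the question (`rows` and `count_features` are hypothetical names):

```python
from collections import Counter

def count_features(rows, sep="|"):
    """Split each feature string on sep, trim whitespace, and count each feature."""
    counts = Counter()
    for features in rows:
        # Skip empty fragments so stray separators do not create a blank feature.
        counts.update(part.strip() for part in features.split(sep) if part.strip())
    return counts

# Feature strings from the question, one per table row.
rows = ["a | b | c", "a | c", "b | c | d", "a"]

for feature, count in sorted(count_features(rows).items()):
    print(feature, count)
```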

Counting the number of older siblings in an Excel spreadsheet

I have a longitudinal spreadsheet of adolescent growth.
ID | CollectionDate | DOB | MOTHER ID | Sex
1 | 1Aug03 | 3Apr90 | 12 | 1
1 | 4Sept04 | 3Apr90 | 12 | 1
1 | 1Sept05 | 3Apr90 | 12 | 1
2 | 1Aug03 | 21Dec91 | 12 | 0
2 | 4Sept04 | 21Dec91 | 12 | 0
2 | 1Sept05 | 21Dec91 | 12 | 0
3 | 1Aug03 | 30Jan89 | 23 | 0
3 | 4Sept04 | 30Jan89 | 23 | 0
This is a sample of how my data is formatted and some of the variables that I have. As you can see, since it is longitudinal, each individual has multiple measurements. In the actual database there are over 10 measurements per individual and over 250 individuals.
What I am wanting to do is input a value signifying the number of older brothers and older sisters each individual has. That is why I have included the Mother ID (because it represents genetic relatedness) and sex. These new variable columns would just say how many older siblings of each sex each individual has. Is there a formula that I could use to do this quickly?
=COUNTIFS($B:$B,"<>"&$B2,$H:$H,$H2,$AI:$AI,$AI2,$J:$J,"<"&$J2)
Create a column named Distinct with this formula
=1/COUNTIF([ID],[@ID])
Then you can find all the older siblings with Sex = 0 like this:
=SUMPRODUCT(([DOB]<[@DOB])*([MOTHERID]=[@MOTHERID])*([Sex]=0)*([Distinct]))
Note that I made the data a Table and used table notation. If you're not familiar with it, [COLUMNNAME] refers to the whole column and [@COLUMNNAME] refers to the value in that column on the current row. It's similar to saying $A:$A and A2 if you're dealing with column A.
The first formula gives you a value to count that will always sum to 1 for a particular ID. So ID=1 has three lines and Distinct will be 0.33333 for each line. When you add up the three lines you get 1. This is similar to a SELECT DISTINCT in SQL parlance.
The SUMPRODUCT formula sums [Distinct] for every row where the DOB is earlier than the current DOB, the Mother is the same as the current Mother, and the Sex is zero.
I have a possible solution. It involves adding two columns -- One for "# older siblings" and one for "unique?". So here are all the headings I have currently:
A -- ID
B -- CollectionDate
C -- DOB
D -- MOTHER ID
E -- Sex
F -- # older siblings
G -- unique?
In G2, I added the following formula:
=IF(A2=A1,0,1)
And dragged down. As long as the data is sorted by ID, this will only display "1" once for each unique person.
In F2, I added the following formula:
=COUNTIFS(G:G,"=1",D:D,"="&D2,C:C,"<"&C2)
And dragged down. It seemed to work correctly for the sample data you provided.
The stipulations are:
You would need the two columns.
The data would need to be sorted by ID
I hope this helps.
You need a formula like this (for example, for row 2):
=COUNTIFS($A:$A,"<>"&$A2,$E:$E,$E2,$D:$D,$D2,$C:$C,"<"&$C2)
Assuming E:E is column for sex, D:D is column for mother ID and C:C is column for DOB.
Write this formula in H2 cell for example and drag it down.
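The counting logic behind these formulas can be sketched in Python: collapse the repeated measurements to one record per individual, then count the other individuals with the same mother and an earlier DOB, split by sex code. The `rows` sample and the `older_sibling_counts` helper are hypothetical names, the dates are the question's sample DOBs written out as full dates, and the sex codes 0 and 1 are kept as-is because the question does not say which is which.

```python
from datetime import date

def older_sibling_counts(rows):
    """rows: (id, dob, mother_id, sex) tuples, possibly repeated per individual.

    Returns {id: {0: n_older_with_sex_0, 1: n_older_with_sex_1}}.
    """
    # One record per individual, regardless of how many measurements exist.
    people = {pid: (dob, mother, sex) for pid, dob, mother, sex in rows}
    counts = {}
    for pid, (dob, mother, _sex) in people.items():
        # Sex codes of all other individuals with the same mother and an earlier DOB.
        older = [s for other, (d, m, s) in people.items()
                 if other != pid and m == mother and d < dob]
        counts[pid] = {0: older.count(0), 1: older.count(1)}
    return counts

# Sample individuals from the question; ID 1 is repeated to mimic the
# longitudinal layout with several measurements per person.
rows = [
    (1, date(1990, 4, 3), 12, 1),
    (1, date(1990, 4, 3), 12, 1),
    (2, date(1991, 12, 21), 12, 0),
    (3, date(1989, 1, 30), 23, 0),
]

for pid, by_sex in sorted(older_sibling_counts(rows).items()):
    print(pid, by_sex)
```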

Within Table Subquery of Identical Combinations

I would like to select groups that have the exact same attributes from a table. For example, my table is like the following
facs_run_id | fcj_id
1 | 17
1 | 4
1 | 12
2 | 17
2 | 4
2 | 12
3 | 17
3 | 12
3 | 10
In this table each facs_run_id has a different combination of fcj_id values; some are shared between facs_run_id numbers while others are not. For example, above, facs_run_id 1 and 2 are identical, while 3 shares some fcj_id values but is not identical to 1 and 2. I would like to make a query to:
gather all fcj_id from a particular facs_run_id
find all facs_run_id that have the exact same fcj_id combination.
Herein, I want to find all facs_run_id that are equal in fcj_id combinations to facs_run_id: 1, so it should return 2 (or 1 & 2).
I can get those that are missing certain fcj_id and even find which fcj_id are missing with this:
SELECT facs_run_id
FROM facs_panel
EXCEPT
SELECT fcj_id
FROM facs_panel
WHERE facs_run_id = 2;
or this:
SELECT row(fp.*, fcj.fcj_antigen, fcj.fcj_color)
FROM facs_panel fp
LEFT OUTER JOIN facs_conjugate_lookup fcj ON fcj.fcj_id = fp.fcj_id
WHERE fp.fcj_id in ( SELECT fp.fcj_id
FROM facs_panel fp
WHERE fp.facs_run_id = 1);
But I am not able to make a query that returns IDENTICAL facs_run_id. I suppose this could be considered a way of looking for aggregated duplicates, but I don't know how to do that. Any suggestions or pointers would be greatly appreciated (or a better way to create the table if this type of query will not work).
It's pretty easy with a couple CTEs:
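with f1 as (
    -- one sorted array of fcj_id per run, so equal sets compare equal
    select facs_run_id,
           array_agg(fcj_id order by fcj_id) as flist
    from facs_panel
    group by facs_run_id
),
f2 as (
    -- keep only combinations shared by more than one run
    select flist, count(*)
    from f1
    group by flist
    having count(*) > 1
)
select f2.flist, f1.facs_run_id
from f2
join f1 on (f2.flist = f1.flist)
order by flist, facs_run_id;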
The data from the question, run through this query, produces:
flist | facs_run_id
-----------+-------------
{4,12,17} | 1
{4,12,17} | 2
(2 rows)
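The same idea, collecting each run's fcj_id values into an order-independent set and then grouping runs by that set, can be sketched in Python. The `rows` list and the `identical_runs` helper are hypothetical names; the data is the sample table from the question.

```python
from collections import defaultdict

def identical_runs(rows):
    """Group facs_run_id values whose sets of fcj_id are exactly equal."""
    # Collect the set of fcj_id for each run.
    fcj_by_run = defaultdict(set)
    for run_id, fcj_id in rows:
        fcj_by_run[run_id].add(fcj_id)
    # Group runs by their (hashable, order-independent) combination.
    runs_by_combo = defaultdict(list)
    for run_id, fcj_ids in fcj_by_run.items():
        runs_by_combo[frozenset(fcj_ids)].append(run_id)
    # Keep only combinations shared by more than one run.
    return {combo: sorted(runs) for combo, runs in runs_by_combo.items() if len(runs) > 1}

# (facs_run_id, fcj_id) pairs from the question.
rows = [
    (1, 17), (1, 4), (1, 12),
    (2, 17), (2, 4), (2, 12),
    (3, 17), (3, 12), (3, 10),
]

for combo, runs in identical_runs(rows).items():
    print(sorted(combo), runs)
```

Using a frozenset as the key is the counterpart of sorting the aggregated array in SQL: both make the comparison independent of row order.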

Resources