Is there a way to perform string operators on elements of a list in KQL? - azure

I'm trying to whitelist a bunch of domains from Azure sentinel rules based on the !hassuffix string operator.
Im trying to do something like this:
AzureDiagnostics
| where destinationDomain !hassuffix ".google.com" and destinationDomain !hassuffix ".azure.com"
But because there is going to be a lot of whitelisted domains and subdomains would like to store the root domain/subdomains in a list which will be in blob storage like:
let whitelist = dyanmic([".google.com", ".azure.com" .........])
Does anyone know the syntax to iterate through each of these and check whether the destinationDomain !hassuffix to each of the dynamic array elements? Or is the only way to have a wall of and's? Thanks

There's no such functionality. You should use matches regex instead.

This is perhaps not the most efficient way to do it, but you could do a cross product type join across the whitelist, perform the !hassuffix check on each, then see the count of how many passed (failed I guess?) the check. For a smallish whitelist and table it should do OK and would be easier to modify / maintain.
let AzureDiagnostics = datatable(destinationDomain: string)
[
"test.google.com",
"test.notallowed.com",
"other.azure.com",
"alsonotallowed.azure2.com"
];
let Whitelist = datatable(allowedSuffix: string, dummy: long)
[
".google.com", 1,
".azure.com", 1,
];
AzureDiagnostics
| extend dummy=1 // add a dummy column for cross product join
| lookup Whitelist on dummy // do cross product (lookup used assuming Whitelist is small)
| where destinationDomain !hassuffix(allowedSuffix) // perform the suffix check
| summarize count() by destinationDomain // since the list was broken up, get the count of passes
| where count_ == toscalar(Whitelist | count) // if the !hassuffix was true for all (the count) keep the result
| project destinationDomain // get rid of the count column

Related

Get a category name from the id in the url using kusto query

I need to get the category name from category id using kusto query.
First i have got the most searched url from the website using the below kusto query and ran it in app insights logs
Requests
| where resultCode==200
| where url contains "bCatID"
| summarize count=sum(itemCount) by url
| sort by count
| take 1
From the above query i got the result like
https://www.test.com/item.aspx?idItem=123456789&bCatID=1282
So for corresponding categoryid=1282 i need to get the category name using kusto
you can use the parse operator.
for example:
print input = 'https://www.test.com/item.aspx?idItem=123456789&bCatID=1282'
| parse input with 'https://www.test.com/item.aspx?idItem=123456789&bCatID='category_id:long
parse_urlquery
print url ="https://www.test.com/item.aspx?idItem=123456789&bCatID=1282"
| extend tolong(parse_urlquery(url)["Query Parameters"]["bCatID"])
url
Query Parameters_bCatID
https://www.test.com/item.aspx?idItem=123456789&bCatID=1282
1282
Fiddle
Feedback to the OP query
Not an answer
KQL is case sensitive. The name of the table in Azure Application Insights is requests (and not Requests).
resultCode is of type string (and not integer/long) and should be compared to "200" (and not 200)
bCatID is a token and therefore can be searched using has or even has_cs, which should be preferred over contains due to performance reasons.
URLs can be used with different parameters. It might make more sense to summarize only by the Host & Path parts + the bCatID query parameter.
count is a reserved word. It can be used as alias only if qualified: ["count"] or ['count'] (or better not used at all).
sort followed by take 1 can be replaced with the more elegant top 1 by ...
requests
| where resultCode == "200"
| project url = parse_url(url), itemCount
| summarize sum(itemCount) by tostring(url.Host), tostring(url.Path), tolong(url["Query Parameters"]["bCatID"])
| top 1 by sum_itemCount

django filter with related name [duplicate]

What is the difference between filter with multiple arguments and chain filter in django?
As you can see in the generated SQL statements the difference is not the "OR" as some may suspect. It is how the WHERE and JOIN is placed.
Example1 (same joined table): from https://docs.djangoproject.com/en/dev/topics/db/queries/#spanning-multi-valued-relationships
Blog.objects.filter(
entry__headline__contains='Lennon',
entry__pub_date__year=2008)
This will give you all the Blogs that have one entry with both (entry__headline__contains='Lennon') AND (entry__pub_date__year=2008), which is what you would expect from this query.
Result:
Blog with {entry.headline: 'Life of Lennon', entry.pub_date: '2008'}
Example 2 (chained)
Blog.objects.filter(
entry__headline__contains='Lennon'
).filter(
entry__pub_date__year=2008)
This will cover all the results from Example 1, but it will generate slightly more result. Because it first filters all the blogs with (entry__headline__contains='Lennon') and then from the result filters (entry__pub_date__year=2008).
The difference is that it will also give you results like:
A single Blog with multiple entries
{entry.headline: '**Lennon**', entry.pub_date: 2000},
{entry.headline: 'Bill', entry.pub_date: **2008**}
When the first filter was evaluated the book is included because of the first entry (even though it has other entries that don't match). When the second filter is evaluated the book is included because of the second entry.
One table: But if the query doesn't involve joined tables like the example from Yuji and DTing. The result is same.
The case in which results of "multiple arguments filter-query" is different than "chained-filter-query", following:
Selecting referenced objects on the basis of referencing objects and relationship is one-to-many (or many-to-many).
Multiple filters:
Referenced.filter(referencing1_a=x, referencing1_b=y)
# same referencing model ^^ ^^
Chained filters:
Referenced.filter(referencing1_a=x).filter(referencing1_b=y)
Both queries can output different result:
If more then one
rows in referencing-modelReferencing1can refer to same row in
referenced-modelReferenced. This can be the case in Referenced:
Referencing1 have either 1:N (one to many) or N:M (many to many)
relation-ship.
Example:
Consider my application my_company has two models Employee and Dependent. An employee in my_company can have more than dependents(in other-words a dependent can be son/daughter of a single employee, while a employee can have more than one son/daughter).
Ehh, assuming like husband-wife both can't work in a my_company. I took 1:m example
So, Employee is referenced-model that can be referenced by more then Dependent that is referencing-model. Now consider relation-state as follows:
Employee: Dependent:
+------+ +------+--------+-------------+--------------+
| name | | name | E-name | school_mark | college_mark |
+------+ +------+--------+-------------+--------------+
| A | | a1 | A | 79 | 81 |
| B | | b1 | B | 80 | 60 |
+------+ | b2 | B | 68 | 86 |
+------+--------+-------------+--------------+
Dependenta1refers to employeeA, and dependentb1, b2references to employeeB.
Now my query is:
Find all employees those having son/daughter has distinction marks (say >= 75%) in both college and school?
>>> Employee.objects.filter(dependent__school_mark__gte=75,
... dependent__college_mark__gte=75)
[<Employee: A>]
Output is 'A' dependent 'a1' has distinction marks in both college and school is dependent on employee 'A'. Note 'B' is not selected because nether of 'B''s child has distinction marks in both college and school. Relational algebra:
Employee ⋈(school_mark >=75 AND college_mark>=75)Dependent
In Second, case I need a query:
Find all employees whose some of dependents has distinction marks in college and school?
>>> Employee.objects.filter(
... dependent__school_mark__gte=75
... ).filter(
... dependent__college_mark__gte=75)
[<Employee: A>, <Employee: B>]
This time 'B' also selected because 'B' has two children (more than one!), one has distinction mark in school 'b1' and other is has distinction mark in college 'b2'.
Order of filter doesn't matter we can also write above query as:
>>> Employee.objects.filter(
... dependent__college_mark__gte=75
... ).filter(
... dependent__school_mark__gte=75)
[<Employee: A>, <Employee: B>]
result is same! Relational algebra can be:
(Employee ⋈(school_mark >=75)Dependent) ⋈(college_mark>=75)Dependent
Note following:
dq1 = Dependent.objects.filter(college_mark__gte=75, school_mark__gte=75)
dq2 = Dependent.objects.filter(college_mark__gte=75).filter(school_mark__gte=75)
Outputs same result: [<Dependent: a1>]
I check target SQL query generated by Django using print qd1.query and print qd2.query both are same(Django 1.6).
But semantically both are different to me. first looks like simple section σ[school_mark >= 75 AND college_mark >= 75](Dependent) and second like slow nested query: σ[school_mark >= 75](σ[college_mark >= 75](Dependent)).
If one need Code #codepad
btw, it is given in documentation #Spanning multi-valued relationships I have just added an example, I think it will be helpful for someone new.
Most of the time, there is only one possible set of results for a query.
The use for chaining filters comes when you are dealing with m2m:
Consider this:
# will return all Model with m2m field 1
Model.objects.filter(m2m_field=1)
# will return Model with both 1 AND 2
Model.objects.filter(m2m_field=1).filter(m2m_field=2)
# this will NOT work
Model.objects.filter(Q(m2m_field=1) & Q(m2m_field=2))
Other examples are welcome.
This answer is based on Django 3.1.
Environment
Models
class Blog(models.Model):
blog_id = models.CharField()
class Post(models.Model):
blog_id = models.ForeignKeyField(Blog)
title = models.CharField()
pub_year = models.CharField() # Don't use CharField for date in production =]
Database tables
Filters call
Blog.objects.filter(post__title="Title A", post__pub_year="2020")
# Result: <QuerySet [<Blog: 1>]>
Blog.objects.filter(post__title="Title A").filter(post_pub_date="2020")
# Result: <QuerySet [<Blog: 1>, [<Blog: 2>]>
Explanation
Before I start anything further, I have to notice that this answer is based on the situation that uses "ManyToManyField" or a reverse "ForeignKey" to filter objects.
If you are using the same table or an "OneToOneField" to filter objects, then there will be no difference between using a "Multiple Arguments Filter" or "Filter-chain". They both will work like a "AND" condition filter.
The straightforward way to understand how to use "Multiple Arguments Filter" and "Filter-chain" is to remember in a "ManyToManyField" or a reverse "ForeignKey" filter, "Multiple Arguments Filter" is an "AND" condition and "Filter-chain" is an "OR" condition.
The reason that makes "Multiple Arguments Filter" and "Filter-chain" so different is that they fetch results from different join tables and use different conditions in the query statement.
"Multiple Arguments Filter" use "Post"."Public_Year" = '2020' to identify the public year
SELECT *
FROM "Book"
INNER JOIN ("Post" ON "Book"."id" = "Post"."book_id")
WHERE "Post"."Title" = 'Title A'
AND "Post"."Public_Year" = '2020'
"Filter-chain" database query use "T1"."Public_Year" = '2020' to identify the public year
SELECT *
FROM "Book"
INNER JOIN "Post" ON ("Book"."id" = "Post"."book_id")
INNER JOIN "Post" T1 ON ("Book"."id" = "T1"."book_id")
WHERE "Post"."Title" = 'Title A'
AND "T1"."Public_Year" = '2020'
But why do different conditions impact the result?
I believe most of us who come to this page, including me =], have the same assumption while using "Multiple Arguments Filter" and "Filter-chain" at first.
Which we believe the result should be fetched from a table like following one which is correct for "Multiple Arguments Filter". So if you are using the "Multiple Arguments Filter", you will get a result as your expectation.
But while dealing with the "Filter-chain", Django creates a different query statement which changes the above table to the following one. Also, the "Public Year" is identified under the "T1" section instead of the "Post" section because of the query statement change.
But where does this weird "Filter-chain" join table diagram come from?
I'm not a database expert. The explanation below is what I understand so far after I created the exact structure of the database and made a test with the same query statement.
The following diagram will show where this weird "Filter-chain" join table diagram comes from.
The database will first create a join table by matching the row of the "Blog" and "Post" tables one by one.
After that, the database now does the same matching process again but uses the step 1 result table to match the "T1" table which is just the same "Post" table.
And this is where this weird "Filter-chain" join table diagram comes from.
Conclusion
So two things make "Multiple Arguments Filter" and "Filter-chain" different.
Django create different query statements for "Multiple Arguments Filter" and "Filter-chain" which make "Multiple Arguments Filter" and "Filter-chain" result come from other tables.
"Filter-chain" query statement identifies a condition from a different place than "Multiple Arguments Filter".
The dirty way to remember how to use it is "Multiple Arguments Filter" is an "AND" condition and "Filter-chain" is an "OR" condition while in a "ManyToManyField" or a reverse "ForeignKey" filter.
The performance difference is huge. Try it and see.
Model.objects.filter(condition_a).filter(condition_b).filter(condition_c)
is surprisingly slow compared to
Model.objects.filter(condition_a, condition_b, condition_c)
As mentioned in Effective Django ORM,
QuerySets maintain state in memory
Chaining triggers cloning, duplicating that state
Unfortunately, QuerySets maintain a lot of state
If possible, don’t chain more than one filter
You can use the connection module to see the raw sql queries to compare. As explained by Yuji's, for the most part they are equivalent as shown here:
>>> from django.db import connection
>>> samples1 = Unit.objects.filter(color="orange", volume=None)
>>> samples2 = Unit.objects.filter(color="orange").filter(volume=None)
>>> list(samples1)
[]
>>> list(samples2)
[]
>>> for q in connection.queries:
... print q['sql']
...
SELECT `samples_unit`.`id`, `samples_unit`.`color`, `samples_unit`.`volume` FROM `samples_unit` WHERE (`samples_unit`.`color` = orange AND `samples_unit`.`volume` IS NULL)
SELECT `samples_unit`.`id`, `samples_unit`.`color`, `samples_unit`.`volume` FROM `samples_unit` WHERE (`samples_unit`.`color` = orange AND `samples_unit`.`volume` IS NULL)
>>>
If you end up on this page looking for how to dynamically build up a django queryset with multiple chaining filters, but you need the filters to be of the AND type instead of OR, consider using Q objects.
An example:
# First filter by type.
filters = None
if param in CARS:
objects = app.models.Car.objects
filters = Q(tire=param)
elif param in PLANES:
objects = app.models.Plane.objects
filters = Q(wing=param)
# Now filter by location.
if location == 'France':
filters = filters & Q(quay=location)
elif location == 'England':
filters = filters & Q(harbor=location)
# Finally, generate the actual queryset
queryset = objects.filter(filters)
If requires a and b then
and_query_set = Model.objects.filter(a=a, b=b)
if requires a as well as b then
chaied_query_set = Model.objects.filter(a=a).filter(b=b)
Official Documents:
https://docs.djangoproject.com/en/dev/topics/db/queries/#spanning-multi-valued-relationships
Related Post: Chaining multiple filter() in Django, is this a bug?
There is a difference when you have request to your related object,
for example
class Book(models.Model):
author = models.ForeignKey(Author)
name = models.ForeignKey(Region)
class Author(models.Model):
name = models.ForeignKey(Region)
request
Author.objects.filter(book_name='name1',book_name='name2')
returns empty set
and request
Author.objects.filter(book_name='name1').filter(book_name='name2')
returns authors that have books with both 'name1' and 'name2'
for details look at
https://docs.djangoproject.com/en/dev/topics/db/queries/#s-spanning-multi-valued-relationships

Kusto query with filter depending on dashboard parameter

I want to be able to toggle a filter on my query via a parameter in dashboard. How can I turn the "where" operator off?
E.g. the parameter in the dashboard is "_toggle"
let _filter = dynamic(["A", "B"]);
Table
| where id in (_filter) // execute this line only if _toggle == true
| project id
I already tried creating a second list containing all the ids and toggle between the small an the complete list via iff() but this is too resource intensive.
you could try something like this:
let _filter = dynamic(["A", "B"]);
Table
| where _toggle == false or id in (_filter)
| project id

Is it possible to get query results in an App Insights inline function?

I am trying to write an App Insights query that will report back the timespan between two known events, specifically circuit breaker open and close events. The assumption is that these events always occur in pairs, so we need to know the time between the two for every occurrence in a time period.
My first attempt was to use an inline function. Simplified version below.
let timeOpened = (timeClosed:datetime)
{
let result = customEvents
| where name == 'CircuitBreakerStatusChange'
| where customDimensions['State'] == 'Open'
| where timestamp < timeClosed
| order by timestamp desc
| take 1
| project timestamp;
let scalar = toscalar(result);
scalar
};
customEvents
| where timestamp > ago(4h)
| where name == 'CircuitBreakerStatusChange'
| where customDimensions['State'] == 'Closed'
| extend timeOpen = timestamp - timeOpened(timestamp)
There may be a better way to do this. If so your ideas are welcome! But in this particular attempt the only feedback I get from Azure when running this is "Syntax error". However, I don't believe there's a syntax error here because if I just change the return value of the function from scalar to now() it runs successfully. Also I can run the body of the function in isolation successfully. Any idea what's wrong here?
I think you are getting syntax error because query language does not allow possibly recursive constructs. Now() worked because it was statically (not dynamically) retrieved at the query time.
I think you may achieve the desired outcome with serialize and prev() operators:
Table | order by timestamp asc | serialize
| extend previousTime = prev(timestamp,1)
| extend Diff = iff(customDimensions['State'] == 'Closed', timestamp - previousTime, 0)
| where Diff > 0
Note: I haven't tested the example above and it may need some additional thought to make it work (e.g. making sure that the previous record is actually "Opened" before doing previousTime calculation).

How can I get the Scenario Outline Examples' table?

In AfterScenario method, I want to get the rows from table "Examples" in Scenario Outline, to get the values in it, and search that specific values in the database
I know that this can be achieved by using Context.Scenario.Current...
Context.Scenario.Current[key]=value;
...but for some reason I'd like to be able to get it in a simpler way
like this:
ScenarioContext.Current.Examples();
----------- SCENARIO --------------------------------
Scenario Outline: Create a Matter
Given I create matter "< matterName >"
Examples:
| matterName |
| TAXABLE |
----------AFTER SCENARIO -----------------------------------
[AfterScenario()]
public void After()
{
string table = ScenarioContext.Current.Examples();
}
So if you look at the code for ScenarioContext you can see it inherits from SpecflowContext which is itself a Dictionary<string, object>. This means that you can simply use Values to get the collection of values, but I have no idea if they are Examples or not.
The best solution I came up with was to infer the examples by keeping my own static singleton object, then counting how many times the same scenario ran.
MyContext.Current.Counts[ScenarioContext.Current.ScenarioInfo.Title]++;
Of course, this doesn't work very well if you don't run all the tests at the same time or run them in random order. Having a table with the examples themselves would be more ideal, but if you combine my technique along with using ScenarioStepContext you could extract the parameters of the Examples table out of the rendered step definition text itself.
Feature
Scenario Outline: The system shall do something!
Given some input <input>
When something happens
Then something should have happened
Examples:
| input |
| 1 |
| 2 |
| 3 |
SpecFlow Hook
[BeforeStep]
public void BeforeStep()
{
var text = ScenarioStepContext.Current.StepInfo.Text;
var stepType = ScenarioStepContext.Current.StepInfo.StepDefinitionType;
if (text.StartsWith("some input ") && stepType == StepDefinitionType.Given)
{
var input = text.Split(' ').Last();
}
}

Resources