Execute Kusto query present in the Table result - azure

I have a table that has many other columns such as username, hostname, etc. One of the columns also stores a certain Query.
UserQueryTable
Username | Hostname | CustomQuery
Sam | xyz | some_query_1
David | abc | some_query_2
Rock | mno | some_query_3
Well | stu | some_query_4
When I run a KQL query such as:
UserQueryTable | where Username == "Sam"
I get:
Username | Hostname | CustomQuery
Sam | xyz | some_query_1
Note the some_query_1 value under CustomQuery. It is an actual KQL query stored in the table. I want to retrieve it and execute it right after running UserQueryTable | where Username == "Sam".
That CustomQuery query gives me additional information about my alert, so I need to get the query string from the table and execute it.
The CustomQuery in the table looks something like this:
let alertedEvent = datatable(compressedRec: string)
[' -----redacted----7ziphQG4Di05dfsdfdsgdgS6uThq4H5fclBccCH6wW8M//sdfgty==']
| extend raw = todynamic(zlib_decompress_from_base64_string(compressedRec)) | evaluate bag_unpack(raw) | project-away compressedRec;
alertedEvent
So basically, the first query returns a result in which one of the columns itself contains queries, and I want to be able to run those returned queries.
Query_ refers to the CustomQuery column.
I tried using user-defined functions but have not been able to come up with something that works. Please help!

AFAIK, for security reasons, you can't do that.
To accomplish what you want, you would need to write a client app that gets the results of the first query and then runs the returned queries.
For reference, you can't even reference table names "dynamically" in KQL.
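The client-app approach could look roughly like the sketch below. It assumes a caller-supplied run_query function (hypothetical, for example a thin wrapper around whichever Kusto client library you use) that takes query text and returns the result rows as dicts. Neither run_query nor run_stored_queries is part of any Kusto SDK.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
# Hypothetical signature: takes KQL text, returns the result rows as dicts.
QueryRunner = Callable[[str], List[Row]]


def run_stored_queries(run_query: QueryRunner, username: str) -> Dict[str, List[Row]]:
    """Run the lookup query, then execute each CustomQuery it returns."""
    # Escape backslashes and double quotes so the name stays inside the KQL literal.
    safe_name = username.replace("\\", "\\\\").replace('"', '\\"')
    first = run_query(f'UserQueryTable | where Username == "{safe_name}"')

    results: Dict[str, List[Row]] = {}
    for row in first:
        stored = row.get("CustomQuery")
        if isinstance(stored, str) and stored.strip():
            # Second round trip: the client, not Kusto, executes the stored text.
            results[stored] = run_query(stored)
    return results
```

Because the stored text is executed with the caller's permissions, this is only reasonable when the contents of UserQueryTable are trusted.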

If I understand your question correctly, you have a query that returns a list of queries, and you would like to find out which queries in that set were actually run.
In that case, you can:
Use the .show queries command, which returns the list of queries that were executed on your cluster (read more here). Note that .show queries returns the queries you ran or, if you have database admin permissions, the queries anyone ran on the database.
Enable diagnostic settings on your cluster and send the Query logs (read more here). This sends all queries executed on your cluster to a Log Analytics workspace of your choosing.
You can then use either of these options and join with your table to figure out which queries were actually executed. For instance, using the first option:
.show queries | join datatable (Query_: string)
[
"Table | where somecol contains 1",
"Table | where somecol contains 2"
] on $left.Text == $right.Query_

There is a better solution than running the query: you can extract the compressed text with a regex and decompress it in the same query.
Table
| extend Compressed = extract(@"\['([^;]+)']", 1, <CompressedTextQuery>)
| extend raw = todynamic(zlib_decompress_from_base64_string(Compressed))
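The same extract-then-decompress idea can be checked outside Kusto. The sketch below is a client-side equivalent in Python, assuming the payload is a standard zlib stream that holds a JSON object; the function name decode_custom_query is made up for illustration.

```python
import base64
import json
import re
import zlib

# Same pattern as the KQL extract() call: capture the text between [' and '].
PAYLOAD_RE = re.compile(r"\['([^;]+)'\]")


def decode_custom_query(custom_query: str) -> dict:
    """Pull the base64 payload out of the stored query text and decompress it."""
    match = PAYLOAD_RE.search(custom_query)
    if match is None:
        raise ValueError("no ['...'] payload found in query text")
    compressed = base64.b64decode(match.group(1).strip())
    # Assumption: a standard zlib stream containing a JSON object.
    return json.loads(zlib.decompress(compressed))
```

This reads the alert data straight from the stored text, so the stored query never has to be executed.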

Related

How to stuff a result of a query into a variable and use it another query in a logic app

I haven't used Logic Apps a lot. My boss is having trouble storing the results of one query in a variable and then using that variable in another query.
Basically, all he wants to do is get a list of IDs returned from the first query and use that list in the second.
Here is a picture of what his logic app looks like:
You can see at the end of the second query he wants to check if the id is in the list or not. He's out for the day and I'm not sure if that variable is even receiving the list of id's successfully, but is there anything from the picture that you can tell that needs to be corrected? Or any suggestions that he could try, to achieve what he's trying to achieve?
According to the image, no data is being stored in the variable AppId. In the query, though, you can use c.EntityId directly. The query below checks whether c.id is present in c.EntityId.
SELECT c.Vechicle.GrossVechicleWeight as GVW, c.EntityId as ApplicationId FROM c where c.RiskTypeId = 1 and c.Discriminator = 'RiskEntity' and c.EntityTypeId = 4500 and c.id in (c.EntityId)
If you are instead trying to store c.EntityId in the AppId variable, you can run the query SELECT c.EntityId FROM c, extract just c.EntityId with Parse JSON, and store the result in the variable using the Append to array variable action.
Here is my logic app
RESULT:

Azure Data Factory Error: "incorrect syntax near"

I'm trying to do a simple incremental update from an on-prem database as source to Azure SQL database based on a varchar column called "RP" in On-Prem database that contains "date+staticdescription" for example: "20210314MetroFactory"
1- I've created a Lookup activity called Lookup1, using a table created in Azure SQL Database, with this query:
"Select RP from SubsetwatermarkTable"
2- I've created a Copy data activity whose source settings have this query:
"Select * from SourceDevSubsetTable WHERE RP NOT IN '@{activity('Lookup1').output.value}'"
When debugging -- I'm getting the error:
Failure type: User configuration issue
Details: Failure happened on 'Source' side.
'Type=System.Data.SqlClient.SqlException,Message=Incorrect syntax near
'[{"RP":"20210307_1Plant
1KAO"},{"RP":"20210314MetroFactory"},{"RP":"20210312MetroFactory"},{"RP":"20210312MetroFactory"},{"RP":"2'.,Source=.Net
SqlClient Data
Provider,SqlErrorNumber=102,Class=15,ErrorCode=-2146232060,State=1,Errors=[{Class=15,Number=102,State=1,Message=Incorrect
syntax near
'[{"RP":"20210311MetroFactory"},{"RP":"20210311MetroFactory"},{"RP":"202103140MetroFactory"},{"RP":"20210308MetroFactory"},{"RP":"2'.,},],'
Can anyone tell me what I am doing wrong and how to fix it even if it requires creating more activities.
Note: There is no LastModifiedDate column in the table. Also I haven't yet created the StoredProcedure that will update the Lookup table when it is done with the incremental copy.
Steve is right as to why it is failing and about the query you need in the Copy Data activity.
As he says, you want a comma-separated list of quoted values to use in your IN clause.
You can get this more easily, though, directly from your Lookup using this query:
select stuff(
(
select ','''+rp+''''
from subsetwatermarktable
for xml path('')
)
, 1, 1, ''
) as in_clause
The sub-query gets the comma separated list with quotes around each rp-value, but has a spurious comma at the start - the outer query with stuff removes this.
Now tick the First Row Only box on the Lookup and change your Copy Data source query to:
select *
from SourceDevSubsetTable
where rp not in (@{activity('Lookup1').output.firstRow.in_clause})
The result of @activity('Lookup1').output.value is an array, as your error shows:
[{"RP":"20210307_1Plant
1KAO"},{"RP":"20210314MetroFactory"},{"RP":"20210312MetroFactory"},{"RP":"20210312MetroFactory"},{"RP":"2'.,Source=.Net
SqlClient Data
Provider,SqlErrorNumber=102,Class=15,ErrorCode=-2146232060,State=1,Errors=[{Class=15,Number=102,State=1,Message=Incorrect
syntax near
'[{"RP":"20210311MetroFactory"},{"RP":"20210311MetroFactory"},{"RP":"202103140MetroFactory"},{"RP":"20210308MetroFactory"},{"RP":"2'.,},]
However, your SQL should look like this: Select * from SourceDevSubsetTable WHERE RP NOT IN ('20210307_1Plant 1KAO','20210314MetroFactory',...).
To achieve this in ADF, you need to do something like this:
1.create three variables like the following screenshot:
2.loop over the result of @activity('Lookup1').output.value and append 'item().RP' to arrayvalues:
expression:@activity('Lookup1').output.value
expression:@concat(variables('apostrophe'),item().RP,variables('apostrophe'))
3.cast arrayvalues to string and add parentheses by Set variable activity
expression:@concat('(',join(variables('arrayvalues'),','),')')
4.copy to your Azure SQL database
expression:Select * from SourceDevSubsetTable WHERE RP NOT IN @{variables('stringvalues')}
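To make the target string concrete, here is the same transformation sketched in Python rather than ADF expressions: the Lookup output array goes in, the parenthesised IN list comes out. The function name build_in_clause is hypothetical, and the duplicate removal and quote doubling are additions that the ADF steps above do not perform.

```python
from typing import Dict, List


def build_in_clause(lookup_rows: List[Dict[str, str]], column: str = "RP") -> str:
    """Turn [{"RP": "a"}, {"RP": "b"}] into ('a','b') for a SQL IN list."""
    seen = set()
    quoted = []
    for row in lookup_rows:
        value = row[column]
        if value in seen:
            continue  # duplicates add nothing to an IN list
        seen.add(value)
        # Double any single quote so the value stays inside its SQL literal.
        quoted.append("'" + value.replace("'", "''") + "'")
    return "(" + ",".join(quoted) + ")"
```

An empty Lookup result yields (), which is not valid SQL, so a real pipeline would need to handle that case separately.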

SQLAlchemy query with conditional filters and results

I'm building a FastAPI app, and I have a complicated query that I'd like to avoid splitting into multiple individual queries whose results I then concatenate.
I have the following tables that all have foreign keys:
CHANGE_LOG: change_id | original (FK ROSTER.shift_id) | new (FK ROSTER.shift_id) | change_type (FK CONFIG_CHANGE_TYPES)
ROSTER: shift_id | shift_type (FK CONFIG_SHIFT_TYPES) | shift_start | shift_end | user_id (FK USERS)
CONFIG_CHANGE_TYPES: change_type_id | change_type_name
CONFIG_SHIFT_TYPES: shift_type_id | shift_type_name
USERS: user_id | user_name
FK= Foreign Key
I need to return the following information:
user_name, change_type_name, and the shift_start, shift_end, and shift_type_name of the shifts whose shift_id matches the original or new value in the CHANGE_LOG row.
The catch is that the CHANGE_LOG table might have both original and new, only an original but no new, or only a new but no original. But as the user can select a few options from drop down boxes before submitting the request, I also need to be able to include a filter to single out:
just one user, or all users
any change_type, or a group of change_types
The issue is that I can't find a way to get the user_name guaranteed for each row without inspecting it afterwards, because I don't know whether the new or original values exist or are set to null.
Is there a way in SQLAlchemy to have an optional filter in the query, so that if the original exists it is used to get the user_id, and otherwise the new is used?
Also, if I have a query that definitely finds the rows with both original and new shifts, it will never find the rows with only one of them, as the criteria will never match.
I've also read this and similar ones, and while they'll resolve the issue of conditionally setting some of the filters, it doesn't get around the issue of part nulls returning nothing at all, rather than half the data.
This one seems to solve that problem, but I have no idea how to implement it.
I know it's complicated, so let me know if I've done a poor job of explaining the question.
Sorted. The solution was to use the outerjoin option.
I'm sure the syntax can be more elegant than my solution if I properly engage in adding relationships when defining each class, but what I end up with is explicit and I think it makes it easier to read... at least for me.
Since I'm using a few tables more than once in the same query for different information, it was important to alias those, otherwise I ended up with a conflict (which 'user_id' did you want - it's not clear). For those playing at home, here's my general solution:
import pandas as pd
from sqlalchemy import or_
from sqlalchemy.orm import aliased
# ROSTER and CONFIG_SHIFT_TYPES each appear twice, so alias every use.
new = aliased(ROSTER)
original = aliased(ROSTER)
o_name = aliased(CONFIG_SHIFT_TYPES)
n_name = aliased(CONFIG_SHIFT_TYPES)
pd.read_sql(
    db.query(
        CHANGE_LOG.change_id,
        CHANGE_LOG.created,
        CONFIG_CHANGE_TYPES.change_name,
        o_name.shift_name.label('original_type'),
        n_name.shift_name.label('new_type'),
        USERS.user_name,  # selected from the same USERS table that is joined below
    )
    # Outer joins keep CHANGE_LOG rows where original or new is NULL.
    .outerjoin(original, original.shift_id == CHANGE_LOG.original_shift)
    .outerjoin(new, new.shift_id == CHANGE_LOG.new_shift)
    .outerjoin(CONFIG_CHANGE_TYPES, CONFIG_CHANGE_TYPES.change_id == CHANGE_LOG.change_type)
    .outerjoin(CONFIG_SHIFT_TYPES, CONFIG_SHIFT_TYPES.shift_id == new.roster_shift_id)
    .outerjoin(o_name, o_name.shift_id == original.roster_shift_id)
    .outerjoin(n_name, n_name.shift_id == new.roster_shift_id)
    # Match the user through whichever of the two shifts exists.
    .outerjoin(USERS, or_(USERS.user_id == original.user_id, USERS.user_id == new.user_id))
    .statement,
    engine,
)
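For anyone who wants to see the outer-join behaviour in isolation, here is a small self-contained sketch. It uses plain SQL through Python's built-in sqlite3 module rather than SQLAlchemy, and the cut-down schema, sample rows, and the changes_for helper are all made up for illustration. It also shows one way to express "use the original shift's user if it exists, otherwise the new one" (COALESCE) and an optional user filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT);
CREATE TABLE roster (shift_id INTEGER PRIMARY KEY, user_id INTEGER);
CREATE TABLE change_log (
    change_id INTEGER PRIMARY KEY, original_shift INTEGER, new_shift INTEGER
);
INSERT INTO users VALUES (1, 'Sam'), (2, 'David');
INSERT INTO roster VALUES (10, 1), (11, 1), (12, 2);
-- change 1 has both shifts, change 2 only an original, change 3 only a new
INSERT INTO change_log VALUES (1, 10, 11), (2, 12, NULL), (3, NULL, 11);
""")


def changes_for(user_id=None):
    """Return (change_id, user_name) rows; user_id=None means all users."""
    sql = """
        SELECT cl.change_id, u.user_name
        FROM change_log AS cl
        LEFT JOIN roster AS o ON o.shift_id = cl.original_shift
        LEFT JOIN roster AS n ON n.shift_id = cl.new_shift
        LEFT JOIN users AS u ON u.user_id = COALESCE(o.user_id, n.user_id)
        WHERE (:uid IS NULL OR u.user_id = :uid)
        ORDER BY cl.change_id
    """
    return conn.execute(sql, {"uid": user_id}).fetchall()
```

Calling changes_for() returns all three changes, including the two that have only one shift, while changes_for(2) narrows the result to a single user.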

Can't get identifier and Max Value CosmosDb

I would like to do some reporting on my CosmosDb.
My query is:
Select Max(c.results.score) from c
That works, but I also want the id of the document with the highest score. Then I get an exception:
Select c.id, Max(c.results.score) from c
'c.id' is invalid in the select list because it is not contained in an aggregate function
You can execute the following query to achieve what you're asking (though it may not be very efficient in terms of RUs or execution time):
Select TOP 1 c.id, c.results.score from c ORDER BY c.results.score DESC
Group by wasn't supported natively in Cosmos DB when this was answered, so there was no out-of-the-box way to execute this query.
To implement this using the out of the box functionality you would need to create a new document type that contains the output of your aggregation e.g.
{
"id" : 1,
"highestScore" : 1000
}
You'd then need a process within your application to keep this up-to-date.
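One possible shape for that process is sketched below in Python. It assumes each incoming document looks like the ones in the question (an id plus results.score); the summary id "high-score" and the topId field are hypothetical names chosen for illustration.

```python
from typing import Optional


def update_high_score(summary: Optional[dict], doc: dict) -> dict:
    """Return the summary document, refreshed if doc beats the stored score."""
    score = doc["results"]["score"]
    if summary is None or score > summary["highestScore"]:
        # "topId" is a hypothetical field recording which document holds the maximum.
        return {"id": "high-score", "topId": doc["id"], "highestScore": score}
    return summary
```

The application would call this whenever it writes a score and save the returned summary back to the collection if it changed.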
There is also documentdb-lumenize that would allow you to do this using stored procedures. I haven't used it myself but it may be worth looking into as an alternative to the above solution.
Link is:
https://github.com/lmaccherone/documentdb-lumenize

Sourcing data from DocumentDB in Hadoop

I have a Hadoop application that sources data from two different DocumentDB collections. However, the JSON schemas of the documents in these two collections are different. Both have a time field, but one is called TimeStamp and the other is called UpdatedOn. I'd like to know how to specify a query based on this time field that retrieves only the JSON documents satisfying the condition. I specify my query like this:
String query = "SELECT * FROM c WHERE c.Timestamp > " + timestamp;
conf.set(ConfigurationUtil.QUERY, query);
This query applies to only one of the collections. I need a query like the one below:
"SELECT * FROM collection1 as c1, collection2 as c2 WHERE c1.Timestamp > x1 OR c2.UpdatedOn > x1"
Is this supported in DocumentDB?
This is not supported, since it is not documented. Your best bet is to execute these two queries and then merge the results using Linq or any other technique to get one result set.
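The merge step could be as simple as the following sketch, written in Python for brevity rather than in the Java of the Hadoop job. It assumes each query's results are already available as lists of dicts; merge_by_time and the added time key are hypothetical names.

```python
from typing import Dict, Iterable, List


def merge_by_time(
    docs_a: Iterable[Dict],
    docs_b: Iterable[Dict],
    field_a: str = "TimeStamp",
    field_b: str = "UpdatedOn",
) -> List[Dict]:
    """Combine two result sets and sort them on a shared 'time' key."""
    # Copy each document and add a common key, leaving the originals untouched.
    merged = [dict(d, time=d[field_a]) for d in docs_a]
    merged += [dict(d, time=d[field_b]) for d in docs_b]
    return sorted(merged, key=lambda d: d["time"])
```

Each collection is still filtered by its own query (one on the first time field, one on UpdatedOn); only the combining happens on the client.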
Hope this helps.
