Kusto use cases (nested user-defined functions)

Could you please let me know if the following use cases can be handled in Kusto?
1. Nested user-defined functions: can I call one function from another function? Here are the two functions:
.create function ifnotexists
with()
function1() {table1 | where column1 == abs}
.create function ifnotexists
with()
function2() {table2 | where column1 in (function1())}
Also, how do I add a row-level security policy that uses the nested function function2()? The command below does not seem to work:
.alter table table2 policy row_level_security enable "function2"
2. How do I upsert a row? Can I create a new table and migrate the schema? If so, is it possible to give users access to just the functions and not the underlying tables?
Note: I can write a function that points to a new table containing the upserted row.

Yes, a function can call other functions. Nothing special is needed, so your example should work.
Migrating a table can be done with the .set-or-append command:
https://learn.microsoft.com/en-us/azure/data-explorer/kusto/management/data-ingestion/ingest-from-query
It is not possible to give users access to functions only. Some user-restriction scenarios can be handled with row-level security: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/management/rowlevelsecuritypolicy
Some scenarios of "changing" rows can be handled with a "summarize arg_max(Time, *)" clause, typically grouped by the row's key column, where Time is the time of the update.
https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/arg-max-aggfunction
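To make the arg_max idea concrete, here is a minimal sketch in plain Python (not KQL). It assumes every row carries a key column and an update timestamp; the names Id, Time and Value are hypothetical and only serve the illustration. It keeps the most recent version of each row, which is roughly what "summarize arg_max(Time, *) by Id" would return.

```python
from typing import Any


def latest_rows(rows: list[dict[str, Any]], key: str = "Id",
                time_col: str = "Time") -> list[dict[str, Any]]:
    """Keep only the newest row per key, emulating arg_max(Time, *) by key."""
    latest: dict[Any, dict[str, Any]] = {}
    for row in rows:
        current = latest.get(row[key])
        # Replace the stored row only when this one is strictly newer.
        if current is None or row[time_col] > current[time_col]:
            latest[row[key]] = row
    return list(latest.values())


# Hypothetical append-only table: Id 1 was "updated" by appending a newer row.
rows = [
    {"Id": 1, "Time": 1, "Value": "old"},
    {"Id": 2, "Time": 1, "Value": "only"},
    {"Id": 1, "Time": 2, "Value": "new"},
]
result = latest_rows(rows)
```

The point of the pattern is that nothing is modified in place: an "update" is a newly appended row, and readers always query through the latest-per-key view.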

Related

Arrays of composite types in PostgreSQL via NodeJS

I'm using Node.js (the "pg" package) to connect to a PostgreSQL database hosted on Heroku. I need to create a column in my table that will contain an array of different data types. From other questions previously asked on Stack Overflow, I understand I can create composite data types that I can use to declare the array. Like:
create type my_item as (
    field_1 text,
    field_2 text,
    field_3 text,
    field_4 numeric
);
However, I don't understand how to implement this when using Node.JS. Where do I put it in my files and at what point do I run it?
I have an index.JS file containing my Pool instance and the database access info. My functions are stored in a models folder. Each function has its own SqlString variable which is then passed to the query. Like:
export async function getScores() {
const data = await query(`SELECT * FROM score`);
return data.rows;
}
Appreciate any help.
There is no such thing as an array of different composite types in PostgreSQL. You might need to store the column as json/jsonb instead and deal with the values at the application level. Alternatively, create a superset type covering all possible types in the array and deal with NULLs at the application level. That only works if the subset types don't use different types for the same key.
Also, the main use case for composites is in INSERT/UPDATE/DELETE queries, that is, anything that requires value interpolation from the application. They are of no use in your example code.

How to fetch Primary Key/Clustering column names for a particular table using CQL statements?

I am trying to fetch the Primary Key/Clustering Key names for a particular table/entity and implement the same query in my JPA interface (which extends CassandraRepository).
I am not sure whether something like:
@Query("DESCRIBE TABLE <table_name>")
public Object describeTbl();
would work here as describe isn't a valid CQL statement and in case it would, what would be the type of the Object?
Suggestions?
One thing you could try is to query the system_schema.columns table. It is keyed by keyspace_name and table_name, and might be what you're looking for here:
> SELECT column_name, kind FROM system_schema.columns
  WHERE keyspace_name='spaceflight_data'
  AND table_name='astronauts_by_group';

 column_name       | kind
-------------------+---------------
 flights           | regular
 group             | partition_key
 name              | clustering
 spaceflight_hours | clustering

(4 rows)
DESCRIBE TABLE is supported only in Cassandra 4, which includes the fix for CASSANDRA-14825. But it may not help you much, because it just returns a text string representing the CREATE TABLE statement, and you would need to parse that text to extract the primary key definition. That is doable but could be tricky, depending on the structure of the primary key.
Alternatively, you can obtain the underlying Session object and, via its getMetadata function, get access to the actual metadata object, which exposes information about keyspaces and tables, including their schema.
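Once the rows from system_schema.columns are in hand, splitting them into partition key and clustering columns is straightforward. The sketch below is plain Python working on already-fetched rows rather than a live Cassandra session. It assumes the query also selects the position column of system_schema.columns so the key order can be preserved; the position values in the sample data are illustrative assumptions, not taken from the output above.

```python
from typing import Iterable


def primary_key_columns(
    rows: Iterable[tuple[str, str, int]],
) -> tuple[list[str], list[str]]:
    """Split (column_name, kind, position) rows into ordered key lists."""
    rows = list(rows)

    def ordered(kind: str) -> list[str]:
        # Sort by position so composite keys keep their declared order.
        matches = sorted((pos, name) for name, k, pos in rows if k == kind)
        return [name for _, name in matches]

    return ordered("partition_key"), ordered("clustering")


# Sample rows mirroring the table above; positions are assumed values.
sample_rows = [
    ("flights", "regular", -1),
    ("group", "partition_key", 0),
    ("name", "clustering", 0),
    ("spaceflight_hours", "clustering", 1),
]
partition_key, clustering = primary_key_columns(sample_rows)
```

The same grouping could be done in the repository layer after mapping each result row to a small value object, which avoids parsing DESCRIBE output.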

Can't get identifier and max value in Cosmos DB

I would like to do some reporting on my Cosmos DB.
My query is:
Select Max(c.results.score) from c
That works, but when I also want the id of the document with the highest score, I get an exception:
Select c.id, Max(c.results.score) from c
'c.id' is invalid in the select list because it is not contained in an aggregate function
You can execute the following query to achieve what you're asking (though it may not be very efficient in terms of RU or execution time):
Select TOP 1 c.id, c.results.score from c ORDER BY c.results.score DESC
At the time of this answer, GROUP BY was not supported natively in Cosmos DB, so there was no out-of-the-box way to execute this query.
To implement this using the out of the box functionality you would need to create a new document type that contains the output of your aggregation e.g.
{
"id" : 1,
"highestScore" : 1000
}
You'd then need a process within your application to keep this up-to-date.
There is also documentdb-lumenize that would allow you to do this using stored procedures. I haven't used it myself but it may be worth looking into as an alternative to the above solution.
Link is:
https://github.com/lmaccherone/documentdb-lumenize
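The "keep a summary document up to date" approach described above could look roughly like this. It is a minimal sketch in plain Python operating on dictionaries, with no Cosmos DB SDK calls; reading and writing the summary document is left to the application. The sourceId field is a hypothetical addition, so that the summary also records which document holds the highest score.

```python
from typing import Any


def update_summary(summary: dict[str, Any], doc: dict[str, Any]) -> dict[str, Any]:
    """Return the summary, updated if doc has a new highest score."""
    score = doc["results"]["score"]
    highest = summary.get("highestScore")
    if highest is None or score > highest:
        # sourceId is a hypothetical field naming the top-scoring document.
        return {**summary, "highestScore": score, "sourceId": doc["id"]}
    return summary


summary = {"id": 1, "highestScore": 1000}
summary = update_summary(summary, {"id": "a", "results": {"score": 900}})
summary = update_summary(summary, {"id": "b", "results": {"score": 1200}})
```

Calling this whenever a score document is written keeps the report query down to a single point read of the summary document.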

SQL Query in Sequelize getter method

I'm using the Postgres extension 'earthdistance' for lat/long distance calculation.
I'm also using Sequelize to access the database and I want to define a getter
method for calculation and sorting by distance from a set of coordinates.
The following query works fine:
SELECT name,
earth_distance(ll_to_earth( 51.5241182, -0.0758046 ),
ll_to_earth(latitude, longitude)) as distance_from_current_location
FROM "Branches"
ORDER BY distance_from_current_location ASC;
And I can use it using sequelize.query(), but I want to keep all the model queries part of the model.
How can I specify WHERE conditions from inside a getter method in the model definition?
Thanks!
Your best bet is probably to wrap the query in a stored procedure and pass in the arguments you want to use in the WHERE clause. As stored procedures are compiled, this should perform better than dynamic SQL where you generate the WHERE clause on the fly.
Add whatever parameters and types you need to your stored procedure, and the result will look something like this:
CREATE FUNCTION GetEarthDistance (v_Foo bigint) RETURNS varchar AS $$
DECLARE
    v_Name varchar(256);
BEGIN
    -- Return the name of the branch closest to the fixed coordinates.
    -- "somecol" is a placeholder for whatever column you filter on.
    SELECT name INTO v_Name
    FROM "Branches"
    WHERE somecol > v_Foo
    ORDER BY earth_distance(ll_to_earth(51.5241182, -0.0758046),
                            ll_to_earth(latitude, longitude)) ASC
    LIMIT 1;
    RETURN v_Name;
END;
$$ LANGUAGE plpgsql;

Select only one row in dql subquery

I have to execute following query:
create dm_myobject object
set my_id_attribute = (select r_object_id from dm_otherobject where <some clause here>)
where ...
But the sub query in brackets returns more than one ID, and I can't make the WHERE clause more specific to retrieve only one value.
How can I take the first one?
ENABLE(FETCH_ALL_RESULTS 1) or ENABLE(RETURN_TOP 1) doesn't help.
In my experience it is impossible to use DQL hints in a sub query like you suggested, because the hint is applied to the query as a whole. It is indeed possible to use, say, ENABLE(RETURN_TOP 1) on a query that contains a sub query, however that hint will then be used on the outer query and never on the inner one. In your case, however, you'll end up with an error message telling that the sub query returns more than one result.
Try using an aggregate function on the selected attribute instead:
CREATE dm_myobject OBJECT
SET my_id_attribute = (
SELECT MIN(r_object_id)
FROM dm_otherobject
WHERE <some clause>
)
The MIN and MAX functions work with ints and strings, and I suspect they work with IDs too. Since it is ok for you to set only the first ID that's returned from your sub query, I suspect you're returning them in a sorted order and want to use the first -- hence the usage of the MIN function.
An alternative approach would of course be to write a script or a small Java program that executes several DQL statements, but that might or might not work for you in your case.
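The script-based alternative could be sketched as follows. This is only an outline in Python: run_dql is a hypothetical callable standing in for whatever client actually executes DQL in your environment, and the object IDs in the demo are made up. It runs the sub query first, picks the smallest ID (mirroring the MIN approach above), and then issues the CREATE statement with that single value.

```python
from typing import Callable


def create_with_first_id(run_dql: Callable[[str], list[str]],
                         where_clause: str) -> str:
    """Run the sub query separately, then create the object with one ID."""
    ids = run_dql(
        f"SELECT r_object_id FROM dm_otherobject WHERE {where_clause}"
    )
    if not ids:
        raise ValueError("sub query returned no IDs")
    first_id = min(ids)  # same choice the MIN() aggregate would make
    # Plain string formatting for illustration only; validate real input.
    statement = f"CREATE dm_myobject OBJECT SET my_id_attribute = '{first_id}'"
    run_dql(statement)
    return statement


# Stand-in executor so the sketch runs without a repository connection.
executed: list[str] = []


def fake_run_dql(statement: str) -> list[str]:
    executed.append(statement)
    if statement.startswith("SELECT"):
        return ["0900000180000202", "0900000180000101"]  # made-up IDs
    return []


statement = create_with_first_id(fake_run_dql, "object_name = 'example'")
```

Splitting the work into two statements avoids the "sub query returns more than one result" error entirely, at the cost of an extra round trip.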
