How to work with result set of meta-functions in Vertica - subquery

I want to use result set of meta-function get_node_dependencies as a subquery. Is there some way to do it?
Something like this:
select v_txtindex.StringTokenizerDelim (dep, chr(10)) over () as words
from (
select get_node_dependencies() as dep
) t;
This query thows an error Meta-function ("get_node_dependencies") can be used only in the Select clause.
I know that there is a view vs_node_dependencies that returns the same data in more readable way, but the question is generic, not related to any specific meta-function.

Most Vertica meta functions returning a report are for informational purposes on the fly, and can only be used on the outmost part of a query - so you can't apply another function on their output.
But - as you are already prepared to go through development work to split that output into tokens, you might often be even better off by querying vs_node_dependencies directly. You'll also be more flexible - is my take on this.

Related

Select one column from Type-ORM query - Node

I have a type ORM query that returns five columns. I just want the company column returned but I need to select all five columns to generate the correct response.
Is there a way to wrap my query in another select statement or transform the results to just get the company column I want?
See my code below:
This is what the query returns currently:
https://i.stack.imgur.com/MghEJ.png
I want it to return:
https://i.stack.imgur.com/qkXJK.png
const qb = createQueryBuilder(Entity, 'stats_table');
qb.select('stats_table.company', 'company');
qb.addSelect('stats_table.title', 'title');
qb.addSelect('city_code');
qb.addSelect('country_code');
qb.addSelect('SUM(count)', 'sum');
qb.where('city_code IS NOT NULL OR country_code IS NOT NULL');
qb.addGroupBy('company');
qb.addGroupBy('stats_table.title');
qb.addGroupBy('country_code');
qb.addGroupBy('city_code');
qb.addOrderBy('sum', 'DESC');
qb.addOrderBy('company');
qb.addOrderBy('title');
qb.limit(3);
qb.cache(true);
return qb.getRawMany();
};```
[1]: https://i.stack.imgur.com/MghEJ.png
[2]: https://i.stack.imgur.com/qkXJK.png
TypeORM didn't meet my criteria, so I'm not experienced with it, but as long as it doesn't cause problems with TypeORM, I see an easy SQL solution and an almost as easy TypeScript solution.
The SQL solution is to simply not select the undesired columns. SQL will allow you to use fields you did not select in WHERE, GROUP BY, and/or ORDER BY clauses, though obviously you'll need to use 'SUM(count)' instead of 'sum' for the order. I have encountered some ORMs that are not happy with this though.
The TS solution is to map the return from qb.getRawMany() so that you only have the field you're interested in. Assuming getRawMany() is returning an array of objects, that would look something like this:
getRawMany().map(companyRecord => {return {company: companyRecord.company}});
That may not be exactly correct, I've taken the day off precisely because I'm sick and my brain is fuzzy enough I was making too many stupid mistakes, but the concept should work even if the code itself doesn't.
EDIT: Also note that map returns a new array, it does not modify the existing array, so you would use this in place of the getRawMany() when assigning, not after the assignment.

Is a SQL Injection Attack Possible in QLDB/PartiQL

This question came up in a code review in reference to a select query that is necessarily constructed using string interpolation (C#) and I can't seem to find a reference one way or the other. For example, a query might look something like:
var sql = "SELECT * FROM {someTable} WHERE {indexedField} = ?";
Because of the use of a param in the WHERE clause, I think this should be safe either way; however, it would be nice to have confirmation. A couple of unsophisticated attempts suggest that, even if an injection were attempted and the query ended up looking something like this
Select * from SomeTable; SELECT * FROM SomeOtherTable Where IndexedField = "1"
the engine would still error out on trying to run multiple queries.
Any particular reason string interpolation is required?
https://docs.aws.amazon.com/qldb/latest/developerguide/driver-quickstart-dotnet.html#driver-quickstart-dotnet.step-5 using parameter probably would best help prevent against sql injection.
Injections like Select * from SomeTable; SELECT * FROM SomeOtherTable Where IndexedField = "1" would indeed error out because QLDB driver requires one txn.Execute() per query.
To reduce the risk of an injection, I would recommend:
sanitizing string interpolation to reject potentially malicious parameters
leveraging the QLDB feature that allows separation of access by PartiQL command and ledger table using IAM policies, https://aws.amazon.com/about-aws/whats-new/2021/06/amazon-qldb-supports-iam-based-access-policy-for-partiql-queries-and-ledger-tables/
For the second option, you can define permissions for certain table to reject unwanted access in case of an injection attempt.

Is there a way to pass a parameter to google bigquery to be used in their "IN" function

I'm currently writing an app that accesses google bigquery via their "#google-cloud/bigquery": "^2.0.6" library. In one of my queries I have a where clause where i need to pass a list of ids. If I use UNNEST like in their example and pass an array of strings, it works fine.
https://cloud.google.com/bigquery/docs/parameterized-queries
However, I have found that UNNEST can be really slow and just want to use IN on its own and pass in a string list of ids. No matter what format of string list I send, the query returns null results. I think this is because of the way they convert parameters in order to avoid sql injection. I have to use a parameter because I, myself want to avoid SQL injection attacks on my app. If i pass just one id it works fine, but if i pass a list it blows up so I figure it has something to do with formatting, but I know my format is correct in terms of what IN would normally expect i.e. IN ('', '')
Has anyone been able to just pass a param to IN and have it work? i.e. IN (#idParam)?
We declare params like this at the beginning of the script:
DECLARE var_country_ids ARRAY<INT64> DEFAULT [1,2,3];
and use like this:
WHERE if(var_country_ids is not null,p.country_id IN UNNEST(var_country_ids),true) AND ...
as you see we let NULL and array notation as well. We don't see issues with speed.

Knex + SQL Server whereIn query 8-12s -- raw version returns NO results but if I input the .toQuery() result directly I get results

The database is in Azure cloud and not being used in production currently. There are 80.000 rows and a uprn is a VARCHAR(100);
I'm already using JOI to validate each UPRN as well;
I'm using KNEX with a SQL Server database with the following whereIn query:
knex(LOCATIONS.table).whereIn(LOCATIONS.uprn, req.body.uprns)
but this takes 8-12s to complete and sometimes timesout. if I use .toQuery() on the same thing, SSMS will return the result within 1-2.
If I do a raw query, the resulting .toQuery() or toString() works in SSMS and returns results. But if I try to use the raw directly, it will return 0 results.
I'm looking to either fix what's making whereIn so slow or get the raw query working.
EDIT 1:
After much debugging and trying -- it seems that the bug is due to how knex deals with arrays, so I made a for-of loop to add ? ? ? for each array element and then inputed the array for all params.
This led me to realizing the performance issue is due to SQL server way of parameterising.
I ended up building a raw query string with all of the parameters and validating the input with Joi string/regex config:
Joi.string()
.min(1)
.max(35)
.regex(/^[a-z\d\-_\s]+$/i)
allowing only for alphanumeric, dashes and spaces which should prevent sql injection.
I'm going to look deeper into security issues with this and might make a separate login that can only SELECT data from that table and nothing more to run with these queries.
Needed to just handle it raw and validate separately.

Select only one row in dql subquery

I have to execute following query:
create dm_myobject object
set my_id_attribute = (select r_object_id from dm_otherobject where <some clause here>)
where ...
But subquery in brackets returns more than one id. I can't make whereclause more detailed to retrieve only one value.
How to take first?
ENABLE(FETCH_ALL_RESULTS 1) or ENABLE(RETURN_TOP 1) doesn't help.
In my experience it is impossible to use DQL hints in a sub query like you suggested, because the hint is applied to the query as a whole. It is indeed possible to use, say, ENABLE(RETURN_TOP 1) on a query that contains a sub query, however that hint will then be used on the outer query and never on the inner one. In your case, however, you'll end up with an error message telling that the sub query returns more than one result.
Try using an aggregate function on the selected attribute instead:
CREATE dm_myobject OBJECT
SET my_id_attribute = (
SELECT MIN(r_object_id)
FROM dm_otherobject
WHERE <some clause>
)
The MIN and MAX functions work with ints and strings, and I suspect they work with IDs too. Since it is ok for you to set only the first ID that's returned from your sub query, I suspect you're returning them in a sorted order and want to use the first -- hence the usage of the MIN function.
An alternative approach would of course be to write a script or a small Java program that executes several DQL statements, but that might or might not work for you in your case.

Resources