flux query: filter out all records related to one matching the condition - node.js

I'm trying to filter an influx DB query (using the nodeJS influxdb-client library).
As far as I can tell, it only works with "flux" queries.
I would like to filter out all records that share a specific attribute with any record that matches a particular condition. I'm filtering using the filter-function, but I'm not sure how I can continue from there. Is this possible in a single query?
My filter looks something like this:
|> filter(fn:(r) => r["_value"] == 1 and r["button"] == "1" ) and I would like to leave out all the record that have the same r["session"] as any that match this filter.
Do I need two queries; one to get those r["session"]s and one to filter on those, or is it possible in one?
Update:
Trying the two-step process. Got the list of r["session"]s into an array, and attempting to use the contains() flux function now to filter values included in that array called sessionsExclude.
Flux query section:
|> filter(fn:(r) => contains(value: r["session"], set: ${sessionsExclude}))
Getting an error unexpected token for property key: INT ("102")'. Not sure why. Looks like flux tries to turn the values into Integers? The r["session"] is also a String (and the example in the docs also uses an array of Strings)...

Ended up doing it in two queries. Still confused about the Strings vs Integers, but casting the value as an Int and printing out the array of r["session"] within the query seems to work like this:
'|> filter(fn:(r) => not contains(value: int(v: r["session"]), set: [${sessionsExclude.join(",")}]))'
Added the "not" to exclude instead of retain the values matching the array...

Related

Select one column from Type-ORM query - Node

I have a type ORM query that returns five columns. I just want the company column returned but I need to select all five columns to generate the correct response.
Is there a way to wrap my query in another select statement or transform the results to just get the company column I want?
See my code below:
This is what the query returns currently:
https://i.stack.imgur.com/MghEJ.png
I want it to return:
https://i.stack.imgur.com/qkXJK.png
const qb = createQueryBuilder(Entity, 'stats_table');
qb.select('stats_table.company', 'company');
qb.addSelect('stats_table.title', 'title');
qb.addSelect('city_code');
qb.addSelect('country_code');
qb.addSelect('SUM(count)', 'sum');
qb.where('city_code IS NOT NULL OR country_code IS NOT NULL');
qb.addGroupBy('company');
qb.addGroupBy('stats_table.title');
qb.addGroupBy('country_code');
qb.addGroupBy('city_code');
qb.addOrderBy('sum', 'DESC');
qb.addOrderBy('company');
qb.addOrderBy('title');
qb.limit(3);
qb.cache(true);
return qb.getRawMany();
};```
[1]: https://i.stack.imgur.com/MghEJ.png
[2]: https://i.stack.imgur.com/qkXJK.png
TypeORM didn't meet my criteria, so I'm not experienced with it, but as long as it doesn't cause problems with TypeORM, I see an easy SQL solution and an almost as easy TypeScript solution.
The SQL solution is to simply not select the undesired columns. SQL will allow you to use fields you did not select in WHERE, GROUP BY, and/or ORDER BY clauses, though obviously you'll need to use 'SUM(count)' instead of 'sum' for the order. I have encountered some ORMs that are not happy with this though.
The TS solution is to map the return from qb.getRawMany() so that you only have the field you're interested in. Assuming getRawMany() is returning an array of objects, that would look something like this:
getRawMany().map(companyRecord => {return {company: companyRecord.company}});
That may not be exactly correct, I've taken the day off precisely because I'm sick and my brain is fuzzy enough I was making too many stupid mistakes, but the concept should work even if the code itself doesn't.
EDIT: Also note that map returns a new array, it does not modify the existing array, so you would use this in place of the getRawMany() when assigning, not after the assignment.

AND occasionally produces wrong result on shell and node native driver

I’ve built a dynamic query generator to create my desired queries based on many factors, however, in rare cases it acted weird. After a day on reading logs I found a situation that can be simplified in this:
db.users.find({att: ‘a’, att: ‘b’})
What I expect is that mongodb by default uses AND, so above query’s result should be an empty array. However, it’s not!
But when I use AND explicitly, the result is an empty array
db.users.find({$and: [{att: 'a'}, {att: ‘b'}]})
In javascript object’s keys should be unique, otherwise the value will be replaced by the latest value
(also mongodb shell is based on js, so it follows some rules of js)
const t = {att: 'a', att: 'b'};
console.log(t);
So in your case your query is acting like this:
db.users.find({att: ‘b’})
You’ve to handle this situation on your code if you want the result be empty in the mentioned condition

Is there a way to pass a parameter to google bigquery to be used in their "IN" function

I'm currently writing an app that accesses google bigquery via their "#google-cloud/bigquery": "^2.0.6" library. In one of my queries I have a where clause where i need to pass a list of ids. If I use UNNEST like in their example and pass an array of strings, it works fine.
https://cloud.google.com/bigquery/docs/parameterized-queries
However, I have found that UNNEST can be really slow and just want to use IN on its own and pass in a string list of ids. No matter what format of string list I send, the query returns null results. I think this is because of the way they convert parameters in order to avoid sql injection. I have to use a parameter because I, myself want to avoid SQL injection attacks on my app. If i pass just one id it works fine, but if i pass a list it blows up so I figure it has something to do with formatting, but I know my format is correct in terms of what IN would normally expect i.e. IN ('', '')
Has anyone been able to just pass a param to IN and have it work? i.e. IN (#idParam)?
We declare params like this at the beginning of the script:
DECLARE var_country_ids ARRAY<INT64> DEFAULT [1,2,3];
and use like this:
WHERE if(var_country_ids is not null,p.country_id IN UNNEST(var_country_ids),true) AND ...
as you see we let NULL and array notation as well. We don't see issues with speed.

Search Documents from two collections in MarkLogic

In Marklogic, I want to search between two collections by joining the id element of doc from collection1 to id element of doc from collection2. When it is matched i need the resulting document from both collections.
I have the below code, but it is very slow. How to use cts:search or search:search to achieve the same
for $i in collection('demographic')/individual,
$j in collection('membership')/membership[enrolleIndividualId/id/text() = $i/individual/id/text()])
return {$i,$j}
Update:
I should note that your sample is not valid XQuery: return element root { $i, $j } would be valid. Also, you should not use the /text() node selector, as it's behavior can be counterintuitive. You can compare elements directly in an XPath predicate ([enrolleIndividualId/id eq $i/individual/id]). Use /fn:string() in place of /text() if you need the contents of an element as a string. I'd also recommend using the atomic equality operator eq in place of the sequence equality operator = when directly comparing individual elements.
Original Answer:
There are several approaches to implementing joins in MarkLogic, but I would first question your data model. From the names of the elements in your sample query, it looks like you are using a relational model (individuals have memberships). MarkLogic is a document database, and it's optimized for denormalized documents. You will be much better served to process your data and generate new individual documents that each contain the relevant membership data.
That being said, here's how you could join your documents:
First, you will need range indices to write performant joins. If the id element from your sample query is not unique to individuals, you will need path range indices on enrolledIndividualId/id and individual/id, otherwise, a simple element range index on id will do.
The most common join pattern in MarkLogic uses a "shotgun-OR" query; first retrieving values from the lexicon backing a range index, and then constructing an or-query from those values to retrieve the relevant documents. This won't work directly in your case, as you want to retrieve both sides of the join. You can either run a search for each pair of documents, or run a single search for one side, and then an additional document read for each document.
pairs:
for $value in cts:values(cts:path-reference("individual/id"))
return
cts:search(/,
cts:or-query((
cts:and-query((
cts:collection-query("demographic"),
cts:path-range-query("individual/id", "=", $value))),
cts:and-query((
cts:collection-query("membership"),
cts:path-range-query("enrolledIndividualId/id", "=", $value))))),
"unfiltered")
shotgun-OR plus iteration:
for $doc in
cts:search(/,
cts:and-query((
cts:collection-query("demographic"),
cts:path-range-query("individual/id", "=",
cts:values(cts:path-reference("individual/id"))))),
"unfiltered")
return
cts:search(/,
cts:and-query((
cts:collection-query("membership"),
cts:path-range-query("enrolledIndividualId/id", "=", $doc/individual/id))),
"unfiltered")
As you can see, each approach requires I/O proportionate to the number of docs/values you want to join. If you only needed the shotgun-OR (ie, a query for documents based on criteria from other documents), you would only need to make two requests, the initial cts:values() call to retrieve values from a lexicon, and the cts:search() call using a query built from those values.
Note: the cts:query objects used in these examples could be used in conjunction with the Search API by means of the search:resolve() function.
Given your apparent data model, you will be much better served by processing your data into individual, de-normalized documents.

Couchdb filter using reduce functions/linked documents

Considering:
doc profile
{
_id:"1",
name:"john",
likes: ["2222","1111"]
}
doc likes
{
_id:"2222",
value:"true"
}
{
_id:"1111",
value:"false"
}
I have a filter on my xamarin app to get the profile, and it works well but I need to include the "children" (linked) docs... I can do this with a view setting include_docs=true but I want couchdb to filter so I can use replication.
Also, it would be possible to accomplish the same result if I could use a reduce function to filter data, but I can't make the filter use the reduce function.. So, any idea?
the expected result would be:
doc profile
{
_id:"1",
name:"john",
likes: {
{_id:"2222",
value:"true"},
{_id:"1111",
value:"false"]
}
}
Thanks!
I can do this with a view setting include_docs=true but I want couchdb to filter so I can use replication
You might already know this but you can use couchdb views as filters.
Also, it would be possible to accomplish the same result if I could use a reduce function to filter data
The reduce function is for "reducing" the values that are returned by the map function. The map function returns a key and a value like so:
emit(key,value)
The reduce function only gets the keys and the values that are returned from a map function. For example if you call a view with
?key=abc
and it returns results like
[{
_id:...,
type: abc
},
{
_id:...,
type:abc
}
....
]
You already have all the documents filtered by the key "abc". The reduce function will get as inputs the key, the value and a rereduce parameters. If you use the reduce function as a post map processing step to further filter the results from the view there will be two problems:
There is no way to pass a parameter to a reduce. The keys that you specify will only be used by the map function and then passed as they are to reduce.
It is not a good idea anyway. With reduce you want to return a small value that aggregates the results you get from a view. So taking the above example if you return say an integer as a value from the map function ( in emit(key,value)//suppose that the value is an integer) the reduce function may return a sum or aggregate of those values. But trying to return a modified document is not what reduce function is for. From the docs
"A reduce function must reduce the input values to a smaller output value. If you are building a composite return structure in your reduce, or only transforming the values field, rather than summarizing it, you might be misusing this feature. "
List functions might be more suited to what you are trying to do. If you want to process the results of the view query before returning them they are they way to go.
In list functions you get a set of results returned by the view function. You can even pass additional parameters if you'd like to apply complex filters on them. But you won't be able to use list functions for replication.
Finally replication works on a document level. Documents have _rev fields that is used by the replicator process to check what version the document is in before the replication is performed. So you won't be able to replicate the results returned by a view. Only the documents will be replicated.

Resources