Using Groovy find String from Exclusion List - groovy

In groovy, I want to search text (which is typically an xml structure) and find an occurrence of the ignore list.
For example:
My different search data requests are (reduced for clarity, but most are large):
<CustomerRQ field='a'></CustomerRQ>
<AddressRQ field='a'></AddressRQ>
My ignore list is:
CustomerRQ
CustomerRS
Based on the above two incoming requests of "customer" and "address", I want to ignore "Customer" since it's in my ignore list, but I want to identify "address" as a hit.
The overall intent is to use this for logging. I want to not log some incoming requests based on my "ignore" list, but all others will be logged.
Here's some pseudo code that may be on the right track but not really.
def list = ["CustomerRQ", "CustomerRS"]
println(list.contains("<CustomerRQ field='a'>"))
I'm not sure, but I think a closure will work in this case, but learning the groovy ropes here. Maybe a regexp will work as well. But the importance is to search in the incoming string (indexOf, exists...) across all of my exclusions list.

A quick solution:
shouldIgnore = list.inject(false) { bool, val -> bool || line.contains(val) }
Whether or not this is the best idea depends on information we don't have; it may be better to do something more-XMLy rather than checking against a string.

Related

Is there a way to pass a parameter to google bigquery to be used in their "IN" function

I'm currently writing an app that accesses google bigquery via their "#google-cloud/bigquery": "^2.0.6" library. In one of my queries I have a where clause where i need to pass a list of ids. If I use UNNEST like in their example and pass an array of strings, it works fine.
https://cloud.google.com/bigquery/docs/parameterized-queries
However, I have found that UNNEST can be really slow and just want to use IN on its own and pass in a string list of ids. No matter what format of string list I send, the query returns null results. I think this is because of the way they convert parameters in order to avoid sql injection. I have to use a parameter because I, myself want to avoid SQL injection attacks on my app. If i pass just one id it works fine, but if i pass a list it blows up so I figure it has something to do with formatting, but I know my format is correct in terms of what IN would normally expect i.e. IN ('', '')
Has anyone been able to just pass a param to IN and have it work? i.e. IN (#idParam)?
We declare params like this at the beginning of the script:
DECLARE var_country_ids ARRAY<INT64> DEFAULT [1,2,3];
and use like this:
WHERE if(var_country_ids is not null,p.country_id IN UNNEST(var_country_ids),true) AND ...
as you see we let NULL and array notation as well. We don't see issues with speed.

AEM Query builder exclude a folder in search

I need to create a query where the params are like:
queryParams.put("path", "/content/myFolder");
queryParams.put("1_property", "myProperty");
queryParams.put("1_property.operation", "exists");
queryParams.put("p.limit", "-1");
But, I need to exclude a certain path inside this blanket folder , say: "/content/myFolder/wrongFolder" and search in all other folders (whose number keeps on varying)
Is there a way to do so ? I didn't find it exactly online.
I also tried the unequals operation as the parent path is being saved in a JCR property, but still no luck. I actually need unlike to avoid all occurrences of the path. But there is no such thing:
path=/main/path/to/search/in
group.1_property=cq:parentPath
group.1_property.operation=unequals
group.1_property.value=/path/to/be/avoided
group.2_property=myProperty
group.2_property.operation=exists
group.p.or=true
p.limit=-1
This is an old question but the reason you got more results later lies in the way in which you have constructed your query. The correct way to write a query like this would be something like:
path=/main/path/where
property=myProperty
property.operation=exists
property.value=true
group.p.or=true
group.p.not=true
group.1_path=/main/path/where/first/you/donot/want/to/search
group.2_path=/main/path/where/second/you/donot/want/to/search
p.limit=-1
A couple of notes: your group.p.or in your last comment would have applied to all of your groups because they weren't delineated by a group number. If you want an OR to be applied to a specific group (but not all groups), you would use:
path=/main/path/where
group.1_property=myProperty
group.1_property.operation=exists
group.1_property.value=true
2_group.p.or=true
2_group.p.not=true
2_group.3_path=/main/path/where/first/you/donot/want/to/search
2_group.4_path=/main/path/where/second/you/donot/want/to/search
Also, the numbers themselves don't matter - they don't have to be sequential, as long as property predicate numbers aren't reused, which will cause an exception to be thrown when the QB tries to parse it. But for readability and general convention, they're usually presented that way.
I presume that your example was just thrown together for this question, but obviously your "do not search" paths would have to be children of the main path you want to search or including them in the query would be superfluous, the query would not be searching them anyway otherwise.
AEM Query Builder Documentation for 6.3
Hope this helps someone in the future.
Using QueryBuilder you can execute:
map.put("group.p.not",true)
map.put("group.1_path","/first/path/where/you/donot/want/to/search")
map.put("group.2_path","/second/path/where/you/donot/want/to/search")
Also I've checked PredicateGroup's class API and they provide a setNegated method. I've never used it myself, but I think you can negate a group and combine it into a common predicate with the path you are searching on like:
final PredicateGroup doNotSearchGroup = new PredicateGroup();
doNotSearchGroup.setNegated(true);
doNotSearchGroup.add(new Predicate("path").set("path", "/path/where/you/donot/want/to/search"));
final PredicateGroup combinedPredicate = new PredicateGroup();
combinedPredicate.add(new Predicate("path").set("path", "/path/where/you/want/to/search"));
combinedPredicate.add(doNotSearchGroup);
final Query query = queryBuilder.createQuery(combinedPredicate);
Here is the query to specify operator on given specific group id.
path=/content/course/
type=cq:Page
p.limit=-1
1_property=jcr:content/event
group.1_group.1_group.daterange.lowerBound=2019-12-26T13:39:19.358Z
group.1_group.1_group.daterange.property=jcr:content/xyz
group.1_group.2_group.daterange.upperBound=2019-12-26T13:39:19.358Z
group.1_group.2_group.daterange.property=jcr:content/abc
group.1_group.3_group.relativedaterange.property=jcr:content/courseStartDate
group.1_group.3_group.relativedaterange.lowerBound=0
group.1_group.2_group.p.not=true
group.1_group.1_group.p.not=true

Best practice to pass query conditions in ajax request

I'm writing a REST api in node js that will execute a sql query and send the results;
in the request I need to send the WHERE conditions; ex:
GET 127.0.0.1:5007/users //gets the list of users
GET 127.0.0.1:5007/users
id = 1 //gets the user with id 1
Right now the conditions are passed from the client to the rest api in the request's headers.
In the API I'm using sequelize, an ORM that needs to receive WHERE conditions in a particular form (an object); ex: having the condition:
(x=1 AND (y=2 OR z=3)) OR (x=3 AND y=1)
this needs to be formatted as a nested object:
-- x=1
-- AND -| -- y=2
| -- OR ----|
| -- z=3
-- OR -|
|
| -- x=3
-- AND -|
-- y=1
so the object would be:
Sequelize.or (
Sequelize.and (
{x=1},
Sequelize.or(
{y=2},
{z=3}
)
),
Sequelize.and (
{x=3},
{y=1}
)
)
Now I'm trying to pass a simple string (like "(x=1 AND (y=2 OR z=3)) OR (x=3 AND y=1)"), but then I will need a function on the server that can convert the string in the needed object (this method in my opinion has the advantage that the developer writing the client, can pass the where conditions in a simple way, like using sql, and this method is also indipendent from the used ORM, with no need to change the client if we need to change the server or use a different ORM);
The function to read and convert the conditions' string into an object is giving me headache (I'm trying to write one without success, so if you have some examples about how to do something like this...)
What I would like to get is a route capable of executing almost any kind of sql query and give the results:
now I have a different route for everything:
127.0.0.1:5007/users //to get all users
127.0.0.1:5007/users/1 //to get a single user
127.0.0.1:5007/lastusers //to get user registered in the last month
and so on for the other tables i need to query (one route for every kind of request I need in the client);
instead I would like to have only one route, something like:
127.0.0.1:5007/request
(when calling this route I will pass the table name and the conditions' string)
Do you think this solution would be a good solution or you generally use other ways to handle this kind of things?
Do you have any idea on how to write a function to convert the conditions' string into the desired object?
Any suggestion would be appreciated ;)
I would strongly advise you not to expose any part of your database model to your clients. Doing so means you can't change anything you expose without the risk of breaking the clients. One suggestion as far as what you've supplied is that you can and should use query parameters to cut down on the number of endpoints you've got.
GET /users //to get all users
GET /users?registeredInPastDays=30 //to get user registered in the last month
GET /users/1 //to get a single user
Obviously "registeredInPastDays" should be renamed to something less clumsy .. it's just an example.
As far as the conditions string, there ought to be plenty of parsers available online. The grammar looks very straightforward.
IMHO the main disadvantage of your solution is that you are creating just another API for quering data. Why create sthm from scratch if it is already created? You should use existing mature query API and focus on your business logic rather then inventing sthm new.
For example, you can take query syntax from Odata. Many people have been developing that standard for a long time. They have already considered different use cases and obstacles for query API.
Resources are located with a URI. You can use or mix three ways to address them:
Hierarchically with a sequence of path segments:
/users/john/posts/4711
Non hierarchically with query parameters:
/users/john/posts?minVotes=10&minViews=1000&tags=java
With matrix parameters which affect only one path segment:
/users;country=ukraine/posts
This is normally sufficient enough but it has limitations like the maximum length. In your case a problem is that you can't easily describe and and or conjunctions with query parameters. But you can use a custom or standard query syntax. For instance if you want to find all cars or vehicles from Ford except the Capri with a price between $10000 and $20000 Google uses the search parameter
q=cars+OR+vehicles+%22ford%22+-capri+%2410000..%2420000
(the %22 is a escaped ", the %24 a escaped $).
If this does not work for your case and you want to pass data outside of the URI the format is just a matter of your taste. Adding a custom header like X-Filter may be a valid approach. I would tend to use a POST. Although you just want to query data this is still RESTful if you treat your request as the creation of a search result resource:
POST /search HTTP/1.1
your query-data
Your server should return the newly created resource in the Location header:
HTTP/1.1 201 Created
Location: /search/3
The result can still be cached and you can bookmark it or send the link. The downside is that you need an additional POST.

Referencing external doc in CouchDB view

I am scraping an 90K record database using JSON-RPC and I am trying to put in some basic error checking. I want to start by scraping the database twice using two different settings and adding a prefix to the second scrape. This way I can check to ensure that the two settings are not producing different records (due to dropped updates, etc). I wanted to implement the comparison using a view which compares each document from the first scrape with it's twin produced by the second scrape and then emit the names of records with a difference between them.
However, I cannot quite figure out how to pull in another doc in the view, everything I have read only discusses external docs using the emit() function, which is too late to permit me to compare it. In the example below, the lookup() function would grab the referenced document.
Is this just not possible?
function(doc) {
if(doc._id.slice(0,1)!=='$' && doc._id.slice(0,1)!== "_"){
var otherDoc = lookup('$test" + doc._id);
if(otherDoc){
var keys = doc.value.keys();
var same = true;
keys.forEach(function(key) {
if ((key.slice(0,1) !== '_') && (key.slice(0,1) !=='$') && (key!=='expires')) {
if (!Object.equal(otherDoc[key], doc[key])) {
same = false;
}
}
});
if(!same){
emit(doc._id, 1);
}
}
}
}
Context
You are correct that this is not possible in CouchDB. The whole point of the map function is that it must be idempotent, otherwise you lose all the other nice benefits of a pre-calculated index.
This is why you cannot access external resources in the map function, whether they be other records or the clock. Any time you run a map you must always get the same result if you put the same record into it. Since there are no relationships between records in CouchDB, you cannot promise that this is possible.
Solution
However, you can still achieve your end goal, just be different means. Some possibilities...
Assuming there is some meaningful numeric value in each doc, you could use a view to take the sum of all those values and group them by which import you did ({key: <batch id>, value: <meaningful number>}). Then compare the two numbers in your client or the browser to see if they match.
A brute force approach would be to use a view to pair the docs that should match. Each doc is on a different row, but they're grouped by a common field. Then iterate through the entire index comparing the pairs. This would certainly be the quickest to code and doesn't depend on your application or data.
Implement a validation function to enforce a schema on your data. Just be warned that this will reduce your write throughput since each written record will be piped out of Erlang and into the JS engine. Also, this is only applicable if you're worried about properly formed records instead of their precise content, which might not be the case.
Instead of your different batch jobs creating different docs, have them place them into the same doc. The structure might look like this: { "_id": "something meaningful", "batch_one": { ..data.. }, "batch_two": { ..data.. } } Then your validation function could compare them or you could create a view that indexes all the docs that don't match. All depends on where in your pipeline you want to do the error checking and correction.
Personally I like the last option better, but only if you don't plan to use the database as is in production. Ie., you wouldn't want to carry around all that extra data in each record.
Hope that helps.
Cheers.

Share data among rules using a Map

This should be simple, but I am still lost.
There is a very similar post here: How to share data between Drools rules in a map?
but it doesn't fix my problem:
I have a set of rules and, before launching them, I insert a Map<String, Object> as a fact.
In these rules I use the map to write some conclusions like:
when
$map : Map();
something ocurrs;
then
$map.put("conclusion1", 100);
Now I would like to use these intermediate conclusions in other rules, something like:
when
$map : Map(this["conclusion1"] > 50)
then
do something cool;
The problem is that when I execute the rules it is like the second rule doesn't see the conclusions of the first one, and it does not fire.
I have tried putting a break point and analysing the working memory, and, in fact, the Map would contain the conclusion1, 100 after the first rule is fired.
I have also tried by making an update($map) in the conclusion, but that would trigger an infinite loop.
Any idea of why this wouldn't work, or any alternative solution to my problem?
Thanks !
When you modify a fact you need to notify the engine you are doing so. One of the ways of doing that is by using modify(). E.g.:
when
$map : Map();
something ocurrs;
then
modify( $map ) {
put("conclusion1", 100)
}
end

Resources