SPARQL CONSTRUCT trying to BIND yes/no values from conditional sub query - subquery

Continued on from another question here...
I have a(n excerpt from a) construct query below that is successfully pulling records as desired.
CONSTRUCT {
?publication fb:type ?type;
fb:publicationLabel ?publicationLabel;
fb:publicationType ?publicationTypeLabel;
fb:publicationLink ?publicationLink;
}
WHERE {
?publication a bibo:Document .
?publication rdfs:Label ?publicationLabel .
?publication vitro:mostSpecificType ?publicationType .
?publicationType rdfs:Label ?publicationTypeLabel .
?publication obo:ARG_2000028 ?vcard .
?vcard vcard:hasURL ?urllink .
?urllink vcard:url ?publicationLink
}
The above query (trimmed down a bit) currently works fine. I’m now trying to add the following variable: fb:linkInternalExists
To this variable, I want to bind the output of a conditional subquery that looks for a value (we’ll say “internal.url” for this example) within all the possible ?publicationLink values for a specific ?publication.
So the RDF output with the desired addition could return something like the following:
<rdf:Description rdf:about="https://abcd.fgh/individual/publication12345">
<fb:publicationLabel>example record 1</fb:publicationLabel>
<fb:publicationType>journal</fb:publicationType>
<fb:publicationLink>http://external.url/bcde</fb:publicationLink>
<fb:publicationLink>http://external.url/abcd</fb:publicationLink>
<fb:linkInternalExists>No</fb:linkInternalExists>
</rdf:Description>
<rdf:Description rdf:about="https://abcd.fgh/individual/publication23456">
<fb:publicationLabel>example record 2</fb:publicationLabel>
<fb:publicationType>conference paper</fb:publicationType>
<fb:publicationLink>http://external.url/2345</fb:publicationLink>
<fb:publicationLink>http://external.url/1234</fb:publicationLink>
<fb:publicationLink>http://internal.url/1234</fb:publicationLink>
<fb:linkInternalExists>Yes</fb:linkInternalExists>
</rdf:Description>
My attempts at adding the required subquery to the above, and binding its output to fb:linkInternalExists, have been unsuccessful. So my question is: what would the modified query look like?
Regards

You don't actually need a subquery for this. All you need is an OPTIONAL pattern combined with a BIND expression.
The optional pattern should specifically look to find an internal link, like so:
OPTIONAL {
?vcard vcard:hasURL ?internal .
?internal vcard:url ?internalLink .
FILTER(CONTAINS(STR(?internalLink), "internal.url"))
}
or more concisely:
OPTIONAL {
?vcard vcard:hasURL/vcard:url ?internalLink .
FILTER(CONTAINS(STR(?internalLink), "internal.url"))
}
This clause will bind a value to ?internalLink if such a link exists, and leave it unbound otherwise. To then convert that to the output form you want, you can add the following conditional BIND-clause:
BIND (IF(BOUND(?internalLink), "Yes", "No") as ?internalLinkExists)
And then of course finally add the following to your CONSTRUCT-clause:
?publication fb:linkInternalExists ?internalLinkExists .

Upon trying Jeen Broekstra's approach, the query timed out, but it led me to trying other ways to isolate for the internalLink.
I tried the following instead, pulling both the publicationLink and the internalLink variables from distinct UNIONs.
{
?publication a bibo:Document.
?publication obo:ARG_2000028 ?vcard.
?vcard vcard:hasURL ?urllink.
?urllink vcard:url ?publicationLink .
}
UNION {
?publication a bibo:Document .
?publication obo:ARG_2000028 ?vcard .
?vcard vcard:hasURL/vcard:url ?internalLink .
FILTER(CONTAINS(STR(?internalLink), "internal.url"))
}
BIND (IF(BOUND(?internalLink), "Yes", "No") as ?internalLinkExists)
This successfully returned values for ?internalLink, and then the BIND added the Yes/No variable. Job done!

Related

AQL: Bind parameter on operator

Is there a way to have bind parameter on operator ("<", "<=" etc ...) ? I'm working on a Foxx service.
Example :
const operator = '<'
const res = query`
FOR v IN myCollection
FILTER v.value ${operator} ${maxValue}
`
I can do it with db._query :
const operator = '<'
const res = db._query(`
FOR v IN myCollection
FILTER v.value ${operator} @maxValue`,
{ maxValue: 100 })
Normal bind parameters (with one @) can only be used for the values null, true, false, numbers, strings, arrays and objects.
Collection bind parameters (with two @@) can be used where collection names are specified.
Passing an operator via bind parameters is not possible in AQL, as it could likely change the meaning of a query, or render it totally invalid.
Consider the following example:
FOR v IN myCollection
FILTER v.value @operator @maxValue
This query does not even parse, regardless of what values are passed in the bind parameters. And this is a good thing, because otherwise one may pass something like @operator: "abc", @maxValue: ">=", which would mean the query can be parsed fine without bind parameters, but would produce a parse error with bind parameters injected.
So the easiest solution here is to inject the comparison operator into the query via template string substitution, though of course you need to make sure the requested comparison operator is in a whitelist of allowed operators.
But you would need to do this even with bind parameters, as otherwise people could just send @operator: "!=" or @operator: "NOT IN" or other operators which you either don't expect or that can make your query more expensive.
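A minimal sketch of that whitelist check, assuming only the four ordering operators should be accepted. The names ALLOWED_OPERATORS and buildFilterQuery are made up for illustration and are not part of ArangoDB or Foxx; the RETURN v at the end is also an assumption, since the question's query was trimmed.

```javascript
// Hypothetical allowlist: adjust to whatever operators the service should accept.
const ALLOWED_OPERATORS = new Set(['<', '<=', '>', '>=']);

// Hypothetical helper: returns the query text plus its bind parameters.
function buildFilterQuery(operator, maxValue) {
  // Reject anything that is not explicitly allowed before it reaches the query text.
  if (!ALLOWED_OPERATORS.has(operator)) {
    throw new Error(`Unsupported comparison operator: ${operator}`);
  }
  return {
    // Only the vetted operator is spliced into the string;
    // the value stays a regular bind parameter.
    query: `FOR v IN myCollection FILTER v.value ${operator} @maxValue RETURN v`,
    bindVars: { maxValue }
  };
}

const { query, bindVars } = buildFilterQuery('<', 100);
// In a Foxx service this could then be run as: db._query(query, bindVars)
```

The value still travels as a bind parameter, so only the operator needs vetting.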

SPARQL DBpedia - Retrieve category information in any language by using labels

I have a problem, which I explain on following example:
I want to retrieve all information in any language on a category. I must use ?category as a label and the language labels en, as they are inputs in my program.
The query looks like this, but when I change the language I don't receive any information on the category. I know the problem lies in the dcterms:subject, because ?category returns http://dbpedia.org/resource/Category:Countries_in_Europe (see first example below).
For example to search for a category label in german you have to use http://de.dbpedia.org/resource/Kategorie:Staat_in_Europa (see second example below).
prefix dcterms: <http://purl.org/dc/terms/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?objectLabel WHERE {
?subject dcterms:subject ?category ; rdfs:label ?objectLabel .
?category rdfs:label "Countries in Europe"@en .
FILTER (LANG(?objectLabel)='en')
}
Equivalent query in different language that doesn't work as example:
prefix dcterms: <http://purl.org/dc/terms/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?objectLabel WHERE {
?subject dcterms:subject ?category ; rdfs:label ?objectLabel .
?category rdfs:label "Staat in Europa"@de .
FILTER (LANG(?objectLabel)='de')
}
Is there a similar or different way / method to solve the problem? Thanks in advance for any help.

elasticsearch nest groovy script where value doesn't exist

I'm trying to make a script that can modify my score. So I made this:
if (!(doc['score_mod'].empty)) {
_score * doc['score_mod'].value
}
but now I have a type called web_page that doesn't have the score_mod value, and it's being generated via https://github.com/codelibs/elasticsearch-river-web . So I can't manually put the value in when it's being generated.
Is there a way that I could have a static score for the web_page or have the groovy script check if that value exists?
The current code fails for the web_pages results, but for the ones with a score_mod value it works just fine
You should be able to use the elvis operator and the ?. shortcut operator like so:
_score * (doc['score_mod']?.value ?: 1)
So if doc['score_mod'] is null, or value is null (or zero, or empty) it will default to 1 (and multiply that by _score)
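To make that defaulting rule concrete, here it is written out in JavaScript purely as an illustration. This is not Groovy and not something Elasticsearch runs; applyScoreMod is a made-up name, and JavaScript's || is only a rough stand-in for the elvis operator.

```javascript
// Hypothetical helper mirroring the rule described above:
// a missing, null, zero or empty modifier falls back to 1.
function applyScoreMod(score, scoreMod) {
  const factor = scoreMod || 1; // falsy values default to 1
  return score * factor;
}

console.log(applyScoreMod(2, 3));         // modifier present: 2 * 3
console.log(applyScoreMod(2, undefined)); // modifier missing: 2 * 1
console.log(applyScoreMod(2, 0));         // zero also defaults: 2 * 1
```

Note the last case: a score_mod of 0 leaves the score unchanged rather than zeroing it out, which may or may not be what you want.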

SPARQL how to deal with different cased queries?

I am still a bit new to SPARQL. I have set up a DBpedia endpoint for our company. I have no idea what the end user will be querying and, since DBpedia is case sensitive, I pass both title case & uppercase versions for subjects vs something like a person; e.g. "Computer_programming" vs "Alicia_Keys". Rather than pass in 2 separate queries, what is the most efficient way to achieve this? I've tried the IN operator (from this question) but I seem to be failing somewhere.
select ?label ?abstract where {
IN (<http://dbpedia.org/resource/alicia_keys>, <http://dbpedia.org/resource/Alicia_Keys>) rdfs:label ?label;
dbpedia-owl:abstract ?abstract.
}
LIMIT 1"""
since DBpedia is case sensitive I pass both title case & uppercase
versions for subjects vs something like a person; e.g.
"Computer_programming" vs "Alicia_Keys". Rather than pass in 2 separate
queries what is the most efficient way to achieve this?
URIs should be viewed as opaque. While DBpedia generally has some nice structure, so that you can get lucky by concatenating http://dbpedia.org/resource/ and some string with _ replacing spaces, that's really not a very robust way to do something. A better idea is to note that the string you're getting is probably the same as a label of some resource, modulo variations in case. Given that, the best idea would be to look for something with the same label, modulo case. E.g.,
select ?resource where {
values ?input { "AliCIA KeYS" }
?resource rdfs:label ?label .
filter ( ucase(str(?label)) = ucase(?input) )
}
That's actually going to be pretty slow, though, because you'll have to find every resource and do some string processing on its label. It's an OK approach in principle, though.
What can be done to make it better? Well, if you know what kind of thing you're looking for, that will help a lot. E.g., you could restrict the query to Persons:
select distinct ?resource where {
values ?input { "AliCIA KeYS" }
?resource rdf:type dbpedia-owl:Person ;
rdfs:label ?label .
filter ( ucase(str(?label)) = ucase(?input) )
}
That's an improvement, but it's still not all that fast. It still, at least conceptually, has to touch each Person and examine their name. Some SPARQL endpoints support text indexing, and that's probably what you need if you want to do this efficiently.
The best option, of course, would be to simply ask your users for a little bit more information, and to normalize the data in advance. If your user provides "AliCIA KEyS", then you can do the normalization to "Alicia Keys"@en, and then do something like:
select distinct ?resource where {
values ?input { "Alicia Keys"@en }
?resource rdfs:label ?input .
}
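The normalization step itself happens on the client before the query is sent. A rough sketch in JavaScript, assuming simple word-by-word capitalization is good enough for your inputs (toLabelCase is a made-up helper name):

```javascript
// Hypothetical helper: capitalize the first letter of each word, lowercase the rest.
function toLabelCase(input) {
  return input
    .trim()
    .split(/\s+/) // split on runs of whitespace
    .map(word => word.charAt(0).toUpperCase() + word.slice(1).toLowerCase())
    .join(' ');
}

console.log(toLabelCase('AliCIA KEyS')); // "Alicia Keys"
```

This will not match labels that are not title-cased (for example, a label like "Computer programming"), so treat it as a starting point rather than a complete rule.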

Custom MongoDB search query

In my database I have documents which all contain the property foo. For each value of foo I have a function that either returns true or false. How can I query for all the documents for which the value of foo makes the function return true?
If you need to check if your string field's value is one of several, you need the $in modifier.
db.collection.find( { field : { $in : array } } );
It works fast and uses index (if possible).
If your field is an array and you pass a string, use this syntax.
db.collection.find({array_field : string_value});
It will check every element in the array and, if any of them matches your string, it will return the document.
You could use $where.
Example:
db.myCollection.find( { $where: "this.a > 3" });
db.myCollection.find( "this.a > 3" );
db.myCollection.find( { $where: function() { return this.a > 3;}});
Note, this is run in JavaScript. This means two things.
You can put arbitrary JavaScript into the $where expression (the function form).
It'll be significantly slower than regular queries.
It really depends on what the function is and how you are using it. Is the function constant for any given record? Is it even a function you can evaluate on the database server? ...
In the extreme, if you need to check this value often, you might, for example, create a field that exists only when f(foo) is true and then create a sparse index on that field.
$where may well be the solution you are looking for, but depending on the access patterns there may be a better solution.
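The precomputed-field idea could look roughly like this. The predicate f, the field name fooMatches and the helper withMatchFlag are all placeholders for illustration; the flag has to be recomputed whenever foo changes.

```javascript
// Stand-in predicate; replace with the real function of foo.
function f(foo) {
  return foo > 3;
}

// Hypothetical helper: returns a copy of the document that carries
// fooMatches only when f(foo) is true, so a sparse index stays small.
function withMatchFlag(doc) {
  const copy = Object.assign({}, doc);
  if (f(copy.foo)) {
    copy.fooMatches = true;
  } else {
    delete copy.fooMatches; // drop any stale flag
  }
  return copy;
}

// In the mongo shell, the flag could then be indexed and queried along these lines:
//   db.myCollection.createIndex({ fooMatches: 1 }, { sparse: true });
//   db.myCollection.find({ fooMatches: true });
```

This moves the cost of evaluating the function from query time to write time, which only pays off if the function is deterministic for a given value of foo.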
