Wikidata SPARQL times out with subquery + labels service + OPTIONAL

While trying to answer the question "How to filter results of wikidata to specific language", I encountered the following problem.
Some countries have more than one capital. This query picks one arbitrary capital per country:
SELECT ?country (sample(?capital) as ?aCapital) WHERE {
?country wdt:P31 wd:Q3624078.
FILTER NOT EXISTS {?country wdt:P31 wd:Q3024240} # not a former country
?country wdt:P36 ?capital.
}
GROUP BY ?country
However, when I add labels and coordinates, the query times out:
SELECT ?country ?countryLabel ?aCapital ?aCapitalLabel ?coords WHERE {
OPTIONAL {?aCapital wdt:P625 ?coords.}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
{
SELECT ?country (sample(?capital) as ?aCapital) WHERE {
?country wdt:P31 wd:Q3624078.
FILTER NOT EXISTS {?country wdt:P31 wd:Q3024240} # not a former country
?country wdt:P36 ?capital.
}
GROUP BY ?country
}
}
ORDER BY ?countryLabel
LIMIT 1000

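Following the comments by @AKSW above: OPTIONAL in SPARQL is a left join, so its position matters. Placed first, it has nothing bound on its left side yet, so ?aCapital wdt:P625 ?coords matches every item that has coordinates before the subquery narrows anything down.
Moving the OPTIONAL after the subquery solves the problem: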
SELECT ?country ?countryLabel ?aCapital ?aCapitalLabel ?coords WHERE {
{
{
SELECT ?country (sample(?capital) as ?aCapital) WHERE {
?country wdt:P31 wd:Q3624078.
FILTER NOT EXISTS {?country wdt:P31 wd:Q3024240} # not a former country
?country wdt:P36 ?capital.
}
GROUP BY ?country
}
OPTIONAL {?aCapital wdt:P625 ?coords.}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
}
ORDER BY ?countryLabel
LIMIT 1000
Please note that this requires adding an additional { + } to keep the syntax correct.
See also: SPARQL Optional query
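
To see why the order matters, here is a toy model of the left-join semantics in plain JavaScript. It is only a sketch: the data is made up for illustration, the helper names (compatible, join, leftJoin) are hypothetical, and it models the bottom-up SPARQL semantics rather than the query plan the Wikidata endpoint actually runs.

```javascript
// Toy model: a solution is a plain object mapping variable names to values.
// coordTriples stands in for all "?aCapital wdt:P625 ?coords" matches
// (three rows here; a very large set on Wikidata). Values are illustrative.
const coordTriples = [
  { aCapital: "Q64", coords: "Point(13.38 52.52)" },
  { aCapital: "Q90", coords: "Point(2.35 48.86)" },
  { aCapital: "Q1", coords: "Point(0 0)" }, // an item that is not a capital
];

// Stand-in for the subquery result: one sampled capital per country.
const subquery = [
  { country: "Q183", aCapital: "Q64" },
  { country: "Q142", aCapital: "Q90" },
  { country: "Q999", aCapital: "Q998" }, // made-up capital without coordinates
];

// Two solutions are compatible when every shared variable has the same value.
const compatible = (a, b) =>
  Object.keys(a).every((k) => !(k in b) || a[k] === b[k]);

// Inner join: keep only compatible pairs.
const join = (left, right) =>
  left.flatMap((l) =>
    right.filter((r) => compatible(l, r)).map((r) => ({ ...l, ...r })));

// OPTIONAL (left join): keep every left row, extended by matches if any exist.
const leftJoin = (left, right) =>
  left.flatMap((l) => {
    const matches = right.filter((r) => compatible(l, r));
    return matches.length ? matches.map((r) => ({ ...l, ...r })) : [l];
  });

// Slow order: OPTIONAL first, evaluated against the single empty solution.
const optionalFirst = leftJoin([{}], coordTriples); // one row per coordinate
const slow = join(optionalFirst, subquery);

// Fast order: subquery first, then OPTIONAL on its few rows.
const fast = leftJoin(subquery, coordTriples);

console.log(optionalFirst.length, slow.length, fast.length);
```

In this model the OPTIONAL-first order materialises one row per coordinate statement before joining, and it also drops the country whose capital has no coordinates, so the two orders are not equivalent in general.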


Nodejs Elasticsearch query default behaviour

Every day I push time-series data to Elasticsearch. My indices are named myindex_*, where * is the current date (an index pattern has been set up for them). After a week I therefore have myindex_2022-06-20, myindex_2022-06-21, ..., myindex_2022-06-27.
Let's assume the indices hold product prices. Each myindex_* then contains documents like these.
myindex_2022-06-26 (many product prices like this):
{
"reference_code": "123456789",
"price": 10.00
},
...
myindex_2022-06-27:
{
"reference_code": "123456789",
"price": 12.00
},
I'm using this query to get the reference code and the corresponding prices. And it works great.
const data = await elasticClient.search({
index: "myindex_2022-06-27",
body: {
query: {
match: {
"reference_code": "123456789"
}
}
}
});
However, I would like a query that, if the index for 2022-06-27 has no data, checks the previous index (2022-06-26), and so on, up to e.g. 10 indices back.
I'm not sure, but it seems to do this when I replace myindex_2022-06-27 with myindex_* (I don't know whether that is the default behaviour).
The issue is that, done this way, I get prices from the other indices, but it seems to return the oldest one. I would like the newest one instead.
How should I proceed?
If you query with an index wildcard, the search returns a list of documents, each of which includes meta fields such as _index and _id.
You can sort by _index in descending order so that Elasticsearch returns the latest document at position [0] of the list.
const data = await elasticClient.search({
index: "myindex_2022-*",
body: {
query: {
match: {
"reference_code": "123456789"
}
},
// newest index first, so the latest document comes back at position [0]
sort: [{ "_index": "desc" }]
}
});
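
If the lookup should go back only a fixed number of days (the "up to 10" from the question) rather than over every myindex_* index, one option is to build the index names explicitly. The helper below is a hypothetical sketch, not part of the Elasticsearch client. It assumes the myindex_YYYY-MM-DD naming from the question and uses UTC dates.

```javascript
// Build the names of the last `days` daily indices, newest first.
// lastIndexNames is a hypothetical helper; the prefix and date format
// follow the index naming described in the question.
function lastIndexNames(prefix, newestDate, days) {
  const names = [];
  for (let i = 0; i < days; i++) {
    // Date.UTC normalises day values <= 0 into the previous month.
    const d = new Date(Date.UTC(
      newestDate.getUTCFullYear(),
      newestDate.getUTCMonth(),
      newestDate.getUTCDate() - i
    ));
    names.push(prefix + d.toISOString().slice(0, 10));
  }
  return names;
}

const indices = lastIndexNames("myindex_", new Date("2022-06-27T00:00:00Z"), 3);
console.log(indices.join(","));
```

The joined string can be passed as the index value of the search call, since the search API accepts a comma-separated list of index names. If some days may have no index at all, the ignore_unavailable search option is probably needed as well, but check that against the client version in use.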

How to filter Subscribers based on array of tags in Loopback

I've two models - subscribers and tags
Sample data:
{
subscribers: [
{
name: "User 1",
tags: ["a","b"]
},
{
name: "User 2",
tags: ["c","d"]
}
]
}
I want to filter subscribers based on their tags.
If I give tags a and b, User 1 should be listed.
If I give tags a and c, both User 1 and User 2 should be listed.
Here is what I tried:
Method 1:
tags is a column in subscribers model with array data type
/subscribers/?filter={"where":{"tags":{"inq":["a","b"]}}} // doesn't work
Method 2:
Created a separate table tags and set subscribers has many tags.
/subscribers/?filter={"where":{"tags":{"inq":["a","b"]}}} // doesn't work
How can I achieve this in Loopback without writing custom methods?
I've Postgresql as the connector
UPDATE
As mentioned in the LoopBack docs, you should use inq, not In:
The inq operator checks whether the value of the specified property matches any of the values provided in an array. The general syntax is:
{where: { property: { inq: [val1, val2, ...]}}}
From this:
/subscribers/?filter={"where":{"tags":{"In":["a","b"]}}}
To this:
/subscribers/?filter={"where":{"tags":{"inq":["a","b"]}}}
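
When the filter is built in code rather than typed by hand, it is safer to serialise and URL-encode it. This is a small sketch: subscribersByTagsUrl is a hypothetical helper, and the /subscribers/ path is the one from the question.

```javascript
// Build the REST URL for an inq filter on the tags property.
// subscribersByTagsUrl is a hypothetical helper name.
function subscribersByTagsUrl(tags) {
  const filter = { where: { tags: { inq: tags } } };
  // Encode the JSON so quotes, braces and brackets survive in the query string.
  return "/subscribers/?filter=" + encodeURIComponent(JSON.stringify(filter));
}

const url = subscribersByTagsUrl(["a", "b"]);
console.log(url);
```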
I finally found a hack using a regular expression. It's not a performant solution, but it works:
{ "where": { "tags": { "regexp": "a|b" } } }
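
For comparison, here is the matching rule from the question ("list a subscriber if they have any of the given tags") written as plain JavaScript over the sample data. It does not use LoopBack and withAnyTag is a hypothetical name. The point is to show what the filter should return and where an unanchored pattern such as a|b is looser than that rule.

```javascript
// Sample data from the question.
const subscribers = [
  { name: "User 1", tags: ["a", "b"] },
  { name: "User 2", tags: ["c", "d"] },
];

// Keep subscribers that have at least one of the wanted tags (exact match).
function withAnyTag(list, wanted) {
  return list.filter((s) => s.tags.some((t) => wanted.includes(t)));
}

console.log(withAnyTag(subscribers, ["a", "b"]).map((s) => s.name));
console.log(withAnyTag(subscribers, ["a", "c"]).map((s) => s.name));

// An unanchored pattern matches substrings, so a longer tag that merely
// contains "a" or "b" matches as well.
console.log(/a|b/.test("banana"));
```

With single-letter tags the difference does not show, but with real tag names the regexp hack may return subscribers whose tags only contain one of the search terms as a substring.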

JSONB Query Sequelize

I have a table in Postgres with the following structure:
ID
Name
Details
Context
CreatedDate
Context is a JSONB field and CreatedDate is a timestamp.
I am saving data in Context like this: {"trade": {"id": 102}, "trader": {"id": 100}}
I am trying to select records by trader id, and this is my query:
this.findAll({
where: {
context: {
$contains: {
trader: [{id: '100'}]
}
}
}
})
I tried nested keys as well, but it yields no results:
this.findAll({
where: {
'context.trader.id': {
$eq: '100'
}
}
})
Can you please suggest how I can select records based on this structure?
Following on from that, how can I combine two conditions, for example by adding CreatedDate to this where clause?
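
One likely reason the first query finds nothing is that JSONB containment compares structure and scalar types: the stored value has trader as an object with a numeric id, while the pattern passes an array with a string id. The sketch below is a simplified JavaScript model of PostgreSQL's @> operator, written only to illustrate that rule. jsonbContains is a hypothetical helper, not a Sequelize or PostgreSQL API, and it ignores the special case where a top-level array contains a bare scalar.

```javascript
// Simplified model of JSONB containment (doc @> pattern).
function jsonbContains(doc, pattern) {
  if (Array.isArray(pattern)) {
    // Every pattern element must be contained in some element of the doc array.
    return Array.isArray(doc) &&
      pattern.every((p) => doc.some((d) => jsonbContains(d, p)));
  }
  if (pattern !== null && typeof pattern === "object") {
    // Every key of the pattern must exist in the doc with a contained value.
    return doc !== null && typeof doc === "object" && !Array.isArray(doc) &&
      Object.keys(pattern).every(
        (k) => k in doc && jsonbContains(doc[k], pattern[k]));
  }
  // Scalars must agree in type and value: the number 100 is not the string "100".
  return doc === pattern;
}

// Value stored in Context, as given in the question.
const stored = { trade: { id: 102 }, trader: { id: 100 } };

console.log(jsonbContains(stored, { trader: [{ id: "100" }] })); // false
console.log(jsonbContains(stored, { trader: { id: "100" } }));   // false
console.log(jsonbContains(stored, { trader: { id: 100 } }));     // true
```

Under this model, a containment pattern that mirrors the stored shape, { trader: { id: 100 } } with a numeric id, is the one to try in the $contains clause. As far as I know, additional keys placed in the same where object (for example a condition on the created-date column) are combined with AND, but verify that against the Sequelize version in use.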

Ember data - return two resources in one request

I want to implement search for an e-shop. The user enters text, and the API returns products AND categories that match the search phrase.
How do I get products and categories in one request?
I'm aware I could do
return Ember.RSVP.hash( {
products: this.store.find("product", {searchTerm: "banana"}),
categories: this.store.find("category", {searchTerm: "banana"})
} );
but isn't there a way to do it in a single request, for better performance?
If you can modify your backend, just create a new search endpoint and query it with this.store.find("searchResult", {searchTerm: "banana"})
where searchResult would be something like
{ searchResult: { products: [ ... ], categories: [ ... ] } }
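
On the client, a combined payload of that shape can be split back into the two lists. This is only a sketch in plain JavaScript: the sample records and the splitSearchResult helper are made up for illustration, and how the lists are then loaded into the store depends on the Ember Data version and serializer in use.

```javascript
// Hypothetical response in the shape suggested above; the records are made up.
const payload = {
  searchResult: {
    products: [{ id: 1, name: "Banana" }],
    categories: [{ id: 7, name: "Fruit" }],
  },
};

// Return both lists, defaulting to empty arrays when a key is missing.
function splitSearchResult(response) {
  const { products = [], categories = [] } = response.searchResult || {};
  return { products, categories };
}

const { products, categories } = splitSearchResult(payload);
console.log(products.length, categories.length);
```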

Data modelling in Cassandra

I am trying out Cassandra and looking at ways to model our data in it. I have described our data store requirements along with my thoughts on how to model in Cassandra. Please let me know whether this makes sense and suggest changes.
I searched the web quite a bit but didn't get a clear idea of how to model multi-valued columns and index them, which is quite a common requirement.
Any help would be greatly appreciated.
Our current data for each record:
{
'id': <some uuid>,
'title': text,
'description': text,
'images': [{'id': id1, 'caption': cap1}, {'id': id2, 'caption': cap2}, ...],
'videos': ['video id1', 'video id2', ...],
'keywords': ['keyword1', 'keyword2', ...],
'updated_at': <timestamp>
}
Queries we would need
Lookup by id
Lookup by images.id
Lookup by keyword
All records where updated_at > <some timestamp>
Our current model
Column Family: Article
id: uuid
title: varchar
description: varchar
images:
videos:
keywords:
updated_at:
updated_date: [eg: ‘2013-05-06:02’]
Column Family: Image-Article Index
{
‘id’ : <image id>,
‘article1 uuid’ : null,
‘article2 uuid’ : null,
...
}
Column Family: Keyword-Article Index
{
‘id’ : <keyword>,
‘article1 uuid’ : null,
‘article2 uuid’ : null,
...
}
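
The two index column families are inverted indexes: the row key is the image id or keyword, and the column names are the ids of the articles that contain it. The following in-memory sketch shows the same idea with JavaScript Maps. It is not Cassandra code, and saveArticle and findByKeyword are hypothetical names.

```javascript
// In-memory stand-ins for the Article and Keyword-Article Index column families.
const articles = new Map();      // article id -> article
const keywordIndex = new Map();  // keyword -> Set of article ids

// Write the article and one index entry per keyword (the "extra writes"
// this model needs on every insert).
function saveArticle(article) {
  articles.set(article.id, article);
  for (const kw of article.keywords) {
    if (!keywordIndex.has(kw)) keywordIndex.set(kw, new Set());
    keywordIndex.get(kw).add(article.id);
  }
}

// Two-step lookup: read the index row, then fetch the articles by id.
function findByKeyword(kw) {
  const ids = keywordIndex.get(kw) || new Set();
  return [...ids].map((id) => articles.get(id));
}

saveArticle({ id: "a1", title: "First", keywords: ["cassandra", "nosql"] });
saveArticle({ id: "a2", title: "Second", keywords: ["nosql"] });
console.log(findByKeyword("nosql").map((a) => a.id));
```

The sketch leaves out one thing a real model has to handle: when an article's keywords or images change, the stale index entries must be removed as well.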
Sample queries:
Lookup by id => straight forward
Lookup by images.id =>
ids = select * from ‘Image-Article Index’ where id=<image id>
select * from Article where id in (ids)
Lookup by keyword =>
ids = select * from ‘Keyword-Article Index’ where id=<keyword>
select * from Article where id in (ids)
All records where updated_at > <some timestamp>
Cassandra doesn’t support range queries unless there is one equality condition on one of the indexed columns.
extract date and hour from given timestamp;
for each date:hour in start to current time
ids = select * from Article where updated_date=date:hour and updated_at > <some timestamp>
select * from Article where id in (ids)
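
The bucket loop above can be sketched as follows. This is plain JavaScript, not Cassandra client code: bucketKey and hourBuckets are hypothetical helpers, buckets are computed in UTC, and the key format follows the updated_date example ('2013-05-06:02').

```javascript
// Format a Date as the "YYYY-MM-DD:HH" bucket key used for updated_date (UTC).
function bucketKey(date) {
  const iso = date.toISOString(); // e.g. 2013-05-06T02:15:00.000Z
  return iso.slice(0, 10) + ":" + iso.slice(11, 13);
}

// List every hourly bucket from the hour containing `start`
// up to and including the hour containing `end`.
function hourBuckets(start, end) {
  const HOUR_MS = 60 * 60 * 1000;
  const buckets = [];
  // Round the start time down to the beginning of its hour.
  let t = Math.floor(start.getTime() / HOUR_MS) * HOUR_MS;
  for (; t <= end.getTime(); t += HOUR_MS) {
    buckets.push(bucketKey(new Date(t)));
  }
  return buckets;
}

const buckets = hourBuckets(
  new Date("2013-05-06T02:15:00Z"),
  new Date("2013-05-06T04:05:00Z")
);
console.log(buckets);
```

Each bucket then becomes the equality condition on updated_date in one query. The number of queries grows with the length of the time range, which is the main cost of this approach.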
