Import data from DBPedia into GraphDB - dbpedia

I want to use a SPARQL CONSTRUCT query to retrieve data from DBPedia into a local GraphDB instance. The query should capture as many music-related relations and as much music-related data as possible. I have tried running CONSTRUCT queries in the GraphDB Workbench, but I am not sure how to go about it.
In the online tutorials for GraphDB, they always import data using a file or an online resource, and I could not find any example where they get data directly in the database by using a construct query.
Any advice regarding this would be very appreciated. Thanks for taking the time to help.

GraphDB supports importing data already transformed into RDF data format. The easiest way to import data from an external endpoint like DBPedia is to use SPARQL federation. Here is a sample query that takes data from a remote endpoint and imports it to the currently selected GraphDB repository:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
INSERT {
    ?s ?p ?o2
}
WHERE {
    # Execute the query against DBPedia's endpoint
    SERVICE <http://dbpedia.org/sparql> {
        SELECT ?s ?p ?o2
        {
            # Select all triples for Madonna
            ?s ?p ?o
            FILTER (?s = <http://dbpedia.org/resource/Madonna_(entertainer)>)
            # Hacky function to rewrite all Literals of type rdf:langStrings without a language tag
            BIND (
                IF (
                    (isLiteral(?o) && datatype(?o) = rdf:langString && lang(?o) = ""),
                    (STRDT(STR(?o), xsd:string)),
                    ?o
                )
                AS ?o2
            )
        }
    }
}
Unfortunately, DBPedia and the underlying database engine are notorious for not strictly complying with SPARQL 1.1 and RDF 1.1 specifications. The service returns RDF literals of type rdf:langString without a proper language tag:
...
<result>
<binding name="s"><uri>http://dbpedia.org/resource/Madonna_(entertainer)</uri></binding>
<binding name="p"><uri>http://dbpedia.org/property/d</uri></binding>
<binding name="o"><literal datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#langString">Q1744</literal></binding>
</result>
...
The only way to overcome this is to add an extra BIND expression, as in the query above, which rewrites these literals on the fly.
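To run the same import for several resources, the update can be generated and sent from a script. The following is a minimal Python sketch, not taken from the GraphDB documentation. It assumes a local GraphDB at http://localhost:7200, a hypothetical repository ID "music", and that SPARQL updates are accepted on the RDF4J-style /repositories/<id>/statements endpoint as a form-encoded "update" parameter. The helper names are invented for illustration.

```python
import urllib.parse
import urllib.request

# Update template mirroring the query above; {iri} is filled in per resource.
UPDATE_TEMPLATE = """\
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
INSERT {{ ?s ?p ?o2 }}
WHERE {{
  SERVICE <http://dbpedia.org/sparql> {{
    SELECT ?s ?p ?o2 {{
      ?s ?p ?o
      FILTER (?s = <{iri}>)
      BIND (IF((isLiteral(?o) && datatype(?o) = rdf:langString && lang(?o) = ""),
               STRDT(STR(?o), xsd:string), ?o) AS ?o2)
    }}
  }}
}}
"""


def build_import_update(resource_iri: str) -> str:
    """Return the federated INSERT for one DBPedia resource IRI."""
    # Reject characters that are not allowed inside <...>, so the IRI
    # cannot break out of the FILTER and alter the update.
    if any(ch in resource_iri for ch in '<>"{}|^`\\ \n'):
        raise ValueError(f"not a valid IRI: {resource_iri!r}")
    return UPDATE_TEMPLATE.format(iri=resource_iri)


def build_update_request(update: str,
                         base_url: str = "http://localhost:7200",  # assumed default
                         repository: str = "music"):               # hypothetical ID
    """Build (but do not send) the HTTP POST carrying the SPARQL update."""
    url = f"{base_url}/repositories/{urllib.parse.quote(repository)}/statements"
    body = urllib.parse.urlencode({"update": update}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_update_request(
    build_import_update("http://dbpedia.org/resource/Madonna_(entertainer)")
)
# urllib.request.urlopen(request)  # uncomment to run against a live GraphDB
```

Building the request separately from sending it makes it easy to inspect the generated update before anything is written to the repository.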

Related

Does jOOQ support putting names on SQL statements?

Is there a way to "tag" or put names on SQL statements in jOOQ so when I look at the Performance Insights of AWS RDS, I can see something more meaningful than the first 500 chars of the statement?
For example, Performance Insights shows that this query is taking a toll in my DB:
select "my_schema"."custs"."id", "my_schema"."custs"."other_id", "my_schema"."custs"."cid_id", "my_schema"."custs"."valid_since", "my_schema"."custs"."valid_until", "my_schema"."custs"."address", "my_schema"."custs"."address_id_1", "my_schema"."pets"."id", "my_schema"."pets"."cst_id", "my_schema"."pets"."tag", "my_schema"."pets"."name", "my_schema"."pets"."description", "my_schema"."pets"."owner", "my_schema"."pets"."created_on", "my_schema"."pets"."created_by", "my_schema"."pets"."modified_on",
But because the statement is truncated, it's not straightforward to tell which jOOQ code generated it.
I would prefer to see something like this:
Customer - Pet Lookup
or:
(Customer - Pet Lookup) select "my_schema"."custs"."id", "my_schema"."custs"."other_id", "my_schema"."custs"."cid_id", "my_schema"."custs"."valid_since", "my_schema"."custs"."valid_until", "my_schema"."custs"."address", "my_schema"."custs"."address_id_1", "my_schema"."pets"."id", "my_schema"."pets"."cst_id", "my_schema"."pets"."tag", "my_schema"."pets"."name", "my_schema"."pets"."description", "my_schema"."pets"."owner", "my_schema"."pets"."created_on", "my_schema"."pets"."created_by", "my_schema"."pets"."modified_on",
There are at least two out of the box approaches to what you want to achieve, both completely vendor agnostic:
1. Use "hints"
jOOQ supports Oracle style hints using the hint() method, at least for SELECT statements. Write something like:
ctx.select(T.A, T.B)
.hint("/* my tag */")
.from(T)
.where(...)
The limitation here is the location of the hint, which is going to be right after the SELECT keyword. Not sure if this will work for your RDBMS.
2. Use an ExecuteListener
You can supply your Configuration with an ExecuteListener, which patches your generated SQL strings with whatever you need to be added:
class MyListener extends DefaultExecuteListener {

    // renderEnd() is called after the SQL string is generated, but
    // before the prepared statement is created, let alone executed
    @Override
    public void renderEnd(ExecuteContext ctx) {
        if (mechanismToDetermineIfTaggingIsNeeded())
            ctx.sql("/* My tag */ " + ctx.sql());
    }
}
Using regular expressions, you can place that tag at any specific location within your SQL string.
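As a rough illustration of that last point, the string rewriting itself is language-independent, so it is sketched here in Python rather than Java purely for brevity. The function name tag_sql and the choice to place the comment right after the leading keyword are assumptions made for this example. In jOOQ, the equivalent logic would sit inside renderEnd() and hand its result to ctx.sql(...).

```python
import re

# Matches the first keyword of the statement (SELECT, INSERT, ...),
# including any leading whitespace.
_FIRST_KEYWORD = re.compile(r"^\s*([A-Za-z]+)")


def tag_sql(sql: str, tag: str, after_keyword: bool = False) -> str:
    """Return sql with a /* tag */ comment at the start or after the first keyword."""
    # Stop the tag text from closing the comment early.
    safe = tag.replace("*/", "* /")
    comment = f"/* {safe} */"
    if after_keyword:
        # Keep the keyword first, then the comment: "select /* tag */ ..."
        return _FIRST_KEYWORD.sub(lambda m: f"{m.group(0)} {comment}", sql, count=1)
    return f"{comment} {sql}"
```

Whether a leading comment or one placed after the keyword survives all the way to the monitoring tool depends on the driver and database, so it is worth checking both positions.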

How to add collections in transformations when writing(creating) a Document in MarkLogic

I wrote a transformation in XQuery which unquotes an XML string and inserts an element with its content. This works fine.
I also need to add the document to a collection that depends on the root element of that content. I can't do this for new documents because xdmp:document-add-collections() does not work there. How do I add a collection to new documents in a transformation?
Here is my server-side XQuery code:
xquery version "1.0-ml";
module namespace transform = "http://marklogic.com/rest-api/transform/smtextdocuments";
import module namespace mem = "http://xqdev.com/in-mem-update" at '/MarkLogic/appservices/utils/in-mem-update.xqy';
declare function transform(
$context as map:map,
$params as map:map,
$content as document-node()
) as document-node()
{
let $uri := base-uri($content)
let $doccont := $content/smtextdocuments/documentcontent
let $newcont := xdmp:unquote($doccont)
let $contname := node-name($newcont/*)
let $result := if ( exists($content/smtextdocuments/content))
then mem:node-replace($content/smtextdocuments/content, <content>11{$newcont}</content>)
else mem:node-insert-after($doccont, <content>{$newcont}</content>)
let $log := xdmp:log($content)
return (
$result,
xdmp:document-add-collections($uri, fn:string($contname)),
xdmp:document-remove-collections($uri, "raw")
)
};
The script is run through the create method of the Java API (4.0.4) via a ServerTransform parameter. According to the documentation, the transformation script runs before the document is stored in the database.
It's a new document; I need to transform the content and then set the collection.
I can see the document after the create, and the content is available; only the collection is missing. I could try the xdmp:document-insert method, but is it correct to write the document while the create is running?
The transform mechanism of the Java API / REST API takes responsibility for the document write. At present, there's no way for the transform to supply collections to the writer. That would be a reasonable request for enhancement.
The transform shouldn't attempt to write the document, because the writer would also attempt to write the same document.
One alternative would be to transform the document in Java before writing it and specify the collection as part of the write request.
Another alternative would be to rewrite the transform as a resource service extension, implement the write within the resource service extension, and modify the Java client to send the document to the resource service extension.
Depending on the model, a final alternative might be to use a range index on an element within the document to collect documents into sets instead of using a collection on the document.
Hoping that helps,
What do you mean by "new documents"? Is the document already inserted into the MarkLogic database at the time you are adjusting the collections of it? If not, you may want to modify your return to ($result, xdmp:document-insert($uri, $result, xdmp:default-permissions(), fn:string($contname)) ) for that case.
Otherwise, can you edit your question to share the error or problem more specifically you are facing?
It is a pity that REST transforms do not allow this the way MLCP transforms do. Until that changes, you have the options outlined by ehennum, or you can consider deferring the addition of collections to a pre- or post-commit trigger. It adds some overhead, but it sometimes makes perfect sense to do this in a trigger, since that ensures it is always enforced, and a trigger is also a good place for content validation, audit logging, and similar tasks.
HTH!

ActivePivot retrieve CSV output by sending MDX query through python?

How can I retrieve a CSV output file by sending an MDX query to ActivePivot from Python, instead of using XMLA or Web Services?
There is a POST endpoint to get CSV content from an MDX query, available since ActivePivot 5.4.
By calling http://<host>/<app>/pivot/rest/v3/cube/export/mdx/download with the following JSON payload:
{
"jsonMdxQuery":
{
"mdx" : "<your MDX query>",
"context" : {}
},
"separator" : ";"
}
you will receive the content of the answer as CSV with fields separated by ;.
However, note that the form of your MDX will affect the form of the CSV. For good results, I suggest writing MDX queries in this form:
SELECT
// Measures as columns
{[Measures].[contributors.COUNT], ...} ON COLUMNS
// Members on rows
[Currency].[Currency].[Currency].Members ON ROWS
FROM [cube]
It will generate a CSV as below:
[Measures].[Measures];[Currency].[Currency].[Currency];VALUE
contributors.COUNT;EUR;170
pnl.SUM;EUR;-8413.812452550741
...
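From Python, the call described above might be sketched as follows, using only the standard library. The base URL http://localhost:9090/myapp and the function names are placeholders, and authentication is left out because it depends on how your server is configured. The endpoint path and payload fields are the ones given above.

```python
import csv
import io
import json
import urllib.request

EXPORT_PATH = "/pivot/rest/v3/cube/export/mdx/download"


def build_export_request(base_url: str, mdx: str, separator: str = ";"):
    """Build the POST request asking ActivePivot for a CSV export of an MDX query."""
    payload = {
        "jsonMdxQuery": {"mdx": mdx, "context": {}},
        "separator": separator,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + EXPORT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_csv(text: str, separator: str = ";"):
    """Split the returned CSV text into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=separator))


def fetch_csv_rows(base_url: str, mdx: str, separator: str = ";"):
    """Send the request and return the parsed rows; needs a reachable server."""
    request = build_export_request(base_url, mdx, separator)
    with urllib.request.urlopen(request) as response:
        # Assumes the server answers in UTF-8.
        return parse_csv(response.read().decode("utf-8"), separator)


# Example call (placeholder URL, requires a running ActivePivot instance):
# rows = fetch_csv_rows("http://localhost:9090/myapp", "<your MDX query>")
```

Passing the same separator to both the request and the parser keeps the two in step if you switch from ";" to another delimiter.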
Cheers
You can use the ActivePivot web services or RESTful services: write a Python client and fire your MDX query.
With web services: http://host:port/webapp/webservices
Look for IQueriesService; its executeMDX method should help.
or
with RESTful services: http://host:port/webapp/pivot/rest/v3/cube/query?_wadl
Look for
<resource path="mdx">
<method name="POST">
<request>
<representation mediaType="application/json"/>
</request>
<response>
<representation mediaType="application/json"/>
</response>
</method>
</resource>
You'll get the query result; loop over the retrieved records and build your own CSV.
Another option (still with RESTful services) is to use the following endpoint
http://host:port/webapp/pivot/rest/v3/cube/export?_wadl
that allows you to export a query result in CSV directly.

Query in Override of node.tpl

I have overridden a node.tpl and need some results from the database, using a query generated by Views.
Here is the code I used:
<?php $res = db_query("SELECT node.nid AS nid, node.title AS node_title FROM node node LEFT JOIN content_field_is_popular node_data_field_is_popular ON node.vid
= node_data_field_is_popular.vid WHERE (node.type in ('article_thisweekend')) AND (UPPER(node_data_field_is_popular.field_is_popular_value)
= UPPER('yes'));");
foreach($res as $reco){
print ($reco->nid);
}
?>
But I am not getting any results.
What am I missing?
Thanks
Matt V. has good advice in that you should try to separate the view templates from the sql query logic.
For this specific example, though, you need to use db_fetch_object, since $res just contains the database query result resource.
Instead of
foreach($res as $reco){
print ($reco->nid);
}
Do
while ($reco = db_fetch_object($res)){
print ($reco->nid);
}
It's generally best to avoid putting queries directly in your template files, so that logic and presentation stay separate.
Instead, use a module to generate the content you need and pass that along to the theme layer. In this case, if you're already using the Views module to generate the query, let Views run it for you and pass off the data to a page or block display.
Otherwise, to debug the query, try running the query independent of the code, through something like phpMyAdmin or "drush sqlq".

How can you remove the XML Schema datatype from a SPARQL query result?

I'm running a SPARQL query on a file that contains
<User rdf:about="#RJ">
<hasName rdf:datatype="http://www.w3.org/2001/XMLSchema#string">RJ</hasName>
</User>
I want to return only the name, i.e. 'RJ', but when I enter my query
SELECT ?name
FROM <example.com>
WHERE {
assign:RJ assign:hasName ?name .
}
where assign is the correct namespace, I get this:
"RJ" ^^<http://www.w3.org/2001/XMLSchema#string>
Does anyone have any advice for a SPARQL noob on how to remove the XML Schema type?
Thanks in advance.
Whether you can do this depends on the SPARQL implementation you are using. Under SPARQL 1.0 this isn't possible. SPARQL 1.1, however, became a W3C Recommendation in March 2013 and is now supported by most implementations, and with it you can use project expressions as follows:
SELECT (STR(?name) AS ?StringName)
FROM <example.com>
WHERE {
assign:RJ assign:hasName ?name
}
Basically a project expression allows you to use any valid SPARQL expression which you could use elsewhere to calculate a new value based on the variables which are previously bound.

Resources