Building Geospatial Index when working with JENA FUSEKI

I would like to use the nearby geospatial function which is described as supported here through JENA FUSEKI - https://jena.apache.org/documentation/query/spatial-query.html
I need to build the geospatial Index for the query to work. The instructions are as follows (taken from above link):
Build the TDB dataset:
java -cp $FUSEKI_HOME/fuseki-server.jar tdb.tdbloader --tdb=assembler_file data_file
using the copy of TDB included with Fuseki. Alternatively, use one of the TDB utilities tdbloader or tdbloader2:
$JENA_HOME/bin/tdbloader --loc=directory data_file
then build the spatial index with the jena.spatialindexer:
java -cp jena-spatial.jar jena.spatialindexer --desc=assembler_file
Assuming I knew which file is the assembler file in my FUSEKI folder (I don't), I search for jena-spatial.jar in my latest jena download. Having found it is not there, I search for it and find a copy of the jar here - https://jar-download.com/?detail_search=g%3A%22org.apache.jena%22+AND+a%3A%22jena-spatial%22&search_type=av&a=jena-spatial&p=1
I try running it, but I get the error "Could not find or load main class jena.spatialindexer". I search for jena.spatialindexer and find a match (I cannot post it here as I am at the link limit for this post).
At this point I am wondering would it be possible to make this just a little bit more complicated? You know, I obviously have all the time in the world to search through google trying to figure out these cryptic clues.
In short, if anyone out there has done this before, please could you point out where I am going wrong?
Kindest regards,
Kris.

Just in case it might help, find my configuration below:
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix : <#> .
@prefix spatial: <http://jena.apache.org/spatial#> .
# TDB
[] ja:loadClass "org.apache.jena.tdb.TDB" .
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB rdfs:subClassOf ja:Model .
# Spatial
[] ja:loadClass "org.apache.jena.query.spatial.SpatialQuery" .
spatial:SpatialDataset rdfs:subClassOf ja:RDFDataset .
#spatial:SpatialIndexSolr rdfs:subClassOf spatial:SpatialIndex .
spatial:SpatialIndexLucene rdfs:subClassOf spatial:SpatialIndex .
## ---------------------------------------------------------------
## This URI must be fixed - it's used to assemble the spatial dataset.
:spatial_dataset rdf:type spatial:SpatialDataset ;
spatial:dataset <#tdb_dataset_readwrite> ;
##spatial:index <#indexSolr> ;
spatial:index <#indexLucene> ;
.
<#tdb_dataset_readwrite> rdf:type tdb:DatasetTDB ;
tdb:location "/myfolder" ;
## # Query timeout on this dataset (milliseconds)
## ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "1000" ] ;
## # Default graph for query is the (read-only) union of all named graphs.
tdb:unionDefaultGraph true ;
.
<#indexLucene> a spatial:SpatialIndexLucene ;
#spatial:directory <file:Lucene> ;
## spatial:directory "mem" ;
spatial:directory <file:/myfolder/spatial> ;
spatial:definition <#definition> ;
.
<#definition> a spatial:EntityDefinition ;
spatial:entityField "uri" ;
spatial:geoField "geo" ;
# custom geo predicates for 1) Latitude/Longitude Format
spatial:hasSpatialPredicatePairs (
[ spatial:latitude :latitude_1 ; spatial:longitude :longitude_1 ]
[ spatial:latitude :latitude_2 ; spatial:longitude :longitude_2 ]
) ;
# custom geo predicates for 2) Well Known Text (WKT) Literal
spatial:hasWKTPredicates (:wkt_1 :wkt_2) ;
# custom SpatialContextFactory for 2) Well Known Text (WKT) Literal
spatial:spatialContextFactory
"org.locationtech.spatial4j.context.jts.JtsSpatialContextFactory"
.
# "com.spatial4j.core.context.jts.JtsSpatialContextFactory"
<#service_tdb1> rdf:type fuseki:Service ;
rdfs:label "TDB Service" ;
fuseki:name "tdb_spatial" ;
fuseki:serviceQuery "query" ;
fuseki:serviceQuery "sparql" ;
fuseki:serviceUpdate "update" ;
fuseki:serviceUpload "upload" ;
fuseki:serviceReadWriteGraphStore "data" ;
# A separate read-only graph store endpoint:
fuseki:serviceReadGraphStore "get" ;
fuseki:dataset :spatial_dataset ;
.
As you can see, I have changed the class name to org.locationtech.spatial4j.context.jts.JtsSpatialContextFactory. In addition, when running Fuseki I have to include another jar file in the classpath, as otherwise I get this warning:
EntityDefinitionAssembler WARN Custom SpatialContextFactory lib is not ready in classpath:com/vividsolutions/jts/geom/CoordinateSequenceFactory
Then my command line looks like:
java -cp "fuseki-server.jar:lib/jts-1.13.jar" org.apache.jena.fuseki.cmd.FusekiCmd -debug
I have downloaded jts-1.13.jar from here
I would also like to thank Kai for pointing me in the right direction.
Note: I still have to fully understand the fields under spatial:EntityDefinition. I will try to edit with more information.
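Regarding the "Could not find or load main class jena.spatialindexer" error from the question: that message means the JVM did not find the class on the classpath it was given, so it is worth checking whether the downloaded jar contains the class at all. A minimal sketch, assuming bash; has_class is a hypothetical helper name, not something that ships with Jena:

```shell
#!/usr/bin/env bash
# has_class (hypothetical helper): reads a jar listing on stdin, one entry per
# line, and succeeds if the given fully qualified class name is among them.
has_class() {
    local entry="${1//.//}.class"   # jena.spatialindexer -> jena/spatialindexer.class
    grep -qxF -- "$entry"
}

# Usage (assumes a JDK for the jar tool; the jar path is an assumption):
#   jar tf jena-spatial.jar | has_class jena.spatialindexer && echo found
```

If the class is not listed, no classpath setting will make that jar work and a different artifact or version is needed.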

Related

How to compare two parts of a line multiple times in multiple files with specific extension?

NOTE BEFORE READING: The following question is described very precisely and that is the reason for the length of a question. If you want to understand the problem, it's better to read the entire thing. Many thanks for all the answers!
I am working on a bash script (.sh file) which will check certain values in every file of a directory. Bash script will be executed in a pre-commit (pre-commit is not a part of the question).
There is a directory that contains multiple .c files in multiple subdirectories. I want to check a part of two lines which are NOT in every .c file but only in some of them. The structure of a file that contains the useful information is as following:
/*
## SYMBOL = some_symbol1
## A2L_TYPE = PARAMETER
.
.
.
#! DEFAULT = some_value1
## END
*/
some_symbol1 = some_value1
/*
## SYMBOL = some_symbol2
## A2L_TYPE = PARAMETER
.
.
.
#! DEFAULT = some_value2
## END
*/
some_symbol2 = some_value2
This kind of structure is automatically generated by another script.
I want to check if some_value1 (in comment) is equal to some_value1 (in variable).
There are hundreds of these variables in a .c file (though not necessarily in every .c file).
The main functionality of a script should be:
Check some_value1 in comment and variable and throw an error if they are not the same. Script has to go through EVERY .c file in a directory (bash is in root) and ALL subdirectories to find previously mentioned structure.
Value of variable can be something as 0.06F, where in comment, there is 0.06 (compare only the numbers)
Value of variable can also be an array: { 0.0F, 0.45F, 0.3F } where in the comment, there is [ 0.0, 0.45, 0.3 ] (without F and difference in braces)
To summarize:
I want to build a check script that compares some_value1 (in comment) and some_value1 (in variable) and throw an error if they don't match
Useful information is not in EVERY .c file but only in some of them (don't know which)
Values after #! DEFAULT is a comment where the value of variable is a number (maybe this is not that important?)
between A2L_TYPE and DEFAULT, there can be desired number of unimportant stuff. (still a comment)
What I tried so far is for loop through every .c file and a nested for loop to read every line in each .c file. What I wanted to implement was a grep command inside for loop to check each line if there is a #! DEFAULT pattern and save it to the variable.
Latest code that I tried:
#!/bin/bash
shopt -s globstar
for d in */**/*.c
do
while IFS="" read -r p || [ -n "$p" ]
do
grep -P "#! DEFAULT" $d
done < $d
done
This is currently not working because it gives an error that certain grep targets are directories
If anyone has any questions, I will try to explain it better.

# search for files with extension ".c"
# execute awk on any matches, using '= ' as field separator
find . -type f -name '*.c' -exec awk -F'=[[:space:]]*' '
# check if first three lines match template
( NR==1 && /^\/\*/ ) ||
( NR==2 && /^## SYMBOL = / ) ||
( NR==3 && /^## A2L_TYPE = PARAMETER/ ) { ok++ }
# template mismatch - skip this file
( NR==4 && ok!=3 ) {
printf "%s : ignored\n", FILENAME
nextfile
}
# store first occurrence of some_value1
# note line number where second occurrence expected
/^#! DEFAULT =/ { v[1]=v1=$2; n=NR+3 }
# test second occurrence
NR==n {
v[2]=v2=$2;
# prune everything except numbers and array delimiters
for (s in v) gsub(/[^0-9.,]/,"",v[s]);
# output result
# match exactly or only number list
printf "%s #(%d,%d) : ", FILENAME,n-3,n
if (v1==v2 || v[1]==v[2])
printf "match (%s)==(%s)\n", v1,v2
else
printf "mismatch (%s)!=(%s)\n", v1,v2
# no need to check rest of this file
# elide to check multiple values per file
nextfile
}
' {} +
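The pruning step in the awk script (dropping everything except digits, dots and commas) is what lets 0.06F match 0.06 and { 0.0F, 0.45F, 0.3F } match [ 0.0, 0.45, 0.3 ]. The same idea in plain bash, as a rough sketch; normalize is a made-up name, and note that this also strips minus signs and exponents, so it only fits values that look like the examples in the question:

```shell
#!/usr/bin/env bash
# normalize (hypothetical helper): keep only digits, dots and commas
normalize() {
    printf '%s\n' "${1//[^0-9.,]/}"
}

normalize '0.06F'                   # prints 0.06
normalize '{ 0.0F, 0.45F, 0.3F }'   # prints 0.0,0.45,0.3
normalize '[ 0.0, 0.45, 0.3 ]'      # prints 0.0,0.45,0.3
```

Two values can then be compared with an ordinary string test on their normalized forms.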

Importing external instances as owl file

I'm loading the ontology of https://users.ugent.be/~hvhaele/stad.gent.mini.ttl and import the four instances of https://users.ugent.be/~hvhaele/nwd.owl into Protégé.
Then, using this shacl code, I should have no violations:
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix nwd: <https://users.ugent.be/~hvhaele/nwd.owl#> .
[ rdf:type owl:Ontology ;
owl:imports <https://users.ugent.be/~hvhaele/nwd.owl>
] .
# Validation item 1: Not an opendefinition.org license type
ex:OpenLicenseShape a sh:NodeShape ;
sh:targetClass dcat:Catalog , dcat:Distribution ;
sh:property [
sh:path dct:license ;
sh:class nwd:OpenLicense ;
sh:severity sh:Warning ;
sh:message "Not an opendefinition.org license type."
] .
But instead, I get three violations, being:
https://data.stad.gent/api/v2/catalog/datasets/api-luftdateninfo-csv http://purl.org/dc/terms/license http://opendatacommons.org/licenses/odbl/
https://data.stad.gent/api/v2/catalog/datasets/api-luftdateninfo-geojson http://purl.org/dc/terms/license http://www.opendatacommons.org/licenses/odbl/1.0/
https://data.stad.gent/api/v2/catalog/datasets/api-luftdateninfo-shp http://purl.org/dc/terms/license http://opendatacommons.org/licenses/odbl/
What am I doing wrong? Any help appreciated!

Strange behavior using read for validate string (bash script)

I am creating a .sh script using bash to validate the API subfolder versions.
The objective is to validate the strings in the APIS_BUILD variable and to find all .proto files in the ./proto folder so they can be compiled into protobuf Go files.
# define subfolder apis to build
APIS_BUILD=(
prototests/v1/
prototests2/v2
testfolder
)
# "testfolder" is an invalid folder
Test cases:
prototestes/v1 # valid
prototestes/v1/cobranca # valid
prototestes/v1/cobrnaca/faturamento # valid
outrapastacomarquivosproto/v1 # valid
prototests # invalid
/prototests # invalid
Then I created this script to validate the APIS_BUILD string array:
#!/usr/bin/env bash
# text color
RED='\033[0;31m' # RED
BLUE='\033[0;34m' # Blue
NC='\033[0m' # No Color
# Underline color
UCyan='\033[4;36m' # Cyan
# define subfolder apis to build
APIS_BUILD=(
prototests/v1
cobrancas/v1
)
DST_DIR="." # define the directory to store the build-in protofiles
SRC_DIR="./proto" # define the proto files folder
# Compile proto file
# $1 = Filename to compile
function compile() {
protoc --go_out=$DST_DIR --proto_path=proto --go_opt=M$1=services \
--go_opt=paths=import --go-grpc_out=. \
$1
}
# Validate api_build's
function validateApiBuilds() {
for t in ${APIS_BUILD[@]}; do
IFS="/"
read -a SUBSTR <<<"$t"
if [ ${#SUBSTR[@]} -lt 2 ]; then
printf "${RED}The API_BUILD value ${UCyan}\"${t}\"${RED} are declare wrong, please declare [api_folder]/[version_folder] (example: prototest/v1)${NC}\n" 1>&2
exit 1
fi
done
}
validateApiBuilds
for filename in $(find $SRC_DIR -name '*.proto'); do
[ -e "$filename" ] || continue
echo $filename
done
But I am getting strange behavior:
If I run the .sh file with the validateApiBuilds function, $filename is always "."
If I run the .sh file without the validateApiBuilds function, $filename is the testservice.proto file
All the variables:
# define subfolder apis to build
APIS_BUILD=(
prototests/v1
cobrancas/v1
)
DST_DIR="." # define the directory to store the build-in protofiles
SRC_DIR="./proto" # define the proto files folder
Bash version:
$ bash --version
GNU bash, version 4.4.19(1)-release (x86_64-pc-linux-gnu)
Note: I have since changed the validateApiBuilds function to use regex validation for the strings in the APIS_BUILD variable, but I would still like to know the reason for this behavior.
edit 2: The make-proto.config file
# define subfolder apis to build
APIS_BUILD=(
prototests/v1
cobrancas/v1
)
DST_DIR="." # define the directory to store the build-in protofiles
SRC_DIR="./proto" # define the proto files folder

Use find better
for filename in $(anything) is always an antipattern -- it splits values on characters in IFS, and then expands each result as a glob. To make find emit completely unambiguous strings, use -print0:
while IFS= read -r -d '' filename; do
[ -e "$filename" ] || continue
echo "$filename"
done < <(find "$SRC_DIR" -name '*.proto' -print0)
Don't change IFS unnecessarily
Change your code to make the assignment to IFS be on the same line as the read, which will make IFS only be modified for that one command.
That is to say, instead of:
IFS=/
read -a SUBSTR <<<"$t"
...you should write:
IFS=/ read -a SUBSTR <<<"$t"
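To see the difference, here is a small sketch (count_parts is just an illustrative name). With the assignment prefixed to read, the split on "/" still happens, but the shell's IFS is untouched afterwards. In the original script IFS stayed set to "/" after the function ran, so the unquoted $(find ...) output was split on slashes, which would explain seeing "." as a filename:

```shell
#!/usr/bin/env bash
# count_parts (illustrative helper): number of "/"-separated parts in $1
count_parts() {
    local parts
    IFS=/ read -r -a parts <<<"$1"   # IFS is changed for this read only
    echo "${#parts[@]}"
}

count_parts prototests/v1   # prints 2
count_parts testfolder      # prints 1
printf '%q\n' "$IFS"        # still the default: $' \t\n'
```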

String behind two variables in directory path laravel 5

I want a "/" between "vandiepen" and "test.txt". Right now Laravel gives an error because the path to the file is not valid:
The file "C:\xampp\htdocs\systeembeheer\storage/download/vandiepentest.txt" does not exist
I tried to put the "/" between the $folder and the $id variables:
$file = storage_path(). "/download/".$folder "/" .$id;
When I do that, Laravel gives an error:
syntax error, unexpected '"/"' (T_CONSTANT_ENCAPSED_STRING)

The problem is that you missed . (concatenation operator) after $folder.
It should be:
$file = storage_path() . "/download/" . $folder. "/" . $id;
Also you can use DIRECTORY_SEPARATOR.
$file = storage_path() . DIRECTORY_SEPARATOR . "download" . DIRECTORY_SEPARATOR . $folder . DIRECTORY_SEPARATOR . $id;

Bash script archiving files according to number

I am currently writing a script that mounts a samba share, rsyncs the data to a local machine and archives it into a directory structure (say /home/archive/). Currently, when new pdfs are added, the archiving is done manually, which seems like an inefficient use of time.
The files have the following structure
ABC140003.pdf
ABC140124.pdf
.
.
ABC144201.pdf
.
ABC146012.pdf
/home/archive/ has several directories: 2010/, 2011/, 2012/, 2013/ etc.
Basically, I need to break up the number to find the correct subdirectory to copy the file. First I extract the number
study_number=`echo $file | sed 's/[^0-9]//g'`
Then the year
year=20`echo $study_number | cut -c 1-2`
All the above pdf files belong in the 2014 subdirectory. Within 2014, and every other year directory, there are the subdirectories 2014/Blue/, 2014/Red/ and 2014/Green/. These correspond to the third digit of the number: Blue (0), Red (4) and Green (6).
I use cases here to find what I have called study type
type_int=`echo $study_number | cut -c 3`
case "$type_int" in
0)
type_string="Blue"
;;
4) type_string="Red"
;;
6) type_string="Green"
;;
*) echo "$date: $file has unknown study type. Do not know where to place it" >> $logfile
continue
;;
esac
I now know the following files go in the following directories
ABC140003.pdf -> /home/archive/2014/Blue/
ABC140124.pdf -> /home/archive/2014/Blue/
.
.
ABC144201.pdf -> /home/archive/2014/Red/
.
ABC146012.pdf -> /home/archive/2014/Green/
I'd be happy if this was the end of the directory structure. However, another layer of subdirectories has been introduced so that no directory has more than 100 pdf files (not my call).
For example /home/archive/2014/Blue/ has the following directories:
140001-0100/ 140101-0200/ 140201-0300/ 140301-0400/ 140401-0500/ 140501-0600/
etc
I now need to come up with some logic such that the following files go to the following directories
ABC140003.pdf -> /home/archive/2014/Blue/140001-0100
ABC140124.pdf -> /home/archive/2014/Blue/140100-0124
.
.
ABC144201.pdf -> /home/archive/2014/Red/144200-4300
.
ABC146012.pdf -> /home/archive/2014/Green/146000-6100
I am stumped on how to logically determine that study ABC146012 should go in 146000-6100 in an elegant manner without resorting to multiple if statements for each of Red/ Blue/ and Green/

Here is a simplified version that needs some work but you get the idea (for a nice final solution, see @glenn jackman's solution):
Declare an associative array for the colors
$ declare -A colors
$ colors[0]=Blue
$ colors[4]=Red
$ colors[6]=Green
Then extract the needed information
$ study_number=$(sed 's/[^0-9]//g' <<< ABC140124.pdf);
$ year=${study_number:0:2};
$ type=${study_number:2:1};
$ color=${colors[$type]};
$ from="${study_number:0:$((${#study_number}-2))}01"
$ to="$((${study_number:0:$((${#study_number}-2))}+1))00"
and that gives:
$ echo /home/archive/$year/$color/$from-$to
/home/archive/14/Blue/140101-140200
(I assumed you wanted your intervals to be consistently numbered 'x01-(x+1)00')
You can create a function to simplify the process
build_dir() {
study_number=$(sed 's/[^0-9]//g' <<< $1);
year=${study_number:0:2};
type=${study_number:2:1};
color=${colors[$type]};
from="${study_number:0:$((${#study_number}-2))}01"
to="$((${study_number:0:$((${#study_number}-2))}+1))00"
echo "/home/archive/$year/$color/$from-$to"
}
It needs a bit more defensive programming, but it can be used like this:
$ build_dir ABC146012.pdf
/home/archive/14/Green/146001-146100

colors=([0]=Blue [4]=Red [6]=Green)
get_destination() {
if [[ $1 =~ ([0-9][0-9])([0-9])([0-9]) ]]; then
printf "/home/archive/20%s/%s/%s%s%d01-%s%d00" \
${BASH_REMATCH[1]} \
${colors[${BASH_REMATCH[2]}]} \
${BASH_REMATCH[1]} \
${BASH_REMATCH[2]} \
${BASH_REMATCH[3]} \
${BASH_REMATCH[2]} \
$(( 1 + ${BASH_REMATCH[3]} ))
fi
}
for file in ABC140003.pdf ABC140124.pdf ABC144201.pdf ABC146012.pdf; do
echo "$file -> $(get_destination $file)"
done
ABC140003.pdf -> /home/archive/2014/Blue/140001-0100
ABC140124.pdf -> /home/archive/2014/Blue/140101-0200
ABC144201.pdf -> /home/archive/2014/Red/144201-4300
ABC146012.pdf -> /home/archive/2014/Green/146001-6100
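One thing to watch with both answers: a number that ends exactly on a hundred (e.g. 140100) lands in the next bucket (140101-0200), although the directory names suggest it belongs in 140001-0100. If that matters, the bucket can be computed arithmetically instead. A sketch under the assumption that study numbers are always six digits (two for the year, one type digit, three sequence digits) and that buckets run from x01 to (x+1)00; bucket_name is a hypothetical name:

```shell
#!/usr/bin/env bash
# bucket_name (hypothetical helper): 140124 -> 140101-0200
bucket_name() {
    local num=$1
    local prefix=${num:0:2} type=${num:2:1}
    local seq=$((10#${num:3:3}))     # force base 10 so "012" is not read as octal
    if (( seq < 1 )); then seq=1; fi # assumption: 000 is filed in the first bucket
    local lower=$(( (seq - 1) / 100 * 100 + 1 ))
    local upper=$(( lower + 99 ))
    printf '%s%s%03d-%s%03d\n' "$prefix" "$type" "$lower" "$type" "$upper"
}

bucket_name 140100   # prints 140001-0100
bucket_name 140124   # prints 140101-0200
bucket_name 146012   # prints 146001-6100
```

How the last bucket of a type (901 to 1000) should be named is not clear from the question, so that case would need checking against the real directories.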
