How to query a geomesa-accumulo feature using the command line - CQL

I ingested data into GeoMesa Accumulo using SFTs and converters. The data was ingested successfully and I can visualise it using the GeoServer plugin. I want to filter feature data from the command line, but I am not able to find any commands to do so. Please correct me if I am wrong, but I want to query the feature data set much as I would in an RDBMS.

The GeoMesa command-line tools support querying via the 'export' command. This command takes a CQL filter (the same query language that GeoServer supports).
Check out these links for more about the GeoMesa export command (an example invocation follows below):
http://www.geomesa.org/documentation/user/accumulo/commandline_tools.html#export
http://www.geomesa.org/documentation/user/accumulo/examples.html#exporting-features
For more about CQL, see the GeoTools (http://docs.geotools.org/latest/userguide/library/cql/cql.html) and GeoServer documentation (http://docs.geoserver.org/stable/en/user/tutorials/cql/cql_tutorial.html, and http://docs.geoserver.org/latest/en/user/filter/ecql_reference.html).
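For example, a filtered export might look something like the following. The connection parameters, catalog table and feature type name are placeholders, and flag names differ slightly between GeoMesa versions, so check geomesa export --help for the exact options:

    # placeholder Accumulo connection parameters and type names
    geomesa export \
        -u myUser -p myPassword \
        -i myInstance -z zoo1,zoo2,zoo3 \
        -c myCatalogTable \
        -f mySimpleFeatureType \
        -q "name = 'bob' AND dtg DURING 2017-01-01T00:00:00Z/2017-01-02T00:00:00Z" \
        -F csv

The value passed to -q is an ordinary (E)CQL filter, so spatial predicates such as BBOX(geom, -120, 30, -110, 40) or INTERSECTS(geom, ...) work there as well.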

Related

How to process large .kryo files for graph data using TinkerPop/Gremlin

I am new to Apache TinkerPop.
I have done some basic things like installing the TinkerPop Gremlin Console, creating a graph .kryo file, loading it in the Gremlin Console and running some basic Gremlin queries. All good so far.
But I wanted to check how we can process .kryo files that are very large, say more than 1000 GB. If I create a single .kryo file, loading it in the console (or from code) does not seem feasible to me.
Is there any way to deal with graph data that is this large?
Basically, I have some graph data stored in an Amazon Neptune DB; I want to take it out, store it in files (e.g. .kryo) and process it later with Gremlin queries. Thanks in advance.
Rather than use Kryo, which is Java-specific, I would recommend using something more language-agnostic such as CSV files. If you are using Amazon Neptune you can use the Neptune Export tool to export your data as CSV files (see the sketch after the links below).
Documentation
Git repo
Cloud Formation template
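As a rough illustration only (the subcommand and option names here are my assumption of the tool's usage; the repo's README is authoritative), a property-graph export to CSV is invoked along these lines:

    # cluster endpoint and output directory are placeholders
    java -jar neptune-export.jar export-pg \
        -e my-cluster.cluster-xxxxxxxx.us-east-1.neptune.amazonaws.com \
        -d /tmp/neptune-export

The resulting CSV files can then be processed in batches, which avoids ever having to materialise a single 1000 GB .kryo file.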

How to track changes in Cassandra table using Java

I am just looking for a way to track changes in a Cassandra table. I don't want to use a trigger. If any changes are made, I will immediately update my data source.
Do you have any idea how to implement this feature using Java?
Also, is it possible to create a plugin for Cassandra? I did not find any good resources on creating a plugin for Cassandra.
Thanks.
I believe that what you are looking for is Change Data Capture (CDC)
You can read more on CDC in Apache Cassandra
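As a minimal sketch of what enabling CDC involves (keyspace, table and paths are placeholders): switch it on in cassandra.yaml and flag the table, after which Cassandra makes finished commit-log segments available in the cdc_raw directory for your own Java consumer to read, e.g. via the CommitLogReader / CommitLogReadHandler classes in org.apache.cassandra.db.commitlog (an internal API that changes between versions):

    # cassandra.yaml
    cdc_enabled: true
    # optionally override where finished segments are placed for consumers
    # cdc_raw_directory: /var/lib/cassandra/cdc_raw

    -- CQL: enable CDC on the table you want to track
    ALTER TABLE my_keyspace.my_table WITH cdc = true;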

How can a SPARQL query be prepared for a Blazegraph triple store using Python?

I am looking for a way in which a query can be prepared and fired at a remote server. I know this is feasible in Stardog and GraphDB using rdf4j in Java, but how can it be done in Python for Blazegraph?
I have tried looking at SPARQLWrapper and rdflib. rdflib supports prepared statements, but I have only seen them used when parsing files, and I haven't found much documentation on using it as a driver the way rdf4j is used. SPARQLWrapper enables querying a remote repository but doesn't have examples of prepared statements, and I need both.
I have looked at this question, SPARQL query on a remote endpoint with RDFLib / Redland, but it seems to be outdated (8 years old).
The requirement is to build a microservice over Blazegraph that executes user-specific input at run time. If prepared statements are not used, string concatenation makes the service vulnerable to injection attacks and leads to boilerplate code.
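For what it's worth, the closest analogue to a prepared statement here that I know of is rdflib's initBindings: keep the query text constant and pass user input as bound terms. A minimal sketch, assuming a default local Blazegraph endpoint and that rdflib's SPARQLStore treats Blazegraph like any other SPARQL 1.1 endpoint (not verified against Blazegraph specifically):

    from rdflib import Graph, URIRef
    from rdflib.plugins.stores.sparqlstore import SPARQLStore

    # assumed Blazegraph endpoint URL; adjust host/namespace to your install
    ENDPOINT = "http://localhost:9999/blazegraph/namespace/kb/sparql"

    # an rdflib Graph backed by the remote SPARQL endpoint
    graph = Graph(store=SPARQLStore(ENDPOINT))

    # the query text stays constant; user input is supplied via initBindings,
    # so untrusted values are never concatenated into the query string
    QUERY = "SELECT ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

    subject = URIRef("http://example.org/resource/123")  # e.g. user-selected resource
    for row in graph.query(QUERY, initBindings={"s": subject}):
        print(row.p, row.o)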

GeoMesa Accumulo CRUD data operations using GeoServer WFS

I have created a GeoMesa Accumulo datastore and can query features using the command line. Now I want to perform data operations using the Open Geospatial Consortium's (OGC) Web Feature Service (WFS) for creating, modifying and exchanging vector-format geographic information. I don't want to create a proxy client or deal with Thrift to operate on the Accumulo storage programmatically. What other techniques are there to insert data, and to read it using filters, against GeoMesa Accumulo storage?
GeoMesa integrates with GeoServer to support such use cases.
Using WFS to read data is a very common use case. To write data to a layer in GeoServer, you'll want to check out WFS Transactions (also called WFS-T). I've used both with GeoMesa and GeoServer successfully.
Check out http://www.geomesa.org/documentation/user/architecture.html#geomesa-and-geoserver for background about GeoMesa and GeoServer. This link provides information about setting up GeoMesa's Accumulo DataStore with GeoServer (http://www.geomesa.org/documentation/user/accumulo/install.html#installing-geomesa-accumulo-in-geoserver). General GeoMesa-GeoServer usage is documented here: http://www.geomesa.org/documentation/user/geoserver.html.
For some quick links to GeoServer's WFS details, I'd suggest reading through (http://docs.geoserver.org/latest/en/user/services/wfs/reference.html) and checking out the demos which come with GeoServer (http://docs.geoserver.org/latest/en/user/configuration/demos/index.html#demos).
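For the read side, a quick Python sketch of hitting GeoServer's WFS with a CQL filter (host, workspace and layer names are placeholders; writes go through a WFS-T transaction XML payload POSTed to the same endpoint):

    import requests

    # assumed GeoServer location and published GeoMesa layer
    WFS_URL = "http://localhost:8080/geoserver/wfs"

    params = {
        "service": "WFS",
        "version": "1.0.0",
        "request": "GetFeature",
        "typeName": "myWorkspace:mySimpleFeatureType",
        "outputFormat": "application/json",
        # cql_filter is a GeoServer vendor parameter
        "cql_filter": "name = 'bob' AND BBOX(geom, -120, 30, -110, 40)",
    }

    resp = requests.get(WFS_URL, params=params)
    resp.raise_for_status()
    for feature in resp.json()["features"]:
        print(feature["id"], feature["properties"])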

Using BigQuery to analyze IIS logs

Is there a preferred way (or example) of loading and analyzing IIS logs (in the W3C Extended Log File Format) using BigQuery? We will also need to auto-partition the data, and we can get log files periodically.
We want to analyze usage of a particular feature, which can be identified by a particular URL pattern, and a conversion funnel of the most popular flows visitors take through the website, to identify where they come in and where they leave. Visitors can be identified by a unique ID in a cookie (stored in the logs), and pages can be linked via the referer (also stored in the logs).
Thanks in advance
It's easy to load CSV-format files into BigQuery. Both CSV and JSON source data are supported.
I am not an expert in IIS, but the quickest way to load flat log data into BigQuery is to start with CSV. The IIS log format is pretty straightforward to work with, but you might want to save a step and export it to CSV. A quick search shows that many people use LogParser (note: I have never used it myself) to convert IIS logs into CSV. Perhaps give this or a similar tool a try.
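For instance (since I have not used LogParser myself, treat the exact switches as something to verify against its documentation), converting IIS W3C extended logs to CSV looks roughly like:

    LogParser.exe "SELECT * INTO site_logs.csv FROM u_ex*.log" -i:IISW3C -o:CSV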
As for "auto-partioning" your BigQuery dataset tables - BigQuery doesn't do this automatically, but it's fairly easy to create a new table for each batch of IIS logs you export.
Depending on the volume of data you are analysing, you should create a new BigQuery table per day or per hour.
Scripting this on the command line is pretty easy with the BigQuery command-line tool: create a new load job with a table name based on each timeslice of log data you have, as in the example after the table names below.
In other words, your BigQuery tables should look something like this:
mydataset.logs_2012_10_29
mydataset.logs_2012_10_30
mydataset.logs_2012_10_31
etc...
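For example, each day's load with the bq tool might look something like this (dataset, bucket and schema file are placeholders, and the schema must match the columns in your CSV export):

    bq load --source_format=CSV --skip_leading_rows=1 \
        mydataset.logs_2012_10_29 \
        gs://my-log-bucket/iis/2012-10-29/*.csv \
        ./iis_log_schema.json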
For more information, make sure you read through the BigQuery documentation for importing data.
