Saving non-tempogeospatial data in GeoMesa - accumulo

Given that I use Geomesa to store tempogeospatial data in Geomesa, I also want to store non-tempogeospatial data. What Geomesa API do you recommend for this: Geomesa Native API or Geomesa DataStore API? Even better (I think) can I access native Accumulo API from Geomesa to store my data without geospatial and temporal indices?

The native API is just a simplified wrapper around the data store API, that hides some of the complexity of working with geotools. If the native API covers your use case, then it may be easier to work with. Otherwise, use the data store API.
GeoMesa support schemas that do not have a date or geometry; however the project's primary goal is spatial indexing, and it is probably not the best solution for non-spatial data. Additionally, some features may not work without a geometry. The native API, in particular, requires both a geometry and date attribute.

A compulsory POINT field is required. If the field is not present in data store it directly in accumulo.

Related

is redisJSON better than plain redis when keeping data for boardgame session data?

Just loaded up a redis server for my backend with ioredis.
I'm learning that if i want to store data in json spec, i gotta use the redisJSON module instead. Since hashes are only string typed and they are flat. However, if im only storing one object per user instance, containing less than 10 fields that are typed string/num or array.. is it better to just use without redisJSON? On one hand, redisJSON can let me query an object on one query. On the other, i can just store multiple datatypes and query between those sets/hash with a consistent naming convention.
Does anyone know whats the better usage or pitfalls with either approach?
the backend serves a websocket for a multiplayer boardgame.
The answer is it depends and it requires several trade-offs to be made for each project
Performance: RedisJSON uses a tree structure for storing all elements in a document.
Comparing to a string: the advantage is that updating sub-elements of a document will be faster than manipulating a string containing a serialised JSON object. But retrieving (reassembling) and writing the entire document will be more expensive compared to Strings. Read more here.
Comparing to Hash: when manipulating a flat document (1 level deep), RedisJSON and HSET performance are comparable.
Maintainability: using several native data types in Redis to represent your object can be really performing, but the code will be more complex to maintain. There can be additional migration/refactoring work when the structure of the document is altered.
Querying: RediSearch has support for indexing and querying the content RedisJSON documents. This is, of course, if your use case requires secondary indexing and querying documents other than with their key. You can still build your own secondary indexing with Redis data structures, but this is also a trade-off in maintainability
disclaimer: I work at Redis, creator and maintainer of RediSearch and RedisJSON

Are the new fields on Processes and Tasks in Viewflow 1.6.0 for library users, or for internal use only

Viewflow 1.6.0 introduces new fields ("data" is a JSON field, and "artifact" support for a generic foreign key). They are present on both Processes and Tasks.
Are these intended to be available to library users, or are they Viewflow internal-use-only? I did not see anything in the docs or the github issues list to clarify the matter, so a pointer would be appreciated if I missed it.
Yep, it's for library users, that allows using proxy models instead of real tables for keeping process-only data
Data field is the JSON. So it could be used with jsonstore field - https://github.com/viewflow/jsonstore that makes JSON data exposed as a real Django field. So it could be used with ModelForms as usual
Ex: https://github.com/viewflow/viewflow/blob/master/demo/helloworld/models.py#L6
Articact allow to link process and your data models, without creating a separate table for that.
All of those allow avoiding joins to build all tasks from different flows for a user.

Hazelcast Portable serialization

I want to use Portable serialization for objects stored in IMap to achieve:
fast indexing during insertion (without deserializing objects and
reflection)
class evolution (versioning)
Is it possible to store my classes without implementing Portableinterface?
Is it possible to store 3rd party classes like Date or BigDecimal (or with nested structure) which can not implement Portable interface, while still being indexable?
You can achieve fast indexing using Portable, yes. You'll also see benefits when you're querying on non-indexed fields since there'll be no full deserialization. VersionedPortable support versioning as well but
You must implement Portable interface
For types that doesn't supported by portable, you need to convert the data to a supported format, For date Long for example. And you need to code serialization/deserialization for each property & handle versioning yourself.
Portable is backward compatible only for read. If you update the data from an app who has a previous version, then you'll lost the new field updates done previously by an app has higher version of the Portable object.
So depends on your exact requirements, you need to chose the correct serialization format.
If versioning is not so important or you can handle it manually, but query performance is, then yes Portable make sense. But if you're planning to use versioning heavily, I would suggest using a backward/forward compatible serialization format like Google Protocol Buffers.
You can check this example to get an idea: https://github.com/gokhanoner/data-versioning-protobuf

Can CouchDB do this?

I evaluating CouchDB & I'm wondering whether it's possible to achieve the following functionality.
I'm planning to develop a web application and the app should allow a 'parent' table and derivatives of this table. The parent table will contains all the fields (master table) and the user will selectively choose fields, which should be saved as separate tables.
My queries are as follows:
Is it possible to save different versions of the same table using CouchDB?
Is there an alternative to creating child tables (and clutter the database)?
I'm new to NoSQL databases and am evaluating CouchDB because it supports JSON out of the box and this format seems to fit the application very well.
If there are alternatives to NOT save the derivatives as separate tables, the better will the application be. Any ideas how I could achieve this?
Thanks in advance.
CouchDB is a document oriented database which means you cannot talk in terms of tables. There are only documents. The _rev (or revision ID) describes a version of a document.
In CouchDB, there are 2 ways to achieve relationships.
Use separate documents
Use an embedded array
If you do not prefer to clutter your database, you can choose to use option (2) by using an embedded array.
This gives you the ability to have cascade delete functionality as well for free.

Blazegraph Tinkerpop 3 Indexing

I am trying to learn about Blazegraph. At the moment I am puzzled how I can optimise simple lookups.
Suppose all my vertices have a property id, which is unique. This property is set by the user. Is there any way to speed up finding a vertex of a particular id while still sticking to the Tinkerpop APIs?
Is the search API defined here the only way?
My previous experience is in TitanDB and in Titan's case it's possible to define an index which the Tinkerpop APIs integrate with flawlessly. Is there any way to achieve the same results in Blazegraph without using the Search API?
Whether a mid-traversal V() uses an index or not, depends on a)
whether suitable index exists and b) if the particular graph system
provider implemented this functionality.
Gremlin (Tinkerpop) does not specify how to set indexes although the documentation presents things like the following
graph.createIndex("username",Vertex.class)
But may be reserved for the ThinkerGraph implementation, as a matter of fact it says
Each graph system will have different mechanism by which indices and
schemas are defined. TinkerPop3 does not require any conformance in
this area. In TinkerGraph, the only definitions are around indices.
With other graph systems, property value types, indices, edge labels,
etc. may be required to be defined a priori to adding data to the
graph.
There is an example for Neo4J
TinkerPop3 does not provide method interfaces for defining
schemas/indices for the underlying graph system. Thus, in order to
create indices, it is important to call the Neo4j API directly.
But the code is very specific for that plugin
graph.cypher("CREATE INDEX ON :person(name)")
Note that for BlazeGraph the search uses a built in full-text index

Resources