Apache OFBiz supposedly integrates with Apache Cassandra. But does it support Cassandra's eventual consistency model?
If yes, can anyone point me to some documentation or other written material explaining how?
If not, does OFBiz integrate with and support any other database that offers eventual consistency?
Thanks in advance,
Jakob
Although the integration of OFBiz with a Cassandra database has been mentioned on a couple of occasions on the OFBiz mailing lists, I am not aware of any OFBiz setup that works properly without a relational database.
Even though there are JDBC drivers for Cassandra, I doubt Cassandra can provide all the relational database features that OFBiz requires.
Integrating OFBiz with Cassandra or any other big data tool is achievable, and it is done by pairing the relational database with another ad-hoc database for specific purposes.
Unfortunately, I am not aware of any effort to implement OFBiz on a database that supports the eventual consistency model.
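To make the concern concrete, here is a minimal sketch of what "eventual consistency" means for a reader. It is a toy in-memory model, not OFBiz or Cassandra code: the `Cluster` class, the `acks` parameter and the order keys are all hypothetical names made up for illustration. A write is acknowledged by only some replicas, so a read that hits a different replica can still see the old value until replication catches up.

```python
class Cluster:
    """Toy model of N replicas with asynchronous replication (illustrative only)."""

    def __init__(self, n=3):
        self.replicas = [{} for _ in range(n)]  # each maps key -> (timestamp, value)
        self.pending = []                       # replication messages not yet delivered
        self.clock = 0

    def write(self, key, value, acks=1):
        """Apply the write on `acks` replicas now; queue it for the rest."""
        self.clock += 1
        record = (self.clock, value)
        for index, replica in enumerate(self.replicas):
            if index < acks:
                replica[key] = record
            else:
                self.pending.append((index, key, record))

    def read(self, key, replica_indexes):
        """Return the newest value seen among the replicas that were asked."""
        records = [self.replicas[i].get(key) for i in replica_indexes]
        records = [r for r in records if r is not None]
        if not records:
            return None
        return max(records, key=lambda r: r[0])[1]

    def deliver_pending(self):
        """Simulate replication catching up (last write wins by timestamp)."""
        for index, key, record in self.pending:
            current = self.replicas[index].get(key)
            if current is None or record[0] > current[0]:
                self.replicas[index][key] = record
        self.pending.clear()


if __name__ == "__main__":
    cluster = Cluster(n=3)
    cluster.write("order-1", "CREATED", acks=3)
    cluster.write("order-1", "PAID", acks=1)
    print(cluster.read("order-1", [2]))  # a replica that has not seen the update yet
    cluster.deliver_pending()
    print(cluster.read("order-1", [2]))  # after replication has caught up
```

An application that assumes every read reflects the last committed write, as a relational transaction would guarantee, has to be designed differently to tolerate the first read above.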
Related
How does Alibaba Table Store differ from Apache Cassandra, and how efficient is it by comparison? I understand both are NoSQL databases. Can anyone please explain where and when Alibaba Table Store is preferable to Apache Cassandra?
You can think of Alibaba Cloud Table Store as a counterpart to Apache Cassandra, because Table Store meets all the requirements that Cassandra does.
As for the benefits of Table Store compared to Cassandra, you do not need to worry about the following when you use Table Store:
Scalability
Multi-datacenter replication
Distributed
MapReduce support
Fault-tolerant
That said, Alibaba Cloud may not be using Cassandra on the backend; there is no mention of that.
In every scenario where Cassandra is used, you can replace it with Table Store. But then again, I have not worked extensively with applications involving Apache Cassandra.
If you read the sample code for filtering, you will see the differences from Cassandra. You will need to use a different data model in Table Store.
I have a use case where I have to analyze real-time data using Apache Spark, but I am still unsure which data store to choose for my application. The analysis mostly includes aggregation, KPI-based identity analysis, and machine learning tools to predict trends. Cassandra has good support, and large tech companies are already using it in production. But after some research I found that Druid is faster than Cassandra and is good for OLAP queries, although its results are inconsistent for queries like Count Distinct.
Any help with this would be appreciated. Thanks.
As your use case is to analyze real-time data, I suggest you use Druid rather than Apache Cassandra. With Apache Cassandra, because of its asynchronous masterless replication, a real-time analysis could miss the most recently updated data. Druid, on the other hand, is designed for real-time analysis.
Druid Details: http://druid.io/druid.html
Apache Cassandra Details: https://en.wikipedia.org/wiki/Apache_Cassandra
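On the Count Distinct point raised in the question: as far as I know, Druid answers distinct counts approximately by default, using HyperLogLog-style sketches, which is why the numbers can differ slightly from an exact count. The snippet below is only a rough illustration of that trade-off. It uses a simpler k-minimum-values estimator, not Druid's actual algorithm, and the function names and the `k` parameter are my own assumptions.

```python
import hashlib
import heapq


def _hash01(value: str) -> float:
    """Map a value to a pseudo-uniform number in [0, 1)."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def approx_count_distinct(values, k=1024):
    """Estimate the number of distinct values while keeping only k hashes."""
    heap = []        # max-heap (negated) holding the k smallest hashes seen so far
    members = set()  # the same hashes, for duplicate detection
    for value in values:
        h = _hash01(value)
        if h in members:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # New hash is smaller than the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        return len(heap)  # fewer than k distinct values seen: the count is exact
    kth_smallest = -heap[0]
    return int((k - 1) / kth_smallest)


if __name__ == "__main__":
    events = [f"user-{i % 20000}" for i in range(100000)]
    print("exact:", len(set(events)))
    print("approx:", approx_count_distinct(events))
```

The estimate uses a fixed amount of memory regardless of how many distinct values arrive, at the price of a small relative error. If exact counts matter for a KPI, that is something to check in the store's documentation before committing to it.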
We need to choose between Hazelcast and Cassandra as a distributed data store. I have worked with Cassandra but not with Hazelcast, and would like a comparative analysis of features such as:
Replication
Scalability
Availability
Data Distribution
Performance of reads/writes
Consistency
I would appreciate some help in making the right choice.
The following page, and the documents linked from it, might help with your decision: https://hazelcast.com/use-cases/nosql/apache-cassandra-replacement/
https://db-engines.com/en/system/Cassandra%3BHazelcast
Can some experts give succinct answers on the differences between Presto and Impala from these perspectives?
Fundamental architecture design
SQL compliance
Real-world latency
Any SPOF or fault-tolerance functionality
Structured and unstructured data use scenario performance
Apache Impala is a query engine for HDFS/Hive systems only.
PrestoDB, as well as the community version Trino, is on the other hand a generic query engine that supports HDFS as just one of many choices. There is a long list of connectors available, and Hive/HDFS is just one of them. This also means that you can query different data sources in the same system, at the same time.
I'm currently starting a project that uses Apache Cassandra, so I'm interested in accessing my Cassandra database from Java. For that, I'm using Hector. However, I have some doubts about the differences between access via Hector and via Cassandra JDBC (specifically this: https://code.google.com/a/apache-extras.org/p/cassandra-jdbc/).
I believe the following (although I'm not sure if I'm right):
One difference could be that they are APIs at different levels (I consider Hector a higher-level API than Cassandra JDBC).
Cassandra JDBC uses CQL for accessing and modifying the database, while Hector doesn't use CQL (it only uses its own methods).
I'd be thankful if someone could tell me whether I'm right or wrong in the lines above, and point out further differences between the two (Hector and Cassandra JDBC).
Thanks in advance!
The official Cassandra Java Driver (https://github.com/datastax/java-driver) is probably the best (IMHO, the only) choice for a new project, for several reasons:
New features
All other Cassandra clients (Hector, Astyanax, etc.) are based on the legacy Thrift RPC protocol. The RPC "one response per request" model has severe limitations; for example, it doesn't allow processing several requests at the same time on a single connection, or streaming large ResultSets.
So DataStax developed a new protocol that doesn't have the RPC limitations. The Thrift API won't be getting new features; it's only kept for backward compatibility. In contrast, the Java Driver is actively developed to incorporate the new features of Cassandra 2.0, like conditional updates, batching prepared statements, etc. An overview of the new features is here: http://www.datastax.com/dev/blog/cql-in-cassandra-2-0
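The "several requests at the same time on a single connection" point can be sketched roughly as follows. This is a toy model of the idea, not the driver's API or the real wire format: the native protocol tags each request with a stream id so that responses can come back in any order, and everything else here (`MultiplexedConnection`, `send`, the fake server and its delays) is a hypothetical name invented for illustration.

```python
import asyncio
import itertools


class MultiplexedConnection:
    """Toy model of one connection carrying many in-flight requests."""

    def __init__(self):
        self._stream_ids = itertools.count()
        self._pending = {}   # stream id -> Future waiting for its response
        self._tasks = set()  # keep references so the fake server tasks are not collected

    def send(self, query, delay):
        """Send a request without waiting for earlier ones to finish."""
        stream_id = next(self._stream_ids)
        future = asyncio.get_running_loop().create_future()
        self._pending[stream_id] = future
        task = asyncio.ensure_future(self._fake_server(stream_id, query, delay))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return future

    async def _fake_server(self, stream_id, query, delay):
        # Stand-in for the database: answers after `delay` seconds.
        await asyncio.sleep(delay)
        self._on_response(stream_id, f"result of {query}")

    def _on_response(self, stream_id, payload):
        # The stream id in the response tells us which request it answers.
        self._pending.pop(stream_id).set_result(payload)


async def demo():
    connection = MultiplexedConnection()
    completion_order = []

    async def run(query, delay):
        result = await connection.send(query, delay)
        completion_order.append(query)
        return result

    # Both requests are in flight at once; the fast one is not stuck behind the slow one.
    results = await asyncio.gather(run("slow", 0.05), run("fast", 0.01))
    return results, completion_order


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With a strict one-response-per-request model, the second query would have to wait for the first to complete, or the client would need a second connection.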
Convenience
In the early Cassandra days (0.7), our company used an in-house low-level Thrift client. Later on we used Hector, Pelops and Astyanax in various projects. I can say that the clients based on the Java Driver look the simplest and cleanest to me.
Performance
We have done some performance testing of the Cassandra Java Driver against other clients. In most scenarios the performance is roughly the same. However, there are certain situations where the Cassandra Java Driver significantly outperforms other clients due to its asynchronous nature.
Btw, there are a couple of related questions with excellent answers:
Advantages of using cql over thrift
Cassandra Client Java API's
EDIT: When I wrote this, I wasn't aware that Achilles (https://github.com/doanduyhai/Achilles), mentioned in another answer, has a CQL implementation that works via the Java Driver. For the sake of completeness, I must say that Achilles' DAO on top of CQL might be (or might one day become) a viable alternative to plain CQL via the Java Driver.
#mol
Why do you restrict yourself to Hector and cassandra-jdbc if you're starting a new project?
There are many other interesting choices:
Astyanax as Martin mentioned (Thrift & CQL3)
FireBrand (Thrift via Hector)
Achilles I've just developed (CQL3 & Cassandra 2.0 via Java driver core)
Java Driver Core for plain CQL3
Hector is indeed a higher-level API. Internally it will use Cassandra's Thrift API to execute its functions. It will not convert them to equivalent CQL calls. But its API also provides access to CQL. In this case it will pass the CQL (via Thrift) to Cassandra's APIs for CQL.
CQL in Cassandra is a SQL-like language that works via the Cassandra APIs. So it does not provide any capability beyond what the APIs already offer, but it does make Cassandra easier to use at times. If you are considering Hector, I would also look at Astyanax, which is a newer take on a high-level Java API for Cassandra.
Since you are starting a new project, it is best to start with CQL via the native Java driver:
http://www.datastax.com/documentation/developer/java-driver/1.0/webhelp/index.html#common/drivers/introduction/introArchOverview_c.html
Per DataStax, it is 10-15% faster than the Thrift APIs, as it uses the binary protocol.