Use Case Suggestion for Cassandra - cassandra

I am looking for a complete end-to-end use case to practice on Cassandra. Can someone suggest a good one or share examples? I have access to a 6-node cluster.

The only one I know, used with earlier Cassandra versions, is twissandra, a Twitter implementation in Cassandra. I think it's abandoned as a project, but it is still valid for practice.
I personally think that any exercise simulating social network behaviour is a good workout for NoSQL in general.
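As a starting point for such an exercise, here is a minimal in-memory sketch of the fan-out-on-write pattern often used for Twitter-style timelines. It is pure Python and does not talk to Cassandra; the names (SocialStore, follow, post, timeline) are hypothetical and not taken from twissandra. Each dictionary stands in for a table keyed by user, which is roughly how one might partition the data in Cassandra.

```python
from collections import defaultdict
from itertools import count


class SocialStore:
    """In-memory stand-in for a Twitter-like data model (hypothetical names)."""

    def __init__(self):
        self._clock = count(1)                   # monotonically increasing post id
        self.followers = defaultdict(set)        # user -> users who follow them
        self.userline = defaultdict(list)        # user -> posts written by that user
        self.timeline_rows = defaultdict(list)   # user -> posts from followed users

    def follow(self, follower, followee):
        self.followers[followee].add(follower)

    def post(self, author, body):
        post_id = next(self._clock)
        entry = (post_id, author, body)
        self.userline[author].append(entry)
        # Fan-out on write: copy the post into every follower's timeline,
        # so reading a timeline touches a single key.
        for follower in self.followers[author]:
            self.timeline_rows[follower].append(entry)
        return post_id

    def timeline(self, user, limit=10):
        # Newest first, as a timeline query would return it.
        return sorted(self.timeline_rows[user], reverse=True)[:limit]


store = SocialStore()
store.follow("bob", "alice")
store.follow("bob", "carol")
store.post("alice", "hello")
store.post("carol", "hi there")
print(store.timeline("bob"))
```

Once this works in memory, a natural next step is to replace each dictionary with a table and see how the write amplification behaves on a real cluster.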
HTH, Carlo

Related

JanusGraph backend: Cassandra vs Bigtable

I am planning to use JanusGraph to build a graph of the different uses our team handles, and I see that JanusGraph can use either BigTable or Cassandra as its storage backend. I am looking for a recommendation on which backend is more performant with JanusGraph (I mainly mean Gremlin query performance on the 2-hop neighbors of a node).
I understand that performance is pretty subjective and varies with data size, graph connectivity, and use case, so the best approach is to try it out myself, which I am planning to do. But has anyone else done a similar performance comparison? Is there a general recommendation about the storage backend here?
You're right that performance is both subjective and largely dependent on data size.
I can tell you that I have done this exercise as well. To that end, I think it's important to share this comparison from DB-Engines.com.
In terms of performance, the biggest thing I'd be looking at is how each handles consistency. As a general rule, databases which enforce stronger levels of consistency typically have to sacrifice performance.
BigTable == strong-consistent
Cassandra == eventually consistent
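The trade-off can be made concrete with replica arithmetic. The sketch below is plain Python with hypothetical helper names; it assumes the usual quorum rule, where a read is guaranteed to see the latest acknowledged write only if the read and write replica sets must overlap (R + W > N). Note that Cassandra lets the client pick the consistency level per request, so "eventually consistent" describes its lower levels rather than a fixed property.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas."""
    return replication_factor // 2 + 1


def overlaps(n: int, w: int, r: int) -> bool:
    """True when every read replica set must intersect every write replica set."""
    return r + w > n


n = 3  # assumed replication factor, for illustration only
# One replica for writes and reads: fast, but a read may miss the latest write.
print(overlaps(n, w=1, r=1))
# Quorum writes and reads: waits on more replicas, but the sets always intersect.
print(overlaps(n, w=quorum(n), r=quorum(n)))
```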
Another factor worth considering is that BigTable limits you to Google Cloud (GCP). If you don't want to lose performance over the network, you'll also need to pay for more (Janus) instances on GCP for data locality.
In terms of raw DB-Engines "score," Cassandra is currently at 114.112, while BigTable is at a paltry 3.582. These scores change month to month, but in general this signifies that Cassandra has a much stronger community around it. Similarly, Cassandra has 18182 questions on this site, while BigTable has only 449. The bottom line is that it'll be much easier to get support and answers to questions about Cassandra.
Just based on the underlying strength of the community, Cassandra is the better option here.
Having supported JanusGraph on Cassandra for the last few years, I can tell you that overall it's been solid. The difficulties tend to come into play with bulk data loading. But outside of that, things seem to run pretty well.
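For anyone benchmarking the two-hop neighbor case from the question, it helps to have a tiny reference result to compare against. This is a plain Python sketch over an adjacency dictionary, not a Gremlin query and not JanusGraph's API; the graph and the function name are made up for illustration. Whether the start vertex and the one-hop neighbors count as two-hop results depends on how the traversal is written; here both are excluded.

```python
def two_hop_neighbors(graph, start):
    """Vertices two hops from start, excluding start and its direct neighbors."""
    one_hop = set(graph.get(start, ()))
    two_hop = set()
    for neighbor in one_hop:
        two_hop.update(graph.get(neighbor, ()))
    return two_hop - one_hop - {start}


# Hypothetical directed graph: vertex -> outgoing neighbors.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["d", "e"],
    "d": [],
    "e": ["a"],
}
print(sorted(two_hop_neighbors(graph, "a")))
```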

Cassandra vs. BigchainDB (and all other distributed databases)

I want to use blockchain features to provide security for my database.
But the hardest part of this transition is finding a suitable destination.
I did lots of research but I couldn't find an answer.
My main question is what the difference between Cassandra and BigchainDB is, and which parameters I should consider when choosing a suitable decentralized database.
Thank you in advance.

DataStax Cassandra seems expensive, is there a best practice configuration to use Apache Cassandra in Production?

DataStax seems expensive. Is there a best practice configuration that is available to use Apache Cassandra in production? I am trying to setup Cassandra on EC2.
Thanks
Instead of giving you a commercial for some other product, let me give you some practical advice on choosing between OSS and commercially licensed products.
You have two things to spend when using any technology: time or money. Ultimately time is money, but for the sake of this discussion let's say they are different. Judging by your question, you have more time, so let's focus on that.
Spend the time to learn the fundamentals. The term black magic is FUD. Some of the world's largest workloads are running on Cassandra. You can do it too.
Seek out peers and learn from those who have been successful. There are organizations that have been running Cassandra in prod for years.
Focus on a single use case/project. Nothing worse than trying to replace all of your infrastructure with a new technology when you are learning. Pick one thing and become proficient. Use that experience for the next projects.
You can get some free training at DataStax Academy. http://academy.datastax.com
Learn from peers by watching talks from the community of awesome users.
You can find something in these 135 talks here: https://www.youtube.com/playlist?list=PLm-EPIkBI3YoiA-02vufoEj4CgYvIQgIk
If you need to ask questions, Stack Overflow, the Cassandra mailing list, and DataStax Academy Slack are all good resources.
Using a commercial product or spending the time is up to you, but don't let anyone try to convince you that it's too hard and you should use something else. We are all here to help if we can.
Disclaimer: I'm a ScyllaDB employee.
There are several alternatives for operating Cassandra/Scylla-like workloads.
Use open source Cassandra with best practices. Most of them, unfortunately, were created a couple of years ago, so you'll need to learn the black magic of tuning the JVM and Cassandra loads.
https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html
There are no "official" AMIs on AWS for recent releases of Cassandra.
Use Scylla Open Source. It is a drop-in replacement for Cassandra. Scylla autotunes itself to minimize operator intervention in day-to-day operations. Scylla also provides open source AMIs for EC2 deployment, so all you need is an AWS account.
Scylla is a C++ implementation of Cassandra that benefits from the great (and costly) resources on AWS and thus offers a better ops/$$ ratio. Scylla highly recommends I3 instances: you'd be using contemporary CPU technology, excellent I/O (NVMe based), and lots of memory at a fraction of the cost of other EC2 instances.
You can read more about it here:
http://www.scylladb.com/2017/05/10/faster-and-better-what-to-expect-running-scylla-on-aws-i3-instances/
ScyllaDB is committed to providing open source, optimized AMI versions.
Buy enterprise licenses from DataStax or Scylla.
Hire consultants to help you install a Cassandra setup.
Companies like The Last Pickle or Pythian can help you in that regard.
Use DBaaS offerings from the following companies:
Scylla:
IBM Compose: https://www.compose.com/databases/scylladb
Joyent Triton: https://www.joyent.com/blog/free-trial-managed-scylladb-beta-on-triton
Scylla and Cassandra:
Instaclustr: https://www.instaclustr.com/
Hope this helps.

cassandra,solr,lucandra,solandra

I am developing a site using the following technologies:
Ruby on Rails (Ruby 1.8.7, Rails 2.3.5)
Cassandra 0.6.8
I want to index the Cassandra database using Lucandra.
How do I do this?
Are there any RESTful APIs or web services available for this, so
that I can push the data to the index database?
Please share any RoR example using Lucandra; that would really help us
move forward.
Or guide me through some steps to achieve this.
I have been googling for 3 days and have not found any examples using
Lucandra in RoR.
Your help will be appreciated. Thanks in advance.
The Solandra project, which is replacing Lucandra, no longer uses
Thrift, only Solr. http://github.com/tjake/Lucandra
This means you can use any of the Solr supported gems like
acts_as_solr
I recommend Elasticsearch. It has a REST API and Ruby and Rails clients.
https://github.com/angelf/escargot
https://github.com/grantr/rubberband
Elasticsearch is the most advanced free search solution in the world today. It's based on Lucene and is highly available, fault tolerant, partitioned, high performance, scalable, state of the art, open source, and simpler than Solr. Its success belongs to its author, Shay Banon, who has years of experience as an architect in this field. Solr (and Solandra) is nowhere near it. Simply investigate both and you'll see for yourself.
my best
Serdar

Cassandra issues in production

I am looking at using a system with Cassandra as a backend, and I wanted to collect together common problems that people experience in production Cassandra environments with usual workarounds/solutions.
Could anyone tell me of their experiences, as I want to see what potential showstoppers/issues I need to think about.
http://www.riptano.com/docs/0.6/troubleshooting/index
