How to optimize a Sybase ASE database? - sap-ase

What are the tricks to optimize a Sybase database?
What are the dos and don'ts?

Your question seems rather broad and open-ended.
For performance tuning guidelines across the entire product, I would probably start with the several performance tuning books that are in the online documentation.
Ongoing performance optimization can often include monitoring by 3rd party products such as Confio's Ignite (I don't work for them, but it is impressive software).

Related

Janus Graph backend cassandra vs Bigtable

I am planning to use JanusGraph to build a graph of the different uses our team handles, and I see that JanusGraph can use BigTable or Cassandra as its storage backend. I am looking for a recommendation on which backend is more performant with JanusGraph (I mainly mean Gremlin query performance on the 2-hop neighbors of a node).
I understand that performance is pretty subjective and varies with data size, graph connectivity, and use case, so the best approach will be to try it out myself, which I am planning to do. But has anyone else done a similar performance comparison? Is there any general recommendation about the storage backend here?
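For readers unfamiliar with the query shape in question, here is a minimal sketch of what a 2-hop neighbor lookup computes. It runs against a plain in-memory adjacency map, not JanusGraph; the function name and sample graph are hypothetical. In Gremlin the comparable traversal would look roughly like `g.V(id).out().out().dedup()`.

```python
from typing import Dict, Hashable, Iterable, Set


def two_hop_neighbors(
    adjacency: Dict[Hashable, Iterable[Hashable]], start: Hashable
) -> Set[Hashable]:
    """Return the vertices reachable by exactly two outgoing edges from start."""
    result: Set[Hashable] = set()
    for first in adjacency.get(start, ()):
        for second in adjacency.get(first, ()):
            result.add(second)  # the set removes duplicates, like dedup()
    return result


# Hypothetical sample graph: vertex -> outgoing neighbors.
graph = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "e"],
    "d": [],
    "e": ["a"],
}
```

On a real backend each of those inner lookups is a storage read, which is why the fan-out of the first hop tends to dominate the cost of this kind of query.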
You're right that performance is both:
subjective
largely dependent on data size
I can tell you that I have done this exercise as well. To that end, I think it's important to share this comparison from DB-Engines.com.
In terms of performance, the biggest thing I'd be looking at is how each handles consistency. As a general rule, databases which enforce stronger levels of consistency typically have to sacrifice performance.
BigTable == strong-consistent
Cassandra == eventually consistent
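One nuance worth sketching: Cassandra's consistency level is tunable per request, so the trade-off is not strictly binary. A commonly cited rule of thumb is that a read sees the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. The function names below are hypothetical, and this is arithmetic only, not a Cassandra API call.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas for the given replication factor."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor: int, read_replicas: int, write_replicas: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    for count in (read_replicas, write_replicas):
        if not 1 <= count <= replication_factor:
            raise ValueError("replica counts must be between 1 and the replication factor")
    return read_replicas + write_replicas > replication_factor
```

With a replication factor of 3, quorum reads and quorum writes (2 + 2) overlap, while reading and writing a single replica (1 + 1) does not. The stronger setting costs extra replica round trips, which is the performance sacrifice described above.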
Another factor worth considering is that BigTable limits you to Google Cloud (GCP). And if you don't want to lose performance over the network, you'll also need to pay for more (Janus) instances on GCP for data locality.
In terms of raw DB-Engines "score," Cassandra is currently at 114.112, while BigTable is at a paltry 3.582. These scores will change month-to-month, but in general this signifies that Cassandra has a much stronger community around it. Similarly, Cassandra has 18182 questions on this site, while BigTable only has 449. Bottom line: it'll be much easier to get support and answers to questions about Cassandra.
Just based on the underlying strength of the community, Cassandra is the better option here.
Having supported JanusGraph on Cassandra for the last few years, I can tell you that overall it's been solid. The difficulties tend to come into play with bulk data loading. But outside of that, things seem to run pretty well.

DataStax Cassandra seems expensive, is there a best practice configuration to use Apache Cassandra in Production?

DataStax seems expensive. Is there a best practice configuration that is available to use Apache Cassandra in production? I am trying to setup Cassandra on EC2.
Thanks
Instead of giving you a commercial for some other product, let me give you some practical advice on choosing between OSS and commercially licensed products.
You have two things to spend when using any technology. Time or money. Ultimately time is money, but for the sake of this let's say they are different. By your question, you have more time so let's focus on that.
Spend the time to learn the fundamentals. The term black magic is FUD. Some of the world's largest workloads are running on Cassandra. You can do it too.
Seek out peers and learn from those who have been successful. There are organizations that have been running Cassandra in prod for years.
Focus on a single use case/project. Nothing worse than trying to replace all of your infrastructure with a new technology when you are learning. Pick one thing and become proficient. Use that experience for the next projects.
You can get some free training at DataStax Academy. http://academy.datastax.com
Learn from peers by watching talks from the community of awesome users.
You can find something in these 135 talks here: https://www.youtube.com/playlist?list=PLm-EPIkBI3YoiA-02vufoEj4CgYvIQgIk
If you need to ask questions, Stack Overflow, the Cassandra mailing list, and DataStax Academy Slack are all good resources.
Using a commercial product or spending the time is up to you, but don't let anyone try to convince you that it's too hard and you should use something else. We are all here to help if we can.
Disclaimer: I'm a ScyllaDB employee.
There are several alternatives for operating Cassandra/Scylla-like workloads.
Use open-source Cassandra with best practices. Most of them, unfortunately, were created a couple of years ago, so you'll need to learn the black magic of tuning the JVM and Cassandra loads.
https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html
There are no "official" AMIs on AWS for recent releases of Cassandra.
Use Scylla Open Source. It is a drop-in replacement for Cassandra. Scylla autotunes itself to minimize operator intervention in day-to-day operations. Scylla also provides open-source AMIs for EC2 deployment, so all you need is an AWS account.
Scylla is a C++ implementation of Cassandra that benefits from the great (and costly) resources on AWS and thus offers a better ops/$$ ratio. Scylla highly recommends I3 instances: you'd be using contemporary CPU technology, excellent I/O (NVMe-based), and lots of memory at a fraction of the cost of other EC2 instances.
You can read more about it here:
http://www.scylladb.com/2017/05/10/faster-and-better-what-to-expect-running-scylla-on-aws-i3-instances/
ScyllaDB is committed to providing open-source, optimized AMI versions.
Buy enterprise licenses from DataStax or Scylla.
Hire consultants to help you install a Cassandra setup.
Companies like "the last pickle" or Pythian can help you in that regard.
Use DBaaS offerings from the following companies:
Scylla:
IBM Compose: https://www.compose.com/databases/scylladb
Joyent Triton: https://www.joyent.com/blog/free-trial-managed-scylladb-beta-on-triton
Scylla and Cassandra
Instaclustr: https://www.instaclustr.com/
Hope this helps.

Performance testing in Cassandra

I'm currently making some improvements to Apache Cassandra 1.2.8, and I want to do some performance testing on the database. What is the best way of doing performance testing on this kind of NoSQL database? Are there any tools or standards that we can use for performance testing?
Check out YCSB. While not a standard, it has been used by quite a few products, including Cassandra.
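Whatever tool is used, the numbers usually reported are throughput and latency percentiles. As a rough illustration (this is not YCSB itself), the sketch below times a stand-in operation and summarizes the results. `run_benchmark`, `percentile`, and the workload callable are hypothetical; a real test would replace the callable with a database read or write and run it from several client threads.

```python
import math
import time
from typing import Callable, Dict, List


def percentile(sorted_values: List[float], pct: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def run_benchmark(operation: Callable[[], None], iterations: int) -> Dict[str, float]:
    """Call operation repeatedly and report throughput and latency percentiles."""
    latencies: List[float] = []
    started = time.perf_counter()
    for _ in range(iterations):
        t0 = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - started
    latencies.sort()
    return {
        "ops_per_sec": iterations / elapsed if elapsed > 0 else float("inf"),
        "p50": percentile(latencies, 50),
        "p95": percentile(latencies, 95),
        "p99": percentile(latencies, 99),
    }
```

Comparing these figures before and after a change, with the same data set and the same hardware, is usually more informative than any single absolute number.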

CouchDB in Production

I have been using CouchDB on some prototype applications and it has been brilliant: very easy to use and extremely quick. I was wondering if anyone has been using it in production and has any views on its reliability, performance, suitability for operational management, etc. I am considering using it to support a service layer and would make use of its replication functionality.
Any comments/experiences would be most welcome.
I've used CouchDB for a few small in-house applications - it's been very stable and I've had no serious complaints. Setting that aside, a few small gripes -
1) Databases can be synchronized, but not nodes. That is, if you have four servers and twenty databases, you have to specify each server, and each database to synchronize. A minor gripe, but I prefer less management to more.
2) Since databases are append-only, a database with a bunch of activity gets really big really quickly. Compacting fixes this, but isn't exactly fast, especially on big (e.g. 20 gigabyte) databases. Scheduling compaction for the weekends solved this, but doing that is probably less of an option for high-availability applications.
3) JavaScript is the de facto view language. What is not well advertised is that since CouchDB is written in Erlang, it also supports Erlang views, which are faster as they are "native". For applications doing a lot of operations in views, Erlang probably makes more sense.
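To make the first gripe concrete, here is a sketch that enumerates the replication requests a full mesh would need. It assumes every server replicates every database to every other server, which is one possible topology and not necessarily the one used above, and the host names are hypothetical. Each dictionary has the `source`/`target` shape that CouchDB's `_replicate` endpoint accepts.

```python
from itertools import permutations
from typing import Dict, List


def replication_jobs(servers: List[str], databases: List[str]) -> List[Dict[str, str]]:
    """Build one source/target pair per database and ordered pair of servers."""
    return [
        {"source": f"{src}/{db}", "target": f"{dst}/{db}"}
        for db in databases
        for src, dst in permutations(servers, 2)
    ]


# Hypothetical cluster: four servers, twenty databases.
servers = [f"http://couch{i}:5984" for i in range(4)]
databases = [f"db{i}" for i in range(20)]
jobs = replication_jobs(servers, databases)
```

Under that assumption, four servers and twenty databases come to 240 separate replication definitions, which is the management overhead being described.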
Setting those minor issues aside, I'd wholeheartedly recommend it.
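Regarding the second gripe, compaction is triggered per database with an HTTP POST to `/{db}/_compact`, which makes it easy to run from a scheduled job. The sketch below only builds the request so it can be inspected without a server. The function name, base URL, and database name are placeholders, and a real deployment would also need admin credentials.

```python
import urllib.request


def build_compaction_request(base_url: str, database: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that asks CouchDB to compact a database."""
    url = f"{base_url.rstrip('/')}/{database}/_compact"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the endpoint takes no parameters here
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Placeholder server and database name.
request = build_compaction_request("http://localhost:5984/", "mydb")
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` from a cron job or similar scheduler during a quiet period.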
CouchDB ships in Ubuntu and is a fundamental component of the Ubuntu One service.

Cassandra vs Riak [closed]

I am looking for an eventually consistent data store and it looks like it may be coming down to Riak or Cassandra. Has anyone got experiences or a view on this?
As you probably know, they are both architecturally strongly influenced by Dynamo (eventually consistent, no single points of failure, etc.). Both also go beyond Dynamo in providing a "richer than pure K/V" data model: in Cassandra's case, a Bigtable-like ColumnFamily model; in Riak's, a document-oriented one. I have seen sane people choose both.
I believe points that favor Cassandra include
speed
support for clusters spanning multiple data centers
big names using it (digg, twitter, facebook, webex, ... -- http://n2.nabble.com/Cassandra-users-survey-tp4040068p4040393.html)
Points that favor Riak include
map/reduce support out of the box
/Cassandra dev, fwiw
Riak is used by
Mozilla Foundation
Ask.com sponsored listings
Comcast
Citigroup
Bet365
I think they both pass the test of credible reference customers/users.
Cassandra seems more mature, and is currently doing better in benchmarks. Riak seems easier to add a node to as your cluster grows.
For completeness: A good (probably biased) comparison between the two can be found at http://docs.basho.com/riak/1.3.2/references/appendices/comparisons/Riak-Compared-to-Cassandra/
Use and download are different. Best to get references.
Perhaps a private conversation could be had where Riak references in these companies could be shared? Not sure how to get such with Cassandra, but there is a community of companies that support Cassandra that seem like a good place to start. As these probably have community participants in Cassandra development, it may be a REALLY reasonable place to start.
I would like to hear Riak's answer to recent and large deployments where customers are happy.
I also would like to see the roadmap for each product. Cassandra is a bit easier to track (http://wiki.apache.org/cassandra/) than Riak in my view, as Cassandra's wiki discusses limitations and things that are probably going to change going forward, but neither outlines its future well. I could understand that of an open source community ... perhaps ... but I cannot for a product for which I must pay.
I also would suggest research of Cloudant, which has what appears to be a very nice layering of capabilities. It also looks like it is bringing to bear the capabilities elsewhere in Apache land. CouchDB is the Apache platform on which Cloudant is based. BUT the indexing with Lucene seems but the tip of the iceberg when it comes to where Cloudant could go. Creating and managing an index is a very systematic process, a kind of data pipeline, that could be scripted using other Apache community assets. AND capabilities like NLP also could be added through Lucene indirectly, or maybe directly into what is persisted.
It would be nice to see a proposed Cloudant roadmap, especially since the team could mine the riches of the Apache community and integrate such into Cloudant. Such probably exists as there is an operational component to the Cloudant revenue model that will require it, if for no other reason.
Another area of interest ... Cloudant's pricing model ... it is clear their revenue model is not based on software, but around service. That is quite attractive, and it seems consistent with the ecosystem surrounding Cassandra too. I don't know if the Basho folks have won over enough of the nosql community as yet ... don't see such from any buzz around their web site or product.
I like this Cloudant web page (https://cloudant.com/the-data-layer/). I was surprised to see the embedded Erlang capability ... I did not know CouchDB was written in Erlang as this seems unusual to me in the Apache community (my ignorance); CouchDB appears to be older than other nosql products I know (now) to be written in Erlang. Whatever their strategy, they at least count Amazon EC2 and Microsoft Azure as hosting partners, indicating an appreciation of Microsoft and !Microsoft worlds - all very important if properly recognizing the middleware value potential (beyond cache or hash table applications) that these types of data stores could have.
Finally, while I don't know the board well, Andy Palmer's guidance looks like it will be valuable. He can bring some guidance vis-a-vis structured data (through VoltDB) to a world that rightly or wrongly may be unfairly branded as KVP hash tables of unstructured data. The need for structure and ecosystem surrounding nosql "databases" is being recognized ... witness Google's efforts with Spanner ... KVP/little structure/need for search-ability motivated Google's investment in the Spanner space. While we all may not need something like Spanner, we probably do need an improving and robust "enterprise" management and interoperability capability in these nosql databases to make it reasonable to incorporate them into modern cloud architectures. The needed structure can come from ease of interoperability and functional richness. It can also come from new capabilities that support conversion of unstructured data to structured data (e.g. indexes, use of NLP to create structured and parsed renderings of things inside of a KVP blob, and plenty of other things that, if put into a roadmap and published, could entice and grow a user base). Cloudant looks like it has a good chance of success ... I will take a closer look at it ...
And look what I found about CouchDB ...
CouchDB comes with a suite of features, such as on-the-fly document transformation and real-time change notifications, that makes web app development a breeze. It even comes with an easy to use web administration console. You guessed it, served up directly out of CouchDB! We care a lot about distributed scaling. CouchDB is highly available and partition tolerant, but is also eventually consistent. And we care a lot about your data. CouchDB has a fault-tolerant storage engine that puts the safety of your data first.
