I am doing research into our database needs and I really like ArangoDB. The only issue is that I couldn't find any managed services or managed hosting for ArangoDB.
For example, in AWS, RDS allows us to scale up easily without worrying about clustering and configuration.
Is there any service that can manage this for me, or should I manage this myself?
You can start an ArangoDB cluster on AWS with Mesosphere DC/OS. The cluster is fully managed and can be scaled as you go. It is documented here:
https://docs.arangodb.com/3.2/Manual/Deployment/Mesos.html
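Once the cluster is up, your application talks to the ArangoDB coordinators like any other ArangoDB endpoint. As a rough sketch with the python-arango driver (the coordinator addresses, credentials and collection name below are placeholders, not values DC/OS provides by default):

```python
from arango import ArangoClient

# Placeholder coordinator endpoints exposed by the DC/OS-managed cluster;
# use whatever addresses your deployment actually reports.
client = ArangoClient(hosts=[
    "http://coordinator1.example.com:8529",
    "http://coordinator2.example.com:8529",
])

# Placeholder credentials.
db = client.db("_system", username="root", password="changeme")

if not db.has_collection("products"):
    db.create_collection("products")

db.collection("products").insert({"name": "example", "price": 9.99})
```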
Related
I have a business case for which we chose Cassandra as the NoSQL database, but we are stuck on the aspect of setting up Cassandra. Any insight into the available setup options and what to choose is appreciated.
As of now, the options I know of are:
1) Installing Cassandra on an EC2 instance (which I believe is not a production-ready option)
2) Using the AWS managed Cassandra service
Are there any other ways? Please shed some light on this.
Not sure where you got that information from, but it's not correct.
Thousands of companies have Cassandra deployed in production, not just on EC2 instances but also on GCP, Azure and other public clouds. It is also possible to deploy Cassandra on premises, in private clouds, and even in hybrid setups -- any combination of on-premises + public cloud + private cloud.
If you don't have experience with installing/managing a Cassandra cluster, you can try Astra DB which is a Cassandra-as-a-service running on AWS, GCP and/or Azure. There's a tier that's free forever with no credit card required. It only takes a few clicks to launch a cluster. Cheers!
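For illustration, connecting from Python looks roughly like this with the DataStax cassandra-driver and an Astra secure connect bundle (the bundle path, credentials and keyspace are placeholders for whatever your own database gives you):

```python
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

# Placeholders: the secure connect bundle and credentials are downloaded
# from your own Astra DB database; they are not fixed values.
cloud_config = {"secure_connect_bundle": "/path/to/secure-connect-bundle.zip"}
auth = PlainTextAuthProvider(username="client_id", password="client_secret")

cluster = Cluster(cloud=cloud_config, auth_provider=auth)
session = cluster.connect("my_keyspace")  # placeholder keyspace name

row = session.execute("SELECT release_version FROM system.local").one()
print(row.release_version)

cluster.shutdown()
```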
The title describes pretty much what we are trying to accomplish in our organization.
We have a very database intensive application, and our single SQL Server machine is struggling.
We are reading articles about Azure, Docker and Kubernetes but we are afraid of trying these technologies.
Our problem is data replication.
How can we achieve scalability here? If we have three different SQL Server instances in three different containers, how does data get replicated across them? (Meaning, if a user inserts a new product into a shared library, another user accessing a different node/container should be able to see that product.)
Maybe we don't need containers at all and Azure provides another way to scale databases?
We really appreciate any help from you guys.
Regards, Cris.
I would advise against trying to run your databases on Kubernetes. Kubernetes containers should generally be stateless applications, and were not designed for persistent data storage. Azure provides a database-as-a-service, which will be able to scale appropriately with your needs (see Azure's pricing for its cloud SQL offering).
We once experimented with running our Postgres DB inside of a Kubernetes pod, but I was terrified to change anything. Not worth it, and not what the system was designed for.
If you are really, really committed to this path, you can check out MySQL NDB Cluster, MySQL's offering for distributed environments. It should be adaptable to the Kubernetes paradigm.
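For what it's worth, moving to the managed offering is mostly a connection-string change for the application. A rough Python sketch using pyodbc (the server name, database and credentials are placeholders; the same code works against an on-premises SQL Server with a different connection target):

```python
import pyodbc

# Placeholder Azure SQL Database connection details; only the connection
# string differs from an on-premises SQL Server setup.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=tcp:myserver.database.windows.net,1433;"
    "DATABASE=mydb;"
    "UID=myuser;"
    "PWD=mypassword;"
    "Encrypt=yes;"
)

cursor = conn.cursor()
cursor.execute("SELECT @@VERSION")
print(cursor.fetchone()[0])
conn.close()
```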
I'm experimenting a little with ACS using the DC/OS orchestrator, and while spinning up a cluster within a single region seems simple enough, I'm not quite sure what the best practice would be for doing deployments across multiple regions.
Azure itself does not seem to support deploying to more than one region right now. With that assumption, I guess my only other option is to create multiple, identical clusters in all the regions I wish to be available, and then use Azure Traffic Manager to route incoming traffic to the nearest available cluster.
While this solution works, it also causes a few issues I'm not 100% sure on how I should work around.
Our deployment pipelines must make sure to deploy to all regions when deploying a new version of a service. If we have an East US and a North Europe region, during deployments from our CI tool I have to connect to the Marathon API in both regions to trigger the new deployments. If the deployment fails in one region and succeeds in the other, I suddenly have a disparity between the two regions.
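To make that concrete, the CI step essentially boils down to something like the following sketch (the Marathon endpoints and app definition are simplified placeholders):

```python
import requests

# Placeholder Marathon endpoints for the two regional clusters; in practice
# these come from CI configuration/secrets.
REGIONS = {
    "eastus": "https://marathon.eastus.example.com",
    "northeurope": "https://marathon.northeurope.example.com",
}

# Simplified placeholder app definition.
app_definition = {
    "id": "/my-service",
    "instances": 3,
    "container": {
        "type": "DOCKER",
        "docker": {"image": "myregistry/my-service:1.2.3"},
    },
}

results = {}
for region, base_url in REGIONS.items():
    # PUT /v2/apps/{app_id} creates or updates the app in that cluster.
    resp = requests.put(
        f"{base_url}/v2/apps/my-service",
        json=app_definition,
        timeout=30,
    )
    results[region] = resp.status_code

# If one region fails and the other succeeds, the clusters diverge -- exactly
# the disparity described above, which the pipeline has to detect and handle.
if len(set(results.values())) > 1:
    raise RuntimeError(f"Regions diverged after deploy: {results}")
```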
If I have a service using local persistent volumes deployed, let's say PostgreSQL or Elasticsearch, it needs to have instances in both regions, since service discovery will only find services local to the region. That brings up the problem of replication between regions to keep all state in all regions; this seems to require some/a lot of manual configuration to get working.
Has anyone ever used a setup somewhat like this with Azure Container Service (or really Amazon Container Service, as I assume the same challenges can be found there) and has some pointers on how to approach this?
You have multiple options for spinning up clusters across regions. I would use a custom installation together with Terraform for each of them. This is a great starting point: https://github.com/bernadinm/terraform-dcos
Distributing agents across different regions should be no problem, ensuring that your services will keep running despite failures.
Distributing masters (giving you control over the services during failures) is a little more difficult, as it involves distributing a ZooKeeper quorum across high-latency links, so you should be careful in choosing the "distance" between regions.
Have a look at the documentation for more details.
You are correct, ACS does not currently support multi-region deployments.
Your first issue is specific to Marathon in DC/OS, I'll ping some of the engineering folks over there to see if they have any input on best practice.
Your second point is something we (I'm the ACS PM) are looking at. There are some solutions you can use in certain scenarios (e.g. ArangoDB is in the DC/OS universe and will provide replication). The DC/OS team may have something to say here too. In ACS we are evaluating the best approaches to providing solutions for this use case but I'm afraid I can't give any indication of timeline.
An alternative solution is to have your database in a SaaS offering. This takes away all the complexity of managing redundancy and replication.
Bear in mind that the Elasticsearch-ZooKeeper plugin doesn't support the v0.90 release.
With unicast, what's your strategy for updating your list of IPs, e.g. when upgrading or scaling up/down?
What client-side connectivity (from the web/worker role) do you use to reach the cluster? Do you:
a) implement your own round-robin/failover logic across all nodes in the cluster (roughly the idea in the sketch after this list), or
b) spin up a local (non-data/non-master) Elasticsearch process on the client machine that joins the unicast cluster, so the application only connects to localhost?
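For option (a), I imagine something along the lines of the sketch below, where the client library round-robins across a hard-coded node list (placeholder addresses; the retry settings are just an example):

```python
from elasticsearch import Elasticsearch

# Placeholder node addresses; with option (a) the client round-robins
# requests across the listed nodes and retries another node on failure.
es = Elasticsearch(
    ["http://10.0.0.4:9200", "http://10.0.0.5:9200", "http://10.0.0.6:9200"],
    retry_on_timeout=True,
    max_retries=3,
)

print(es.cluster.health())
```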
Where do you store your data? Azure blob gateway?
Can you share your detailed story of your Elasticsearch experience on Azure, and any particular points/issues to watch out for?
Cheers
Just a note about this: we are on the way to releasing an Azure plugin for Elasticsearch. It will allow automatic discovery of your Elasticsearch nodes. I think we will have something public in the next few weeks.
Also, I recommend using local storage. Azure Blob storage will be used in the future for the snapshot (and restore) feature when Elasticsearch 1.0 is out.
Hope this helps
Update: Plugin is now available here: https://github.com/elasticsearch/elasticsearch-cloud-azure
It's in no way certified by ElasticSearch, but I wrote a blog post about my experience running ES on Azure: http://thomasardal.com/running-elasticsearch-in-a-cluster-on-azure/
I recently got a trial version of Windows Azure and wanted to know if there is any way I can deploy an application using Cassandra.
I can't speak specifically to Cassandra working or not in Azure, unfortunately. That's likely a question for that product's development team.
But the challenge you'll face with this, MySQL, or any other role-hosted database is persistence. Azure roles are in and of themselves not persistent, so whatever back-end store Cassandra is using would need to be placed onto something like an Azure Drive (which is persisted to Azure Blob Storage). However, this would limit the scalability of the solution.
Basically, you run Cassandra as a worker role in Azure. Then, you can mount an Azure drive when a worker starts up and unmount when it shuts down.
This provides some insight re: how to use Cassandra on Azure: http://things.smarx.com/#Run Cassandra
Some help w/ Azure drives: http://azurescope.cloudapp.net/CodeSamples/cs/792ce345-256b-4230-a62f-903f79c63a67/
This should not limit your scalability at all. Just spin up another Cassandra instance whenever processing throughput or contiguous storage becomes an issue.
You might want to check out AppHarbor. AppHarbor is a .NET PaaS built on top of Amazon. It gives users the portability and infrastructure of Amazon, and it provides a number of the rich services that Azure offers, such as background tasks and load balancing, plus some that Azure doesn't, like third-party add-ons, dead-simple deployment and more. They already have add-ons for CouchDB, MongoDB and Redis; if Cassandra got high enough on the list of requested features, I'm sure they could set it up.