Can we use Azure Blob Storage for ScyllaDB backups?

As per the ScyllaDB docs, scylla-manager is the tool we should use to take our backups.
But the latest version of scylla-manager (i.e. 2.2) only seems to support AWS S3 and Google Cloud Storage buckets.
Is there some way to use scylla-manager to upload our backups to Azure Blob Storage,
or
any other way at least as efficient as scylla-manager to upload backups to Azure Blob Storage?

We just coded and merged it; it's under test now. It will be part of Manager 2.3.

Azure Blob Storage is now supported as a backup location by Scylla Manager 2.4, which was released last week.
Release notes
Setting up Azure Blob Storage as your backup location
Scylla Manager 2.4 also provides an Ansible playbook to automate restore from backup.
Note: the Scylla Manager metrics underwent some refactoring, so you will need Scylla Monitoring 3.8 (docs / repo tag) in order to view the Manager metrics in the Manager dashboard in Scylla Monitoring.

Azure Blob Storage is not yet integrated into Scylla Manager. While it is on the near-term roadmap, we have not yet associated it with a particular release or date.
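In the meantime, one manual alternative along the lines of the question's second ask: take a snapshot with nodetool snapshot and push the resulting SSTables to a container yourself. A rough sketch using the Azure Blob Storage Java SDK v12 (azure-storage-blob), assuming a connection string in AZURE_STORAGE_CONNECTION_STRING and placeholder container/path names; unlike Scylla Manager, this does no deduplication, scheduling, or retention:

    import com.azure.storage.blob.BlobContainerClient;
    import com.azure.storage.blob.BlobServiceClient;
    import com.azure.storage.blob.BlobServiceClientBuilder;

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.stream.Stream;

    public class SnapshotUploader {
        public static void main(String[] args) throws Exception {
            // Placeholder: data directory after `nodetool snapshot -t mytag mykeyspace`.
            Path dataDir = Paths.get("/var/lib/scylla/data/mykeyspace");

            BlobServiceClient service = new BlobServiceClientBuilder()
                    .connectionString(System.getenv("AZURE_STORAGE_CONNECTION_STRING"))
                    .buildClient();
            BlobContainerClient container = service.getBlobContainerClient("scylla-backups");
            if (!container.exists()) {
                container.create();
            }

            // Upload every file under a snapshots/ directory, keyed by its
            // path relative to the keyspace data directory.
            try (Stream<Path> files = Files.walk(dataDir)) {
                files.filter(p -> p.toString().contains("/snapshots/") && Files.isRegularFile(p))
                     .forEach(p -> container.getBlobClient(dataDir.relativize(p).toString())
                                            .uploadFromFile(p.toString(), true));
            }
        }
    }

Restoring is the reverse: download the SSTables back onto the node and refresh the table, which is roughly the bookkeeping Scylla Manager automates for you.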

Related

Impact of upgrading BlobStorage to general purpose v2 storage

We were already using BlobStorage accounts in an OpenShift cluster as PVs for applications and other microservices. I just came to know that point-in-time restore is only supported for general-purpose v2 storage accounts. So, before upgrading to a general-purpose v2 storage account, I want to know the impacts, e.g. whether the access URLs for the storage account's containers change.
There are no impacts if a BlobStorage account is upgraded to a general-purpose v2 account. The container/blob URLs are still the same after the upgrade.
A general-purpose v2 account contains all the features of the legacy BlobStorage account. This is mentioned in the doc.
The upgrade causes no issues for existing files and objects; it just enables the additional features of the v2 account type. But if you are accessing the account programmatically, you may need to upgrade your client package, otherwise you may get exceptions.
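If you want a quick way to verify this, list a container's blob URLs before and after the upgrade; they should be identical. A minimal sketch with the Azure Blob Storage Java SDK v12, assuming a connection string and a placeholder container name:

    import com.azure.storage.blob.BlobContainerClient;
    import com.azure.storage.blob.BlobServiceClientBuilder;
    import com.azure.storage.blob.models.BlobItem;

    public class UrlCheck {
        public static void main(String[] args) {
            BlobContainerClient container = new BlobServiceClientBuilder()
                    .connectionString(System.getenv("AZURE_STORAGE_CONNECTION_STRING"))
                    .buildClient()
                    .getBlobContainerClient("mycontainer"); // placeholder container name
            // URLs keep the form https://<account>.blob.core.windows.net/<container>/<blob>
            // after the upgrade to general-purpose v2.
            for (BlobItem item : container.listBlobs()) {
                System.out.println(container.getBlobClient(item.getName()).getBlobUrl());
            }
        }
    }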

Is the Azure Storage SDK for Java V8 the latest version to work with Table Storage?

We are using the Azure Storage SDK for Java V8 in an application to access a simple table in a storage account with a few lines of code. This version is failing the customer's security scan, but the new version of the SDK doesn't seem to work with Table Storage. Checking the documentation, it appears that the only way to use Table Storage with the newest SDK is through the Cosmos DB Table API.
The Azure Storage SDK for Java page (https://github.com/azure/azure-storage-java) shows only Blob and Queue components. The Azure documentation is not clear on this, but all samples point to V8 (or older) SDKs, like this one: https://github.com/Azure-Samples/storage-table-java-getting-started
Is there any way to access Table Storage in Java without using the Cosmos DB Table API or the outdated V8 SDK?
For now, Azure Tables and its client libraries have actually moved over to the Cosmos DB team, and there are no plans to support them in the Storage SDK.
The Table support issue on GitHub has been closed accordingly.
So if you want a Table SDK for Java, you have to use the legacy SDK; the only current home for a newer Table SDK is Cosmos DB.
Hope this helps; if you still have other questions, please let me know.
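For completeness, a minimal sketch of writing and reading an entity with the legacy V8 SDK (com.microsoft.azure.storage), assuming a connection string in an environment variable and placeholder table/key names:

    import com.microsoft.azure.storage.CloudStorageAccount;
    import com.microsoft.azure.storage.table.CloudTable;
    import com.microsoft.azure.storage.table.CloudTableClient;
    import com.microsoft.azure.storage.table.TableOperation;
    import com.microsoft.azure.storage.table.TableServiceEntity;

    public class TableV8Demo {
        // Keys live on TableServiceEntity; other columns are bean properties.
        public static class CustomerEntity extends TableServiceEntity {
            private String email;
            public CustomerEntity() { } // required no-arg constructor
            public CustomerEntity(String partitionKey, String rowKey) {
                this.partitionKey = partitionKey;
                this.rowKey = rowKey;
            }
            public String getEmail() { return email; }
            public void setEmail(String email) { this.email = email; }
        }

        public static void main(String[] args) throws Exception {
            CloudStorageAccount account =
                    CloudStorageAccount.parse(System.getenv("AZURE_STORAGE_CONNECTION_STRING"));
            CloudTableClient tableClient = account.createCloudTableClient();
            CloudTable table = tableClient.getTableReference("customers"); // placeholder
            table.createIfNotExists();

            CustomerEntity entity = new CustomerEntity("Smith", "Ben");
            entity.setEmail("ben@contoso.com");
            table.execute(TableOperation.insertOrReplace(entity));

            CustomerEntity fetched = table
                    .execute(TableOperation.retrieve("Smith", "Ben", CustomerEntity.class))
                    .getResultAsType();
            System.out.println(fetched.getEmail());
        }
    }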

How to connect locally installed Apache Hive to Azure Data Lake?

I have installed Apache Hive on my local system, and I need to connect to Azure Data Lake to query the data in it. How do I configure this?
Details on how you can connect Hadoop to Azure Data Lake are available here - https://hadoop.apache.org/docs/current/hadoop-azure-datalake/index.html.
You will need to have a recent version of Hadoop running in order to have the modules natively available.
There are blogs that talk about enabling this connectivity, e.g. https://medium.com/azure-data-lake/connecting-your-own-hadoop-or-spark-to-azure-data-lake-store-93d426d6a5f4.
But unless you are running Hadoop in the Azure region where the Azure Data Lake Store (ADLS) account is located, your solution will be suboptimal: you will incur latency on reads and writes, as well as costs, since you will be egressing data out of an Azure region during reads. Trust you have factored these into your planning.
Thanks,
Sachin Sheth,
Program Manager, Azure Data Lake.
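For reference, the credentials the Hadoop doc above describes are ordinary Hadoop configuration properties, which Hive picks up from core-site.xml. A minimal connectivity check under assumed placeholder values (tenant, application ID, secret, and account name are all hypothetical), with hadoop-azure-datalake on the classpath:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class AdlSmokeTest {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // OAuth2 client-credential (service principal) settings from the
            // hadoop-azure-datalake doc; all values below are placeholders.
            conf.set("fs.adl.oauth2.access.token.provider.type", "ClientCredential");
            conf.set("fs.adl.oauth2.refresh.url",
                    "https://login.microsoftonline.com/<tenant-id>/oauth2/token");
            conf.set("fs.adl.oauth2.client.id", "<application-id>");
            conf.set("fs.adl.oauth2.credential", "<application-secret>");

            // List the root of a placeholder ADLS Gen1 account.
            FileSystem fs = FileSystem.get(
                    URI.create("adl://myadls.azuredatalakestore.net/"), conf);
            for (FileStatus status : fs.listStatus(new Path("/"))) {
                System.out.println(status.getPath());
            }
        }
    }

Once the same four properties are in Hive's core-site.xml, tables can point at adl:// locations directly.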

Why don't AWS and Azure provide a snapshot option to revert the server?

I have worked with ESXi servers many times. They provide a snapshot option that can be used to revert the same server to any snapshot point we have taken.
I was unable to find the same in AWS and Azure. These cloud providers offer options to back up the server.
AWS backs up the whole volume.
Azure provides a vault backup wizard, which is incremental.
We can create a new server from that backup, but we cannot revert the same server. An ESXi server takes a snapshot of perhaps 10% of the server's full volume and can revert it as per our requirement.
For Azure, take a look at blob snapshots.
Azure Storage provides the capability to take snapshots of blobs. Snapshots capture the blob state at that point in time.
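A minimal sketch of taking a blob snapshot with the Azure Storage Java SDK (legacy v8), assuming a connection string and placeholder container/blob names (unmanaged VM disks are VHD page blobs):

    import com.microsoft.azure.storage.CloudStorageAccount;
    import com.microsoft.azure.storage.blob.CloudBlob;
    import com.microsoft.azure.storage.blob.CloudBlobClient;
    import com.microsoft.azure.storage.blob.CloudBlobContainer;
    import com.microsoft.azure.storage.blob.CloudPageBlob;

    public class BlobSnapshotDemo {
        public static void main(String[] args) throws Exception {
            CloudStorageAccount account = CloudStorageAccount.parse(
                    System.getenv("AZURE_STORAGE_CONNECTION_STRING"));
            CloudBlobClient client = account.createCloudBlobClient();
            CloudBlobContainer container = client.getContainerReference("vhds"); // placeholder
            CloudPageBlob disk = container.getPageBlobReference("server-disk.vhd"); // placeholder
            // Capture the blob's state at this point in time.
            CloudBlob snapshot = disk.createSnapshot();
            System.out.println("Snapshot taken: " + snapshot.getSnapshotID());
        }
    }

Reverting the disk's contents then amounts to copying the snapshot back over the base blob.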
Pretty much the same story with AWS:
You can back up the data on your Amazon EBS volumes to Amazon S3 by taking point-in-time snapshots. Snapshots are incremental backups, which means that only the blocks on the device that have changed after your most recent snapshot are saved.
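A minimal sketch of that with the AWS SDK for Java (v1), assuming default credentials/region and a placeholder volume ID:

    import com.amazonaws.services.ec2.AmazonEC2;
    import com.amazonaws.services.ec2.AmazonEC2ClientBuilder;
    import com.amazonaws.services.ec2.model.CreateSnapshotRequest;
    import com.amazonaws.services.ec2.model.Snapshot;

    public class EbsSnapshotDemo {
        public static void main(String[] args) {
            AmazonEC2 ec2 = AmazonEC2ClientBuilder.defaultClient();
            CreateSnapshotRequest request = new CreateSnapshotRequest()
                    .withVolumeId("vol-0123456789abcdef0") // placeholder volume ID
                    .withDescription("Point-in-time snapshot before maintenance");
            Snapshot snapshot = ec2.createSnapshot(request).getSnapshot();
            System.out.println("Created " + snapshot.getSnapshotId()
                    + ", state: " + snapshot.getState());
        }
    }

To "revert", you create a new volume from the snapshot and attach it to the instance in place of the old one; that is the closest equivalent to the ESXi revert workflow.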
How about using a third-party backup solution like Veeam or CloudBerry to take image-based backup copies and replicate them onto your preferred cloud storage?
Veeam also supports instant VM recovery, you can immediately restore a VM into your production environment by running it directly from the backup file. Instant VM recovery helps improve recovery time objectives (RTO), minimise disruption and downtime of production VMs. It is like having a "temporary spare" for a VM: users remain productive while you can troubleshoot an issue with the failed VM.

Upgrade/Migrate HDInsight Cluster to Latest Version

I'm sure this is posted somewhere or has been communicated, but I just can't seem to find anything about upgrading/migrating an HDInsight cluster from one version to the next.
A little background. We've been using Hive with HDInsight to store all of our IIS logs since 1/24/2014. We love it and it provides good insight to our teams.
I recently was reviewing http://azure.microsoft.com/en-us/documentation/articles/hdinsight-component-versioning/ and noticed that our version of HDInsight (2.1.3.0.432823) is no longer supported and will be deprecated in May. That got me thinking about how to get onto version 3.2; I just can't seem to find anything about how to go about doing this.
Does anyone have any insight into whether this is possible and, if so, how?
HDInsight uses Azure Storage for persistent data, so you should be able to create a new cluster and point to the old data, as long as you are using wasb://*/* for your storage locations. This article has a great overview of the storage architecture: http://azure.microsoft.com/en-us/documentation/articles/hdinsight-use-blob-storage/
If you are using Hive and have not set up a customized metastore, then you may need to save or recreate some of the tables. Here's a blog post that covers some of those scenarios: http://blogs.msdn.com/b/bigdatasupport/archive/2014/05/01/hdinsight-backup-and-restore-hive-table.aspx
You can configure a new cluster and add the existing cluster's storage container as an "additional" storage account to test this out without first taking down the current cluster. Just be sure not to have both clusters using the same container as their default storage.
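Recreating a Hive table on the new cluster over the old data can be as simple as an external-table DDL pointing at the old container. A hedged sketch over Hive JDBC (cluster name, credentials, and paths are placeholders; the exact HiveServer2 URL parameters vary by cluster and Hive version):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class RecreateHiveTable {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            // HDInsight exposes HiveServer2 over HTTPS.
            Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://mycluster.azurehdinsight.net:443/default;"
                            + "ssl=true;transportMode=http;httpPath=/hive2",
                    "admin", "<password>"); // placeholder credentials
            try (Statement stmt = conn.createStatement()) {
                // External table: the data stays in the old cluster's container;
                // only the metadata is recreated on the new cluster.
                stmt.execute("CREATE EXTERNAL TABLE IF NOT EXISTS iislogs (line STRING) "
                        + "LOCATION 'wasb://oldcontainer@oldaccount.blob.core.windows.net"
                        + "/hive/warehouse/iislogs'");
            }
            conn.close();
        }
    }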
