How to scale out Azure Storage Queues

I am wondering how Azure handles the geographic distribution of Storage Queues.
If I have a storage queue set up in one region and I then want to scale out to other regions, what happens? Do I need to write code to handle the queues separately?
For example, Amazon Web Services has DynamoDB, which is globally distributed out of the box and provides the same performance everywhere.

I think a more logical comparison would be between Windows Azure Tables and DynamoDB. That said:
Windows Azure queues are assigned to a specific data center, and you can create additional queues in other data centers. Typically you'd place your queue in the same DC as the cloud service that works with it, but there's no requirement to do so (you'll get better performance and no outbound bandwidth charges when you access same-DC queues).
DynamoDB, from what I've read here, has the same model: Choose your data center for a table. Data is distributed across servers in the same region, not multiple regions (in other words, if you choose N. Virginia, that's where your data access point is).
Regarding your statements of DynamoDB "being globally distributed out of the box" and providing "the same performance everywhere" - I don't think that's the case (at least, I can't find any evidence supporting that assertion). Rather, DynamoDB is replicated to additional data centers for fault tolerance, as is Windows Azure Storage.
Bottom line: you'd have to manage resources allocated to multiple data centers, whether Windows Azure Tables, Windows Azure Queues, or DynamoDB.
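To make the bottom line concrete, here is a minimal sketch of what "managing resources in multiple data centers" looks like in application code: the app keeps its own map of region to storage account and picks the same-DC queue endpoint itself, since neither platform does this for you. All account names and regions below are made up for illustration.

```python
# Hypothetical sketch: a single queue lives in one data center, so the app
# must track one storage account per region and choose the right endpoint.
REGION_ACCOUNTS = {
    "eastus": "myappeastus",   # hypothetical storage account in East US
    "westeu": "myappwesteu",   # hypothetical storage account in West Europe
}

def queue_url(region: str, queue_name: str) -> str:
    """Build the per-region queue endpoint the app should talk to."""
    account = REGION_ACCOUNTS[region]
    return f"https://{account}.queue.core.windows.net/{queue_name}"

def nearest_queue(deployment_region: str, queue_name: str) -> str:
    """Prefer the same-DC queue (faster, no outbound bandwidth charges);
    fall back to the first configured region otherwise."""
    region = (deployment_region if deployment_region in REGION_ACCOUNTS
              else next(iter(REGION_ACCOUNTS)))
    return queue_url(region, queue_name)
```

The real SDK calls (creating queues, sending messages) would then be made against whichever endpoint this selection returns.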

Related

Clarification on Azure SQL Database Backup plan (short term retention)

I am confused about the Azure SQL Database backup plan (short-term backup retention).
As far as I understood:
In the DTU purchasing model, there is no extra charge for backup storage; you only pay for the redundancy type (such as LRS or ZRS).
In the vCore purchasing model, you have to pay for backup storage.
Am I right?
Does that mean I will not have any backups if I do not subscribe to backup storage in vCore?
Further, in the Azure pricing calculator, under vCore, General Purpose, there are two redundancy drop-down options (I am not talking about the long-term retention plan). What is the difference between them?
Thanks.
Will I not have any backups if I do not subscribe to backup storage in vCore?
Yes, in vCore, if you do not allocate a storage account for backups, you will not be able to perform backup operations, either manually or automatically. Azure will maintain access to your database according to the standard SLAs, but the infrastructure will not provide a way for you to restore your database to a point in time; only backups can do that for you. The storage costs are usually a very small component of your overall spend. Once a backup operation is complete, you can download the backup for local storage and then clear the blob, making this aspect virtually cost free, but you will need a storage account to complete the backup process at all.
in azure pricing calculator, in vCore, General purpose option, you have two redundancy drop down options
Are you referring to the Compute Redundancy:
Zone redundancy for Azure SQL Database general purpose tier
The zone redundant configuration utilizes Azure Availability Zones to replicate databases across multiple physical locations within an Azure region. By selecting zone redundancy, you can make your serverless and provisioned general purpose single databases and elastic pools resilient to a much larger set of failures, including catastrophic datacenter outages, without any changes of the application logic. This configuration offers 99.995% availability SLA and RPO=0. For more information see general purpose service tier zone redundant availability.
In the other tiers, these redundancy modes are referred to as LRS (Locally Redundant Storage) and ZRS (Zone Redundant Storage). Think of this as your choice of what happens when your data center is affected by some sort of geological or political event that takes the server cluster, pod, or whole data center offline.
Locally Redundant offers redundancy only within a geographically local area (often the same physical site). In general this protects against local hardware failures, but not usually against scenarios that take the whole data center offline. This is the minimum level of redundancy that Azure requires for its hardware management and maintenance plans.
Zone Redundant offers redundancy across multiple physically independent zones, still within the same Azure region. Each Azure availability zone is an individual physical location with its own independent networking, power, and cooling. ZRS provides a minimum of 99.9999999999% (12 nines) durability for objects during a given year.
There is a third type of redundancy offered in higher tiers: Geo-Redundant Storage (GRS). This has the same zone-level redundancy but configures additional replicas in other Azure regions around the world.
In the case of Azure SQL DB, these terms for Compute (i.e. the actual server and CPU) have almost identical implications to those of Storage redundancy. As for the available options, the pricing calculator is pretty well documented for everything else: use the info tips for quick info, and go to the reference pages for the extended information.
The specifics are listed here: Azure Storage redundancy. Redundancy in Azure is achieved via replication: an entire workable and usable copy of your database is maintained so that, in the event of a failure, the replica takes the load.
A special feature of replication is that you can actively use the replicated instance for read-only workloads. This gives us as developers and architects some interesting performance opportunities for moving complex reporting and analytic workloads away from the transactional data manipulation out of the box; traditionally this was a non-trivial configuration.
The RA prefix on redundancy options is an acronym for Read Access.
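As an illustration of how an application opts into a read-access replica, SQL clients typically add `ApplicationIntent=ReadOnly` to the connection string so the session is routed to a readable secondary rather than the primary. The sketch below just builds such a string; the server and database names are placeholders, and the exact set of keywords your driver needs may differ.

```python
# Hypothetical sketch: build a SQL connection string that targets either the
# primary (read-write) or a readable secondary via ApplicationIntent=ReadOnly.
def connection_string(server: str, database: str, read_only: bool = False) -> str:
    parts = [
        f"Server=tcp:{server}.database.windows.net,1433",
        f"Database={database}",
        "Encrypt=True",
    ]
    if read_only:
        # Routes the session to a readable secondary instead of the primary.
        parts.append("ApplicationIntent=ReadOnly")
    return ";".join(parts)
```

A reporting job would use `connection_string(server, db, read_only=True)`, while transactional code keeps the default read-write string.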

How to store temporary data in an Azure multi-instance (scale set) virtual machine?

We developed a server service that (in a few words) supports the communication between two devices. We want to take advantage of the scalability offered by an Azure scale set (a multi-instance VM), but we are not sure how to share memory between the instances.
Our service basically stores temporary data in the local virtual machine and these data are read, modified and sent to the devices connected to this server.
If these data are stored locally on one of the instances, the other instances cannot access them and do not have the same information. Is that correct?
If one of the devices starts making requests to the server, the instance that processes each request will not always be the same, so the data ends up spread across instances.
So the question might be, how to share memory between Azure instances?
Thanks
Depending on the type of data you want to share and how much latency matters, you could look at a shared back-end repository: Redis Cache is ideal as a distributed cache; SQL Azure if you want a relational db to store the data; storage queues, blob storage, or file storage in a storage account (the last of these lets you simply write to a mounted network drive from all VM instances). DocumentDB is another option, well suited to storing JSON data. There is also Service Fabric (low latency, but you need to re-architect/re-build parts of your solution).
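Whichever backend you pick, the application-side pattern is the same: move the per-device state out of instance-local memory and behind a small store interface. The sketch below uses an in-process stand-in purely for illustration; in production, every scale-set instance would point the same interface at a shared backend (e.g. Redis via redis-py, or Table storage), and all names here are hypothetical.

```python
# Sketch of the shared-state pattern: state lives behind a store interface,
# never in instance-local variables, so any instance can serve any request.
class StateStore:
    def get(self, key): ...
    def set(self, key, value, ttl_seconds=None): ...

class InMemoryStore(StateStore):
    """Illustrative stand-in only; swap for a Redis- or Table-backed store."""
    def __init__(self):
        self._data = {}
    def get(self, key):
        return self._data.get(key)
    def set(self, key, value, ttl_seconds=None):
        # A real backend would honor ttl_seconds to expire temporary data.
        self._data[key] = value

def handle_device_message(store: StateStore, device_id: str, payload: dict) -> dict:
    """Read-modify-write against the shared store, not local memory."""
    state = store.get(device_id) or {}
    state.update(payload)
    store.set(device_id, state, ttl_seconds=3600)
    return state
```

Because every instance goes through the same store, it no longer matters which instance the load balancer picks for a given request.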
You could use Service Fabric and take advantage of Reliable Collections to have your state automagically replicated across all instances.
From https://azure.microsoft.com/en-us/documentation/articles/service-fabric-reliable-services-reliable-collections/:
The classes in the Microsoft.ServiceFabric.Data.Collections namespace provide a set of out-of-the-box collections that automatically make your state highly available. Developers need to program only to the Reliable Collection APIs and let Reliable Collections manage the replicated and local state.
The key difference between Reliable Collections and other high-availability technologies (such as Redis, Azure Table service, and Azure Queue service) is that the state is kept locally in the service instance while also being made highly available.
Reliable Collections can be thought of as the natural evolution of the System.Collections classes: a new set of collections that are designed for the cloud and multi-computer applications without increasing complexity for the developer. As such, Reliable Collections are:
Replicated: State changes are replicated for high availability.
Persisted: Data is persisted to disk for durability against large-scale outages (for example, a datacenter power outage).
Asynchronous: APIs are asynchronous to ensure that threads are not blocked when incurring IO.
Transactional: APIs utilize the abstraction of transactions so you can manage multiple Reliable Collections within a service easily.
Working with Reliable Collections -
https://azure.microsoft.com/en-us/documentation/articles/service-fabric-work-with-reliable-collections/

Azure Traffic Manager for Cloud Services - What about storage access?

I have finally got the time to start looking at Azure. It looks good, and scaling is easy.
Azure SQL, Table Storage and Blob Storage should cover most of my needs: fast access to data, automatic replication, and failover to another datacenter.
Should the need arise for an app with fast global access, Traffic Manager is there, and one can route users by "Failover" or "Performance".
"Performance" is very nice for Cloud Services and web roles / worker roles... BUT... what about access to data in SQL Azure / Table Storage / Blob Storage?
I have tried searching the web (for what to do about this), but haven't found anything about Traffic Manager that mentions how to access data in such a scenario.
Have I missed anything?
Do people access the storage in the original data center (and, if that fails, use the geo-replication feature)? Is that fast enough? Is internal traffic on the MS network free across datacenters?
This seems like such a simple ...
Take a look at the guidance from Microsoft: Replicating, Distributing, and Synchronizing Data. You could use Service Bus to keep data centers in sync. This can cover SQL databases, storage, search indexes like Solr or ElasticSearch, and so on. The advantage over solutions like SQL Data Sync is that it's technology independent and can keep virtually all your data in sync.
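The shape of that Service Bus approach is a simple fan-out: publish every data change to a topic, and let a subscriber in each data center apply it to its local store. The in-memory "topic" below only illustrates the pattern; real code would use the Azure Service Bus SDK with one subscription per region, and all names here are made up.

```python
# Illustrative fan-out sketch: one published change is applied by every
# regional subscriber, keeping the per-region stores in sync.
class Topic:
    def __init__(self):
        self.subscribers = []
    def subscribe(self, handler):
        self.subscribers.append(handler)
    def publish(self, message):
        # Real Service Bus delivery is asynchronous and durable; this is not.
        for handler in self.subscribers:
            handler(message)

def make_region_store(topic: Topic) -> dict:
    """Create a regional store that applies every published change."""
    store = {}
    topic.subscribe(lambda msg: store.__setitem__(msg["key"], msg["value"]))
    return store
```

Because subscribers just apply change messages, the same mechanism can feed a SQL database in one region and a search index in another.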
In this episode of Channel 9 they state that Traffic Manager is only for Cloud Services as of now (Jan 2014) but support is coming for Azure Web Sites and other services. I agree that you should be able to ask for a Blob using a single global URL and expect that the content will be served from the closest datacenter.
There isn't a one-click easy to implement solution for this issue. The way you solve it will depend on where the data lives (ie. SQL Azure, Blob storage, etc) and your access patterns.
Do you have a small number of data requests that are not on a performance critical path in your code? Consider just using the main datacenter.
Do you have a large number of read-only type of requests? Consider doing a replication of the data to another datacenter.
Do you do a large number of reads and only a few writes? Consider duplicating the data among all datacenters: each write goes to all datacenters at the same time (incurring a performance penalty), and all reads go to the local datacenter (fast reads).
Is your data in SQL Azure? Consider using SQL Data Sync to keep multiple datacenters in sync.
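The "duplicate among all datacenters" option above can be sketched as a write-everywhere / read-local scheme. This is only an illustration of the trade-off (slow fan-out writes, fast local reads); replica names are made up, and real code would need retry and consistency handling that is omitted here.

```python
# Hypothetical sketch: writes fan out to every datacenter's replica,
# reads hit only the local replica.
class Replica:
    def __init__(self, name: str):
        self.name = name
        self.data = {}

def write_all(replicas, key, value):
    """Synchronous fan-out write; a partial failure here would need
    retry/compensation logic, omitted for brevity."""
    for r in replicas:
        r.data[key] = value

def read_local(local: Replica, key):
    """Fast read served entirely from the local datacenter."""
    return local.data.get(key)
```

The performance penalty on writes is the price of having every read answered locally.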

SQL Azure reliability and scalability

I need to make sure the availability of my database is high. Working with SQL Azure does not make that clear to me.
Is there a way to run multiple servers (so one takes over if another fails) under SQL Azure? Beyond that, is there something equivalent to increasing memory on the DB server to speed up database processing?
Read High Availability in the Intro to Azure SQL, and then read Business Continuity in Windows Azure SQL Database. To summarize:
Data durability and fault tolerance is enhanced by maintaining multiple copies of all data in different physical nodes located across fully independent physical sub-systems such as server racks and network routers. At any one time, Windows Azure SQL Database keeps three replicas of data running: one primary replica and two secondary replicas.
Right now there is no way to specify the hardware configuration for SQL Azure databases. It's totally out of your control, and from a SaaS perspective that makes sense. The backend management services are responsible for making sure you get the best performance possible.
If you need dedicated and reserved hardware for your SQL deployment, you may take a look at the IaaS offerings in Azure and start a VM with SQL Server installed; however, make sure you know the main differences between an IaaS and a PaaS offering.
I do not know what your high availability requirements are, but you should look at the SLAs provided by Microsoft. SQL Database offers 99.9% monthly availability.

How do I make my Windows Azure application resistant to Azure datacenter catastrophic event?

AFAIK, Amazon AWS offers so-called "regions" and "availability zones" to mitigate the risk of partial or complete datacenter outages. It looks like if I have copies of my application in two regions and one region goes down, my application can continue working as if nothing happened.
Is there something like that with Windows Azure? How do I address risk of datacenter catastrophic outage with Windows Azure?
Within a single data center, your Windows Azure application has the following benefits:
Going beyond one compute instance, your VMs are divided into fault domains across different physical areas. This way, even if an entire server rack went down, you'd still have compute running somewhere else.
With Windows Azure Storage and SQL Azure, storage is triple replicated. This is not eventual replication - when a write call returns, at least one replica has been written to.
Ok, that's the easy stuff. What if a data center disappears? Here are the features that will help you build DR into your application:
For SQL Azure, you can set up Data Sync. This facility synchronizes your SQL Azure database with either another SQL Azure database (presumably in another data center), or an on-premises SQL Server database. More info here. Since this feature is still considered a Preview feature, you have to go here to set it up.
For Azure storage (tables, blobs), you'll need to handle replication to a second data center yourself, as there is no built-in facility today. This can be done with, say, a background task that pulls data every hour and copies it to a storage account somewhere else. EDIT: Per Ryan's answer, there is geo-replication for blobs and tables. HOWEVER: aside from a mention in this blog post in December, and possibly at PDC, this is not live.
For Compute availability, you can set up Traffic Manager to load-balance across data centers. This feature is currently in CTP - visit the Beta area of the Windows Azure portal to sign up.
Remember that, with DR, whether in the cloud or on-premises, there are additional costs (such as bandwidth between data centers, storage costs for duplicate data in a secondary data center, and compute instances in additional data centers).
Just like with on-premises environments, DR needs to be carefully thought out and implemented.
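The "background task that pulls data every hour" mentioned above boils down to an incremental copy loop. As a rough sketch, model each container as a map of blob name to (last-modified time, content) and copy anything newer than the last sync; real code would use the storage SDK's list and copy operations, and everything here is illustrative.

```python
# Illustrative sketch of hourly cross-datacenter replication:
# copy any blob changed since the last sync (or missing on the secondary)
# from the primary container to the secondary container.
def sync_containers(primary: dict, secondary: dict, last_sync: float) -> int:
    """Containers are modeled as {name: (last_modified, content)}.
    Returns the number of blobs copied."""
    copied = 0
    for name, (modified, content) in primary.items():
        if modified > last_sync or name not in secondary:
            secondary[name] = (modified, content)
            copied += 1
    return copied
```

A scheduler (e.g. a worker role timer) would call this every hour, advancing `last_sync` after each successful pass so unchanged blobs are skipped.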
David's answer is pretty good, but one piece is incorrect. For Windows Azure blobs and tables, your data is actually geographically replicated today between sub-regions (e.g. North and South US). This is an async process that has a target of about a 10 min lag or so. This process is also out of your control and is purely for a data center loss. In total, your data is replicated 6 times in 2 different data centers when you use Windows Azure blobs and tables (impressive, no?).
If a data center was lost, they would flip over your DNS for blob and table storage to the other sub-region and your account would appear online again. This is true only for blobs and tables (not queues, not SQL Azure, etc).
So, for a true disaster recovery, you could use Data Sync for SQL Azure and Traffic Manager for compute (assuming you run a hot standby in another sub-region). If a datacenter was lost, Traffic Manager would route to the new sub-region and you would find your data there as well.
The one failure mode not accounted for here is an error being replicated across data centers. In that scenario, you may want to consider running Azure PaaS as part of the HP Cloud offering, in either a load-balanced or failover configuration.