Mixing DS with D1_V2 VM's in Windows Azure

I'm designing an Azure solution for a web application that requires two VMs: a web tier and a database tier.
The web tier hosts the web app, which does a relatively large amount of calculation work. The database tier is a normal SQL Server instance (roughly 100 databases, 500 GB in total).
Azure offers the DS series and the D1_V2 series. The DS series supports SSD drives; the D1_V2 doesn't, but has a 35% faster CPU than the DS.
Is my reasoning solid in thinking that I will be better off combining the two series, using the DS for the database tier (SSD will provide higher IOPS for the database) and the D1_V2 for faster processing in my calculation-heavy web app?
Any thoughts? Thanks!

Yes, you can combine those, because these are two completely different/standalone VMs.
It also makes sense that you use the V2 for your web server (due to its calculations) and a DS-series for your database server.
You should also use the local SSD of your SQL Server machine to boost performance, e.g. by moving tempdb to it and/or setting up Buffer Pool Extension targeting the local SSD.

Your reasoning is solid and that may very well be the best way to deploy your solution.
However, your methodology is perhaps rather flawed. What you say makes perfect sense, but you will never know whether it has any basis in reality until you test your application and see where the bottlenecks are.
Considering how easy it is to scale a VM up or down, it doesn't take much to deploy, monitor and adjust accordingly.
In principle what you suggest is fine, but reality crushes many a good principle.

Related

Is Amazon EC2 free tier server appropriate for my little web application?

I'm building a little software activation web service in Java, so I need a cloud-based server which will run Apache and Tomcat and MySQL.
It will get very little usage as I don't expect to sell very much product at first. I'll be very lucky if the server handles one quick activation a day ... if it got 20 in a day that would be an amazing success.
I'm looking at Amazon EC2 pricing here ...
https://aws.amazon.com/free/?all-free-tier.sort-by=item.additionalFields.SortRank&all-free-tier.sort-order=asc
I see that there is a "Free Tier" which provides "750 hours per month of Linux t2.micro or t3.micro instance". And it's free for a year.
STUPID QUESTION #1: 24h/day x 31 days/month is 744 hours ... so, does that mean I'm getting a free linux server running 24/7 for a year or is there a catch that I'm missing?
STUPID QUESTION #2: t2.micro/t2.micro has 1 vCPU, 1GB Memory ... is that enough power to run a simple Apache + Tomcat + MySQL web service reliably?
STUPID QUESTION #3: Any reason why I should skip the free tier and invest in a powerful pay $$$ option?

Yes. No catch. It's just not a very strong server.
That really depends on what that service does. Performance wise you need to pay attention to t2 instances being optimized for burst operations. That means they run full speed for a little while and then get throttled. But if you're talking about reliability, it's a whole other story. Just one machine is usually not enough for that. You want multiple machines in multiple data centers. What if one machine goes down? What if the whole data center goes down? It really depends on just how reliable you want it.
That really depends on what you're looking for. If you don't know yet, stick to free until you figure it out. I would even go for something simpler like Heroku at first. At least you won't have to take care of the reliability aspect as much.
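The arithmetic behind the first question can be checked with a short sketch. It assumes a single instance running 24/7 and uses the 750-hour monthly allowance quoted in the question; the function names are illustrative only.

```python
import calendar

# Monthly allowance quoted in the question.
FREE_TIER_HOURS = 750


def hours_in_month(year: int, month: int) -> int:
    """Hours in a calendar month, assuming the instance runs 24/7."""
    days = calendar.monthrange(year, month)[1]
    return days * 24


def fits_free_tier(year: int, month: int, instances: int = 1) -> bool:
    """True if `instances` always-on servers stay within the allowance."""
    return hours_in_month(year, month) * instances <= FREE_TIER_HOURS
```

Even a 31-day month comes to 744 hours, so one always-on instance fits; a second always-on instance in the same month would not.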

You describe your service as: "Accept an encrypted license key, decrypt it, verify it, return an encrypted boolean response".
This sounds like an excellent candidate for a serverless solution:
AWS API Gateway providing an HTTPS endpoint that the application can call
It then triggers an AWS Lambda function that performs the logic and exits
However, you also mention a MySQL database. This could be provided by Amazon RDS. Or, you could go serverless and use DynamoDB (a NoSQL database).
The benefit of a serverless architecture is that it can scale to handle high loads and doesn't cost you anything (except potentially for the database) when not being used.
There is a free tier available for AWS API Gateway, AWS Lambda, Amazon DynamoDB and Amazon RDS.
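As a rough illustration of the serverless approach, here is a minimal Python sketch of a Lambda handler behind an API Gateway proxy integration. The key format (`<license_id>.<signature>`), the `license_key` field name and the shared secret are all hypothetical, and an HMAC signature check stands in for the decrypt/verify/encrypt steps, which the question does not specify.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical shared secret; a real deployment would load it from a secrets store.
SECRET = b"replace-me"


def sign(license_id: str) -> str:
    """Return a URL-safe HMAC-SHA256 signature for a license id."""
    digest = hmac.new(SECRET, license_id.encode("utf-8"), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def is_valid(license_key: str) -> bool:
    """Check a key of the hypothetical form '<license_id>.<signature>'."""
    license_id, sep, signature = license_key.rpartition(".")
    if not sep or not license_id:
        return False
    expected = sign(license_id).encode("utf-8")
    return hmac.compare_digest(expected, signature.encode("utf-8"))


def lambda_handler(event, context):
    """Entry point; reads the JSON request body and returns a proxy-style response."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}
    key = body.get("license_key") if isinstance(body, dict) else None
    if not isinstance(key, str):
        return {"statusCode": 400, "body": json.dumps({"error": "license_key required"})}
    return {"statusCode": 200, "body": json.dumps({"valid": is_valid(key)})}
```

The function holds no state between calls, which is what lets the platform start and stop copies of it on demand.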

There might be a limitation on network traffic for EC2 instances. You should look into that before deciding to host a web service on it. There is even a possibility it could charge you for using too much network bandwidth, so scalability might be an issue. I suggest you try Heroku instead, and then switch to other app hosting services if and when you need to scale.

Yes, I have developed a low-to-medium-traffic web application with a MySQL backend this way. But please be sure about the number of users, as performance and scalability depend on it.

If you are expecting very little usage, EC2 is the best-matching free-tier service that AWS provides.
EC2 micro instances fall under the AWS Free Tier, which covers 750 hours of t2.micro instances. The servers are available with Linux as well as Windows.
On your second question, it depends on your application type. As per the question that you asked, 8GB is enough to run your Apache and SQL.
But when it comes to reliability, it's a different story. In most cases, one machine is insufficient. You'd want multiple machines in different data centers. So, in that case, it is better to move to another service.
On your third question, it also depends on the nature of your application. If your application has a high number of users and many concurrent processes, and you need to improve reliability, it is worth moving to a paid subscription.

What type of Azure resource should I use for hosting many databases

I have a project where I need to host many databases (500 and up),
and I am trying to find the best option for managing everything, considering all the options available these days and the price.
In the past I would have a virtual server with SQL Server on it, and I would create the databases on my own, and that was all.
But today I host my current project on Azure: a simple web server, with SQL Server, with one database.
And I do not know which resource to choose from Azure.
Is it SQL Data Warehouse? Or do I need to get a virtual machine?
Or is there another option?
I read all the information I found online, but it mostly confused me.
I hope someone can help me; I would like to hear from your experience.
Thank you in advance.

It all very much depends on the size and load of the databases. You have three options. First, you can get a VM yourself and run SQL Server there. You are pretty much in control of what is happening and you can host as many DBs as you want. However, you'll be in charge of backups, updates and maintenance. On the other hand, this is a pretty much fixed price.
Another option is to get SQL Server from Azure: you don't need to think much about backups, encryption, updates and other boring stuff. You can have up to 5000 DBs per server, and you can choose the size and performance tier of each database. However, that can be expensive, as you are charged per DB.
The third option is an Elastic Pool: this is basically a pile of DBs that share the same resources. It can be useful if you have a lot of small DBs with a small load. At your scale this will work out cheaper than paying per DB. However, it might not work if you have a very uneven load on some DBs: they can consume all the DTUs and starve the rest of your DBs of processing power.
So it is up to you what you want to do based on your conditions. Personally I would not go with a VM: too much hassle. I would recommend considering (based on the DBs' load) a combination of an Elastic Pool and stand-alone DBs.
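To compare the per-database and pooled options, a rough cost model like the one below may help. It is only a sketch: the function names are illustrative, and every price and pool capacity passed in is a placeholder you would replace with figures from the current pricing page.

```python
import math


def standalone_cost(db_count: int, price_per_db: float) -> float:
    """Monthly cost when every database is billed on its own."""
    return db_count * price_per_db


def pooled_cost(db_count: int, dbs_per_pool: int, price_per_pool: float) -> float:
    """Monthly cost when databases share pools of a fixed capacity."""
    pools = math.ceil(db_count / dbs_per_pool)
    return pools * price_per_pool
```

With made-up inputs, `standalone_cost(500, 10.0)` and `pooled_cost(500, 200, 400.0)` can be compared directly. The model ignores the uneven-load problem described above, so any database that regularly saturates a pool would need to be priced as stand-alone.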

In-memory caching in Azure function

There is a need to cache objects to improve the perf of my Azure function. I tried .NET ObjectCache (System.Runtime.Caching) and it worked well in my testing (tested with up to a 10-minute cache retention period).
In order to take this solution forward, I have a few quick questions:
What is the recycling policy of an Azure function? Is there any default? Can it be configured?
What are the cost implications?
Is my approach right, or are there better solutions?
If you know the answers to any of these, please help.
Thank you.

Javed,
An out-of-process solution such as Redis (or even using Table storage, depending on the workload) would be recommended.
As a rule of thumb, functions should be stateless, particularly if you're running in the dynamic runtime, where scaling operations (up and down) could happen at any time and your host is not guaranteed to stay up.
If you opt to use the classic hosting, you do have a little more flexibility, as you can enable the "always on" feature, but I'd still recommend the out-of-process approach. Running in the classic mode does have a cost implication as well, since you're no longer taking advantage of the consumption based billing model offered by the dynamic hosting.
I hope this helps!

If you just need a smallish key-value cache, you could use the file system. D:\HOME (also found in the environment variable %HOME%) is shared across all instances. I'm not sure if the capacities are any different for Azure Functions, but for Sites and WebJobs, Free and Shared sites get 1GB of space, Basic sites get 10GB, and Standard sites get 50GB.
Alternatively, you could try running .NET ObjectCache in production. It may survive multiple calls to the same instance (file system or static in-memory property). Note, this will not be shared across instances though so only use it as a best effort cache.
Note, both of these approaches pose problems for multi-tenant products as it could be an avenue for unintended cross-tenant data sharing or even more malicious activities like DNS cache poisoning. You'd want to implement authorization controls for these things just as if they came from a database.
As others have suggested, Functions ideally should be stateless and an out of process solution is probably best. I use DocumentDB because it has time-to-live functionality which is ideal for a cache. Redis is likely to be more performant especially if you don't need persistence across stop/restart.
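The "best effort" in-memory idea can be sketched as follows. This is a Python illustration rather than the .NET ObjectCache from the question, and the class and method names are hypothetical. The point is that every lookup carries a loader, so a recycled host or a different instance simply falls back to the real data source.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class BestEffortCache:
    """Per-process cache with a time-to-live; contents vanish if the host recycles."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps key -> (expiry time, cached value).
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        """Return a fresh cached value, or call `loader` and cache its result."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value
```

Because the loader is always supplied, correctness never depends on the cache surviving; it only saves work when the same instance happens to serve consecutive calls.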

New Azure SQL Database Services, how scalable and what are DTUs

The new Azure SQL Database Services look good. However I am trying to work out how scalable they really are.
So, for example, assume a 200 concurrent user system.
For Standard
Workgroup and cloud applications with "multiple" concurrent transactions
For Premium
Mission-critical, high transactional volume with "many" concurrent users
What do "Multiple" and "Many" mean?
Also Standard/S1 offers 15 DTUs while Standard/S2 offers 50 DTUs. What does this mean?
Going back to my 200 user example, what option should I be going for?
Azure SQL Database Link
Thanks
EDIT
Useful page on definitions
However what is "max sessions"? Is this the number of concurrent connections?

There are some great MSDN articles on Azure SQL Database, this one in particular has a great starting point for DTUs. http://msdn.microsoft.com/en-us/library/azure/dn741336.aspx and http://channel9.msdn.com/Series/Windows-Azure-Storage-SQL-Database-Tutorials/Scott-Klein-Video-02
In short, it's a way to understand the resources powering each performance level. One of the things we know from talking with Azure SQL Database customers is that they are a varied group. Some are most comfortable with the most absolute details (cores, memory, IOPS) and others are after a much more summarized level of information. There is no one-size-fits-all. DTU is meant for this latter group.
Regardless, one of the benefits of the cloud is that it's easy to start with one service tier and performance level and iterate. In Azure SQL Database specifically you can change the performance level while your application is up. During the change there is typically less than a second of elapsed time when DB connections are dropped. The internal workflow in our service for moving a DB from service tier/performance level follows the same pattern as the workflow for failing over nodes in our data centers. And nodes failing over happens all the time independent of service tier changes. In other words, you shouldn’t notice any difference in this regard relative to your past experience.
If DTU's aren't your thing, we also have a more detailed benchmark workload that may appeal. http://msdn.microsoft.com/en-us/library/azure/dn741327.aspx
Thanks Guy

It is really hard to tell without doing a test. By 200 users I assume you mean 200 people sitting at their computer at the same time doing stuff, not 200 users who log on twice a day. S2 allows 49 transactions per second which sounds about right, but you need to test. Also doing a lot of caching can't hurt.
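One rough way to sanity-check a figure like 49 transactions per second against 200 users is a back-of-the-envelope estimate. The sketch below assumes each active user issues one transaction every few seconds; that interval is a guess to be replaced with measurements from a real test.

```python
def required_tps(concurrent_users: int, seconds_between_transactions: float) -> float:
    """Average transactions per second if every user acts at a steady interval."""
    return concurrent_users / seconds_between_transactions


def headroom(tier_tps: float, concurrent_users: int,
             seconds_between_transactions: float) -> float:
    """Positive when the tier's rate exceeds the estimated demand."""
    return tier_tps - required_tps(concurrent_users, seconds_between_transactions)
```

With 200 users and one transaction every 5 seconds, the estimate is 40 per second, which fits under 49. At one transaction every 2 seconds it is 100 per second, which does not. Averages also hide bursts, which is another reason to test.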

Check out the new Elastic DB offering (Preview) announced at Build today. The pricing page has been updated with Elastic DB price information.

DTUs are based on a blended measure of CPU, memory, reads, and writes. As DTUs increase, the power offered by the performance level increases. Azure has different limits on the concurrent connections, memory, IO and CPU usage. Which tier one has to pick really depends upon
Number of concurrent users
Log rate
IO rate
CPU usage
Database size
For example, if you are designing a system where multiple users are reading and there are only a few writers, and if your application middle tier can cache the data as much as possible and only selective queries / application restart hit the database then you may not worry too much about the IO and CPU usage.
If many users are hitting the database at the same time, you may hit the concurrent connection limit and requests will be throttled. If you can control user requests coming to the database in your application then this shouldn't be a problem.
Log rate: Depends on the volume of data changes (including additional data being pumped into the system). I have seen applications pump data steadily and others pump it all at once. Selecting the right DTU level again depends on how well one can throttle at the application end and achieve a steady rate.
Database size: Basic, Standard, and Premium have different maximum allowed sizes, and this is another deciding factor. Features such as table compression help reduce the total size, and hence total IO.
Memory: Tuning expensive queries (joins, sorts, etc.) and enabling lock escalation / nolock scans help control memory usage.
A very common mistake people make with database systems is scaling up the database instead of tuning the queries and application logic. So testing and monitoring the resources and queries under different DTU limits is the best way of dealing with this.
If you choose the wrong DTU level, don't worry: you can always scale up or down in SQL DB, and it is a completely online operation.
Also, unless you have a strong reason not to, migrate to V12 to get even better performance and features.
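The advice about controlling user requests in the application, so that the concurrent-connection limit is not hit, can be sketched with a semaphore that caps how many callers talk to the database at once. This is a minimal Python illustration; the class name and the limit you pass in are hypothetical and are not tied to any particular tier's actual limits.

```python
import threading
from contextlib import contextmanager
from typing import Optional


class ConnectionThrottle:
    """Caps the number of callers that may use the database concurrently."""

    def __init__(self, max_concurrent: int):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    @contextmanager
    def slot(self, timeout: Optional[float] = None):
        """Wait for a free slot; raise TimeoutError if none frees up in time."""
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("no database slot available")
        try:
            yield
        finally:
            self._slots.release()
```

A caller would wrap each database call in `with throttle.slot(timeout=...):`, so that excess requests queue or fail fast in the middle tier instead of being throttled by the database.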

Is storing data in Windows Azure cheaper when using RavenDB rather than SQL Azure?

SQL Azure storage is a lot more expensive than Windows Azure Storage. Would implementing a no-sql solution like RavenDB allow me to store data on the cheaper Azure Storage?
Are there other things to consider, like backup, speed or security?
Thank you.

You have to consider that with SQL Azure you not only get the storage, but the database server too. If you implement RavenDB, you will need a worker role to host it in and, in order to allow for failure of that worker role, another worker role (replica), which also doubles up the storage.
Bear in mind that with SQL Azure you get a highly available (3x replicated with failover) SQL solution that surfaces a familiar (ADO.NET) API. Make your choices based on aspects other than storage cost, such as operational effort and development effort. If you choose RavenDB it should be because of the potential cost savings in development effort (because of the closeness of the document API to the object graph) and operational cost, because RavenDB is 'administered' as part of the application. Cost of storage of actual bytes, particularly at scale, is a marginal consideration.

Adding a bit to @Simon's answer: When considering Table Storage and its low cost, also consider whether you can use it directly, instead of going with an installed-and-managed-by-you NoSQL database engine. As it stands, Table Storage offers a schemaless solution that lets you store essentially a property bag within a row, indexed by partitionkey+rowkey. Does that work for you? Could you work with a few extra tables to give you additional indexing? If so, your storage cost is going to be really low (and still durable, triple-replicated).
If you find yourself writing significant code to manage Table Storage, then it may be more efficient to invest in the Compute instances needed to run RavenDB. When considering this, also consider that you'll likely want larger VM sizes if you're moving significant data (as you get approx. 100Mbps per core). A database like MongoDB, working with memory-mapped files, really ramps up speed-wise with more RAM. Not sure if this is the same with RavenDB.
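To get a feel for the Table Storage model before committing to it, here is an in-memory Python sketch of a property bag keyed by partition key plus row key, with a second table acting as an extra index. It does not use the Azure SDK; the class, the helper functions and the customer/email example are all hypothetical.

```python
from typing import Any, Dict, Optional, Tuple

Entity = Dict[str, Any]


class InMemoryTable:
    """Schemaless rows addressed only by (partition key, row key)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], Entity] = {}

    def upsert(self, partition_key: str, row_key: str, properties: Entity) -> None:
        self._rows[(partition_key, row_key)] = dict(properties)

    def get(self, partition_key: str, row_key: str) -> Optional[Entity]:
        return self._rows.get((partition_key, row_key))


def add_customer(customers: InMemoryTable, by_email: InMemoryTable,
                 customer_id: str, email: str, **props: Any) -> None:
    """Write the entity, then write an index row that points back to it."""
    customers.upsert("customer", customer_id, {"email": email, **props})
    by_email.upsert("email", email, {"customer_id": customer_id})


def find_by_email(customers: InMemoryTable, by_email: InMemoryTable,
                  email: str) -> Optional[Entity]:
    """Look up the index row first, then fetch the entity by its keys."""
    ref = by_email.get("email", email)
    return customers.get("customer", ref["customer_id"]) if ref else None
```

Every extra lookup path needs its own index table and its own write, which is the kind of management code the answer suggests weighing against running a full database engine.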
