Troubleshooting poor Azure Search performance

I am seeing erratic performance with an Azure Search Basic instance. Our index only has 1,544 documents and is 28MB in size, so I would expect searches to be very fast.
Azure Application Insights is reporting 4.7K calls to Azure Search from our app within the last 12 hours, with an average response time of 2.1s and a standard deviation of 35.8s(!).
I am personally seeing erratic performance during my manual testing. A query can take 20+ seconds at one moment, and then just a bit later the same query will take less than 100ms.
These queries are very simple. Here's an example query string:
api-version=2015-02-28&api-key=removed&search=&%24count=true&%24top=10&%24skip=0&searchMode=all&scoringProfile=FieldBoost&%24orderby=sortableTitle
What can I do to further troubleshoot this issue?

First off, I assume you have a fairly even distribution of queries, which based on your numbers means you are only seeing ~1 query per second. Does that sound correct? If not, and you are seeing large spikes of queries, it is very possible that you do not have enough replicas (copies of the index) to handle the query load. Please note that a single-replica Basic service is targeted to handle low single-digit QPS (although this can vary widely based on the complexity or simplicity of the queries). If you go beyond the limits of the service, latency can certainly become an issue. A good way to drill into this is Azure Search Traffic Analytics, which can expose search metrics such as the number of queries per second over various timeframes, as well as the latency we measure internally.
Also, and most importantly, please reuse HTTP connections as much as possible and take advantage of HTTP connection pooling. In .NET that means reusing a single HttpClient instance, or a single SearchIndexClient instance if you are using our Azure Search SDK.
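For what it's worth, here is a minimal sketch of the HttpClient-reuse pattern against the Azure Search REST API (the service name, index name, and query key below are placeholders; a shared SearchIndexClient would serve the same purpose):

    using System;
    using System.Net.Http;
    using System.Threading.Tasks;

    public static class SearchClientHolder
    {
        // One HttpClient for the lifetime of the process, so the underlying TCP
        // connections are pooled instead of being re-established on every query.
        private static readonly HttpClient Client = new HttpClient
        {
            BaseAddress = new Uri("https://YOUR-SERVICE.search.windows.net") // placeholder service
        };

        public static async Task<string> SearchAsync(string searchText)
        {
            var url = "/indexes/YOUR-INDEX/docs?api-version=2015-02-28" +
                      "&search=" + Uri.EscapeDataString(searchText) + "&$top=10";
            using (var request = new HttpRequestMessage(HttpMethod.Get, url))
            {
                request.Headers.Add("api-key", "YOUR-QUERY-KEY"); // placeholder query key
                var response = await Client.SendAsync(request);
                response.EnsureSuccessStatusCode();
                return await response.Content.ReadAsStringAsync();
            }
        }
    }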

I gathered more data and posted my results over at the Azure Search forum.
The slowdowns are due to the fact that we're running a single Basic instance, and code deployments by the Azure Search team cause a brief (a few minutes, in my experience) interruption or degradation in service.
I find running two Basic instances too expensive. Our search traffic doesn't warrant two instances except for availability purposes.
It's my understanding from the forum that the free tier has generally higher availability than a single basic instance. As a result, I have submitted a feedback item suggesting a paid shared tier that would provide more storage than the free tier while retaining higher availability than a single dedicated instance.

Related

Microservices With CosmosDB on Azure

I've read a bit about microservices and the favored approach appears to be a separate database for each microservice. With regards to Azure's CosmosDB, would that mean a separate Table for each service? What's the best way to architect this?
There is a huge variety of factors to consider here, which ultimately means there is no single right answer to this question; it will be very specific to the nature of the application you're trying to build. As such, broad statements attempting to offer "general" advice and patterns should be taken with a huge grain of salt. With Cosmos, a few of the many high-level things to consider when making your decision are as follows:
Partitioning: Cosmos collections support almost infinite scale given an appropriate partition key. So, for example, you could have a single collection and separate your services such that they each write to a distinct partition key (a rough sketch appears at the end of this answer). This would provide you with a form of service multi-tenancy, which might be perfectly appropriate for your particular application. However, throughput is also scaled at the collection level, so if certain services have much higher read and/or write requirements, this may not work for you and could be an indication that that particular service should use its own collection, which can be scaled independently.
Cost: You're billed per collection, with a minimum throughput requirement. Depending on the number and nature of your microservices, this could result in significantly higher costs for little gain.
Isolation: Again, depending on the nature of your application you might have a hard business requirement that data from different services be physically separate from each other which would force you to use separate collections.
The point that I'm trying to make here is that there is absolutely no right answer to this question. You need to weigh the pros and cons very carefully in the context of the solution you are trying to build and select the approach that is right for you.
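To make the partitioning point concrete, here is a rough sketch of a single shared collection where each microservice writes under its own partition key. The database, container, and property names are hypothetical, and it assumes the Microsoft.Azure.Cosmos .NET SDK:

    using System.Threading.Tasks;
    using Microsoft.Azure.Cosmos;

    // Hypothetical document written by an "orders" microservice into a shared collection.
    public class OrderDocument
    {
        public string id { get; set; }          // Cosmos document id
        public string serviceKey { get; set; }  // partition key value, e.g. "orders-<tenantId>"
        public string customerId { get; set; }
        public decimal total { get; set; }
    }

    public static class SharedCollectionWriter
    {
        public static async Task WriteAsync(CosmosClient client, OrderDocument doc)
        {
            // One shared container for all services; each service keeps to its own
            // partition key space, giving a form of service multi-tenancy.
            Container container = client.GetContainer("app-db", "shared-data"); // hypothetical names
            await container.CreateItemAsync(doc, new PartitionKey(doc.serviceKey));
        }
    }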

Azure SQL Performance

I have an Azure SQL Database S1 Standard (20DTU) and I'm seeing vast variations in performance. I have a number of queries that power a set of reports on a small web site. When running these queries through the Management Studio the performance varies from 0 to 60 seconds. The site isn't public so there's no traffic yet - only me. Looking at the DTU usage, it spikes at around 50%. Can anyone help me understand where the performance difference is coming from?
You can follow this link to troubleshoot your query performance: http://social.technet.microsoft.com/wiki/contents/articles/1104.troubleshoot-and-optimize-queries-with-azure-sql-database.aspx. Enabling the Query Store is another option if you are on V12.
There are various factors that impact query performance: a cold buffer pool, SQL instance restarts due to Azure maintenance (which clear the buffer pool), etc.
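As a rough sketch (connection details are placeholders, and it assumes ADO.NET / System.Data.SqlClient), you can turn on the Query Store and pull the most expensive statements from the DMVs directly from code:

    using System;
    using System.Data.SqlClient;

    class QueryDiagnostics
    {
        static void Run(string connStr) // e.g. "Server=tcp:YOUR-SERVER.database.windows.net;Database=YOUR-DB;..."
        {
            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();

                // Enable the Query Store (V12) so slow executions are captured with their plans.
                using (var cmd = new SqlCommand("ALTER DATABASE CURRENT SET QUERY_STORE = ON;", conn))
                {
                    cmd.ExecuteNonQuery();
                }

                // Look at the statements with the highest average duration since the last restart.
                const string topQueries = @"
                    SELECT TOP 10
                           qs.total_elapsed_time / qs.execution_count AS avg_duration_us,
                           qs.execution_count,
                           SUBSTRING(st.text, 1, 200) AS query_text
                    FROM sys.dm_exec_query_stats qs
                    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
                    ORDER BY avg_duration_us DESC;";
                using (var cmd = new SqlCommand(topQueries, conn))
                using (var reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        Console.WriteLine($"{reader["avg_duration_us"]} us avg: {reader["query_text"]}");
                    }
                }
            }
        }
    }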

New Azure SQL Database Services, how scalable and what are DTUs

The new Azure SQL Database services look good. However, I am trying to work out how scalable they really are.
So, for example, assume a 200 concurrent user system.
For Standard
Workgroup and cloud applications with "multiple" concurrent transactions
For Premium
Mission-critical, high transactional volume with "many" concurrent users
What does "Multiple" and "Many" mean?
Also Standard/S1 offers 15 DTUs while Standard/S2 offers 50 DTUs. What does this mean?
Going back to my 200 user example, what option should I be going for?
Azure SQL Database Link
Thanks
EDIT
Useful page on definitions
However what is "max sessions"? Is this the number of concurrent connections?
There are some great MSDN articles on Azure SQL Database; this one in particular is a great starting point for DTUs: http://msdn.microsoft.com/en-us/library/azure/dn741336.aspx. See also http://channel9.msdn.com/Series/Windows-Azure-Storage-SQL-Database-Tutorials/Scott-Klein-Video-02
In short, it's a way to understand the resources powering each performance level. One of the things we know from talking with Azure SQL Database customers is that they are a varied group. Some are most comfortable with the absolute details: cores, memory, IOPS; others are after a much more summarized level of information. There is no one-size-fits-all. DTUs are meant for this latter group.
Regardless, one of the benefits of the cloud is that it's easy to start with one service tier and performance level and iterate. In Azure SQL Database specifically, you can change the performance level while your application is up. During the change there is typically less than a second of elapsed time when DB connections are dropped. The internal workflow in our service for moving a DB between service tiers/performance levels follows the same pattern as the workflow for failing over nodes in our data centers, and nodes fail over all the time independent of service tier changes. In other words, you shouldn't notice any difference in this regard relative to your past experience.
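The change can also be scripted. A rough sketch using ADO.NET against the logical server's master database (server, database, and credentials are placeholders):

    using System.Data.SqlClient;

    class ScaleDatabase
    {
        static void ScaleToS2()
        {
            // Connect to the logical server's master database with an admin login (placeholders).
            var connStr = "Server=tcp:YOUR-SERVER.database.windows.net;Database=master;" +
                          "User ID=YOUR-ADMIN;Password=...;Encrypt=True;";
            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();
                // Move the database to Standard/S2. The operation runs online;
                // connections are only dropped briefly at the very end.
                var sql = "ALTER DATABASE [YOUR-DB] MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S2');";
                using (var cmd = new SqlCommand(sql, conn))
                {
                    cmd.ExecuteNonQuery();
                }
            }
        }
    }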
If DTUs aren't your thing, we also have a more detailed benchmark workload that may appeal: http://msdn.microsoft.com/en-us/library/azure/dn741327.aspx
Thanks Guy
It is really hard to tell without doing a test. By 200 users I assume you mean 200 people sitting at their computers at the same time doing stuff, not 200 users who log on twice a day. S2 allows 49 transactions per second, which sounds about right, but you need to test. Also, doing a lot of caching can't hurt.
Check out the new Elastic DB offering (Preview) announced at Build today. The pricing page has been updated with Elastic DB price information.
DTUs are based on a blended measure of CPU, memory, reads, and writes. As DTUs increase, the power offered by the performance level increases. Azure has different limits on concurrent connections, memory, IO, and CPU usage. Which tier to pick really depends upon:
#concurrent users
Log rate
IO rate
CPU usage
Database size
For example, if you are designing a system where multiple users are reading and there are only a few writers, and if your application's middle tier can cache data as much as possible so that only selective queries or application restarts hit the database, then you may not need to worry too much about IO and CPU usage.
If many users are hitting the database at the same time, you may hit the concurrent connection limit and requests will be throttled. If you can control user requests coming to the database in your application then this shouldn't be a problem.
Log rate: Depends upon the volume of data changes (including additional data being pumped into the system). I have seen applications pumping data steadily versus pumping it all at once. Selecting the right DTU again depends upon whether you can throttle at the application end and maintain a steady rate.
Database size: Basic, Standard, and Premium have different maximum allowed sizes, and this is another deciding factor. Features such as table compression help reduce the total size, and hence total IO.
Memory: Tuning expensive queries (joins, sorts, etc.) and enabling lock escalation / NOLOCK scans help control memory usage.
A very common mistake people make with database systems is scaling up the database instead of tuning the queries and application logic. So testing and monitoring resources/queries at different DTU limits is the best way of dealing with this.
If you choose the wrong DTU, don't worry: you can always scale up/down in SQL DB, and it is a completely online operation.
Also, unless you have a strong reason not to, migrate to V12 to get even better performance and features.

How does Azure DocumentDB scale? And do I need to worry about it?

I've got an application that's outgrowing SQL Azure - at the price I'm willing to pay, at any rate - and I'm interested in investigating Azure DocumentDB. The preview clearly has distinct scalability limits (as described here, for instance), but I think I could probably get away with those for the preview period, provided I'm using it correctly.
So here's the question I've got. How do I need to design my application to take advantage of the built-in scalability of Azure DocumentDB? For instance, I know that with Azure Table Storage - that cheap but highly limited alternative - you need to structure all your data in a two-level hierarchy: PartitionKey and RowKey. Provided you do that (which is well-nigh impossible in a real-world application), ATS (as I understand it) moves partitions around behind the scenes, from machine to machine, so that you get near-infinite scalability. Awesome, and you never have to think about it.
Scaling out with SQL Server is obviously much more complicated - you need to design your own sharding system, deal with figuring out which server the shard in question sits on, and so forth. Possible, and done right quite scalable, but complex and painful.
So how does scalability work with DocumentDB? It promises arbitrary scalability, but how does the storage engine work behind the scenes? I see that it has "Databases", and each database can have some number of "Collections", and so forth. But how does its arbitrary scalability map to these other concepts? If I have a SQL table that contains hundreds of millions of rows, am I going to get the scalability I need if I put all this data into one collection? Or do I need to manually spread it across multiple collections, sharded somehow? Or across multiple DB's? Or is DocumentDB somehow smart enough to coalesce queries in a performant way from across multiple machines, without me having to think about any of it? Or...?
I've been looking around, and haven't yet found any guidance on how to approach this. Very interested in what other people have found or what MS recommends.
Update: As of April 2016, DocumentDB has introduced the concept of a partitioned collection, which allows you to scale out and take advantage of server-side partitioning.
A single DocumentDB database can scale practically to an unlimited amount of document storage partitioned by collections (in other words, you can scale out by adding more collections).
Each collection provides 10 GB of storage and a variable amount of throughput (based on performance level). A collection also provides the scope for document storage and query execution, and is the transaction domain for all the documents contained within it.
Source: http://azure.microsoft.com/en-us/documentation/articles/documentdb-manage/
Here's a link to a blog post I wrote on scaling and partitioning data for a multi-tenant application on DocumentDB.
With the latest version of DocumentDB, things have changed. There is still a 10 GB limit, but in the past it was up to you to figure out how to split your data across multiple collections to avoid hitting the 10 GB per-collection limit.
Instead, you can now specify a partition key and DocumentDB handles the partitioning for you. For example, if you have log data, you may want to partition on the date value in your JSON documents, so that a new partition is created each day.
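A rough sketch of what that looks like, assuming the classic Microsoft.Azure.Documents .NET SDK (the account endpoint, key, and database/collection names are placeholders):

    using System;
    using System.Collections.ObjectModel;
    using System.Threading.Tasks;
    using Microsoft.Azure.Documents;
    using Microsoft.Azure.Documents.Client;

    class PartitionedLogs
    {
        static async Task CreateAndWriteAsync()
        {
            var client = new DocumentClient(
                new Uri("https://YOUR-ACCOUNT.documents.azure.com:443/"), "YOUR-KEY"); // placeholders

            // Create a partitioned collection keyed on the /date property, so each
            // day's log documents land in their own logical partition.
            var collection = new DocumentCollection
            {
                Id = "logs",
                PartitionKey = new PartitionKeyDefinition
                {
                    Paths = new Collection<string> { "/date" }
                }
            };
            await client.CreateDocumentCollectionAsync(
                UriFactory.CreateDatabaseUri("app-db"),
                collection,
                new RequestOptions { OfferThroughput = 10100 }); // check the current minimum RU/s for partitioned collections

            // Writes only need to include the partition key property in the document.
            await client.CreateDocumentAsync(
                UriFactory.CreateDocumentCollectionUri("app-db", "logs"),
                new { id = Guid.NewGuid().ToString(), date = "2016-04-21", message = "sample log entry" });
        }
    }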
You can fan out queries across partitions as described here: http://stuartmcleantech.blogspot.co.uk/2016/03/scalable-querying-multiple-azure.html

How to scale SQL Azure?

I want to host my WCF services in the Azure cloud for scalability reasons. For example, there will be a read-data action, and it will be under high load (1,000+ users/sec).
(Like in my previous question)
Also, I have a limitation of a 1-second timeout for any request.
My service will be connected to SQL Azure. I am choosing it because of its low latency (not more than 7 ms according to Microsoft's benchmark).
How many concurrent connections can SQL Azure hold per instance/database?
Is there any way to scale SQL Azure when I reach the connection limit per instance?
Are there other solutions or options for my scenario?
Thanks.
One thing to keep in mind is that you will need to make sure you are leveraging connection pooling to its maximum. Using a service account instead of different logins is an important step to ensure proper connection pooling.
Another consideration is the use of MARS (Multiple Active Result Sets). If you have many requests coming through, you may want to pool them together into a single request, hence a single connection, and return multiple result sets. In this post I discuss how to implement one-way queuing of SQL statements; this may not work for you as-is because you may be expecting a response, but it may give you some ideas on how to batch requests to minimize the number of connections and minimize wait time.
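A minimal sketch of the single-service-account idea (server, database, and login below are placeholders): ADO.NET pools connections per unique connection string, so keeping one string and one login means every request draws from the same pool, and MultipleActiveResultSets=True turns on MARS.

    using System.Data.SqlClient;

    static class Db
    {
        // One shared connection string with a single service login. ADO.NET keys its
        // connection pool on the exact connection string, so all requests share the pool.
        // MultipleActiveResultSets=True enables MARS so one connection can carry
        // several result sets.
        public const string ConnectionString =
            "Server=tcp:YOUR-SERVER.database.windows.net;Database=YOUR-DB;" +
            "User ID=svc_app;Password=...;Encrypt=True;MultipleActiveResultSets=True;";

        public static SqlConnection Open()
        {
            var conn = new SqlConnection(ConnectionString);
            conn.Open();      // takes a connection from the pool
            return conn;      // Dispose() returns it to the pool rather than closing it
        }
    }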
Finally you can take a look at this tool I wrote last year to test connection/statements against SQL Azure. The tool automatically turns off connection pooling to measure the effects of concurrency. You can download it here.
I also wrote the Enzo Shard Library on CodePlex. Let me know if you have any questions if you decide to investigate the library for your project. Note that the library will evolve to support the future capabilities of SQL Azure Data Federation as well.
It appears there is no direct limit to the number of connections available per SQL Azure instance, but Microsoft state that they reserve the right to throttle connections in situations where resource use is regarded as "excessive".
There's some information on this here, also details on what may happen in this situation here.
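If you do get throttled, the usual mitigation is to retry with a back-off. A rough sketch (the error numbers are the commonly documented SQL Azure throttling/resource-limit codes; treat the list and the back-off policy as a starting point, not gospel):

    using System;
    using System.Data.SqlClient;
    using System.Threading;

    static class ThrottleAwareSql
    {
        // Error numbers SQL Azure commonly raises when it throttles or hits resource limits.
        private static readonly int[] TransientErrors = { 40501, 10928, 10929, 40197 };

        public static void ExecuteWithRetry(string connStr, string sql, int maxAttempts = 4)
        {
            for (int attempt = 1; ; attempt++)
            {
                try
                {
                    using (var conn = new SqlConnection(connStr))
                    using (var cmd = new SqlCommand(sql, conn))
                    {
                        conn.Open();
                        cmd.ExecuteNonQuery();
                        return;
                    }
                }
                catch (SqlException ex) when (Array.IndexOf(TransientErrors, ex.Number) >= 0
                                              && attempt < maxAttempts)
                {
                    // Back off exponentially so the retries don't make the throttling worse.
                    Thread.Sleep(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
                }
            }
        }
    }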
A good work-around is to consider "sharding", where you partition your data on some easily-definable criteria and have multiple databases. This does, of course, incur additional cost. A neat implementation of that is here: http://enzosqlshard.codeplex.com/
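The routing side of a shard scheme can be as simple as a lookup from the sharding criterion to a connection string. A very rough sketch (shard boundaries, servers, and names are all hypothetical):

    using System.Collections.Generic;
    using System.Data.SqlClient;

    static class ShardMap
    {
        // Hypothetical static shard map: customers A-M live in one database, N-Z in another.
        private static readonly Dictionary<string, string> Shards = new Dictionary<string, string>
        {
            ["A-M"] = "Server=tcp:srv1.database.windows.net;Database=app_shard1;...",
            ["N-Z"] = "Server=tcp:srv2.database.windows.net;Database=app_shard2;..."
        };

        public static SqlConnection OpenForCustomer(string customerName)
        {
            var key = char.ToUpperInvariant(customerName[0]) <= 'M' ? "A-M" : "N-Z";
            var conn = new SqlConnection(Shards[key]);
            conn.Open();
            return conn;
        }
    }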
Also: Azurescope have had some interesting benchmarks here: http://azurescope.cloudapp.net/BestPractices/#ed6a21ed-ad51-4b47-b69c-72de21776f6a (unfortunately, removed early 2012)
Is there any way to scale SQL Azure when I reach the connection limit per instance?
In addition to the Enzo sql sharding suggestion, there are a couple of Microsoft products/features under construction to assist with scaling SQL Azure. These are CTP (at best) but may provide some scalability options for you by allowing you to spread the load across multiple SQL Azure databases:
SQL Azure federations - http://convective.wordpress.com/2011/05/02/sql-azure-federations/
SQL Azure Data Sync - http://www.microsoft.com/windowsazure/sqlazure/datasync/
