Two Azure web apps sending emails at very different speeds - azure

I have deployed two Azure web apps containing web jobs that send emails. It is the same code deployed with minor web.config changes, pointing to different dbs with the same structure. Both use the same SMTP channel (smtp.office365.com, port 587), yet server A, which is on a higher spec, takes 6 seconds to send an email, while server B, on a lower spec, sends one in under a second. Both are located in South Africa North. The performance measurement covers only the sending of the email, so it can't be a db issue.
Both servers are operating well and I can't see any obvious performance issues. The email send times stay consistently around the same speed throughout the day.
Where should I look to resolve this difference?

You can follow these troubleshooting steps to understand the web app's slow performance:
You are probably right that it is not a db issue, but recheck those minor web.config changes that point the two deployments at different dbs.
Troubleshoot the slower web app as you would any performance problem: check Service Health, monitor the Azure WebJobs, and review the App Service metrics.
Check whether any program code or database queries are inefficient.
You can enable application diagnostics and use the Application Insights Profiler.
Consider AutoHeal: it recycles the WP (worker process) when the triggers you configure fire, for example when requests take longer than a given time. You can restart the web app from the portal yourself, but if you want this to happen automatically, AutoHeal is the way to do it.
To automate this on your web app, add the triggers to the web.config file in the site root.
Also confirm there is no link between the database and sending the emails, i.e. that you are not reading data from the db before each email is sent.
To see where the six seconds actually goes, you can also time the send yourself; a small sketch follows this answer.
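To pin down where the time goes, it can help to time the send from inside the web job. Below is a minimal, hypothetical sketch (not the poster's actual code) using System.Net.Mail with the smtp.office365.com settings from the question; the addresses and credentials are placeholders. Since SmtpClient normally keeps the connection open for further sends on the same instance, comparing the first and second send roughly separates connection/TLS/authentication cost from per-message transfer cost.

```csharp
using System;
using System.Diagnostics;
using System.Net;
using System.Net.Mail;

class SmtpTiming
{
    static void Main()
    {
        // Placeholder credentials and addresses; substitute real values.
        using (var client = new SmtpClient("smtp.office365.com", 587)
        {
            EnableSsl = true,
            Credentials = new NetworkCredential("sender@example.com", "app-password")
        })
        {
            for (int i = 1; i <= 2; i++)
            {
                using (var message = new MailMessage("sender@example.com", "recipient@example.com",
                                                     $"Timing test {i}", "Body"))
                {
                    var sw = Stopwatch.StartNew();
                    client.Send(message);   // The first send includes connect/TLS/auth; later sends on the same client reuse the connection.
                    sw.Stop();
                    Console.WriteLine($"Send {i}: {sw.ElapsedMilliseconds} ms");
                }
            }
        }
    }
}
```

If both sends on the slow server take about 6 seconds, the time is going into the message exchange or the network path rather than connection setup, which would point toward networking or the mail service rather than the app code.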

Related

Azure SFTP Logic App

I have an Azure Logic App that monitors an SFTP site for new files, and if it finds one, it sends a message to an Azure Queue for subsequent processing, then deletes the file. My application has grown in scale and a single logic app seems to only be grabbing 5-10 files a minute.
Is it possible to set up a second (third, fourth, etc.) Logic App that monitors the same SFTP site without the two apps conflicting/colliding with each other? I also see that there is a "High Throughput" setting that seems interesting, but I'm not sure it is what I need. My ultimate goal is to process more files faster, and I am considering swapping the Logic App out for a scheduled Web Job that monitors the SFTP site. Since I am live and files are pouring in, I am a little reluctant to change anything until I know it's safe.
Any insight would be appreciated.
Thanks!!
Logic Apps are serverless. If you select the pricing model based on 'number of executions', it can hurt performance, because Microsoft allocates shared resources for that kind of pricing model and runs the executions on whichever server is free. I would recommend attaching an App Service plan and selecting the 'per minute' pricing model.
One more point: if you need long-running operations, an Azure Logic App is not the appropriate tool, but since you are doing enterprise integration, a Logic App is a good choice here. I would recommend dividing this functionality between the Logic App and an Azure Function, or Microsoft Flow.
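To illustrate the split suggested above: the Logic App keeps the SFTP trigger and drops a message on the queue, while the heavier per-file work moves to a queue-triggered Azure Function, which scales out with queue depth. This is only a hypothetical sketch; the queue name, connection setting name and processing step are placeholders, not details from the question.

```csharp
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class ProcessSftpFile
{
    // Fires once for each message the Logic App enqueues; instances scale out with queue depth.
    [FunctionName("ProcessSftpFile")]
    public static void Run(
        [QueueTrigger("sftp-files", Connection = "AzureWebJobsStorage")] string fileMessage,
        ILogger log)
    {
        log.LogInformation("Processing file message: {message}", fileMessage);
        // Placeholder: parse the message and do the real per-file processing here.
    }
}
```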

Monitor when Azure Web App is unloaded?

What would be the best way to monitor when our Azure web app is being unloaded when no requests have been made to the web app for a certain amount of time?
Enabling the log stream for the web server doesn't seem to reveal anything of use.
Any hints much appreciated!
You can use Azure Application Insights to create a web test that will alert you when the site is not available anymore. It will ping your site from the data centers you select and perform some action you select (mail, webhook, etc).
However, if you want your web app to stay online, you could upgrade its plan to at least Basic and enable Always On under settings.
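If you specifically want to record the moment the app is unloaded (rather than just noticing it has gone cold), one further option is to log an event from the shutdown hook yourself. This is only a rough sketch, assuming a classic ASP.NET app with the Application Insights SDK already configured; the event names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Web;
using System.Web.Hosting;
using Microsoft.ApplicationInsights;

public class Global : HttpApplication
{
    private static readonly TelemetryClient Telemetry = new TelemetryClient();

    protected void Application_Start(object sender, EventArgs e)
    {
        Telemetry.TrackEvent("AppDomainStarted");
    }

    protected void Application_End(object sender, EventArgs e)
    {
        // Fires when the app domain is torn down, e.g. after the idle timeout.
        Telemetry.TrackEvent("AppDomainUnloaded", new Dictionary<string, string>
        {
            ["reason"] = HostingEnvironment.ShutdownReason.ToString()
        });
        Telemetry.Flush();                      // Push the event out before the process exits.
        System.Threading.Thread.Sleep(2000);    // Give the in-memory channel a moment to transmit.
    }
}
```

HostingEnvironment.ShutdownReason should show IdleTimeout when the unload is caused by the idle timer, which distinguishes it from config-change or other recycles.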
In addition to kim's response:
If you are running your web app in the Standard pricing tier, Web Apps lets you monitor two endpoints from three geographic locations.
Endpoint monitoring configures web tests from geo-distributed locations that test response time and uptime of web URLs. The test performs an HTTP GET operation on the web URL to determine the response time and uptime from each location. Each configured location runs a test every five minutes.
Uptime is monitored using HTTP response codes, and response time is measured in milliseconds. A monitoring test fails if the HTTP response code is greater than or equal to 400 or if the response takes more than 30 seconds. An endpoint is considered available if its monitoring tests succeed from all the specified locations.
Web Apps also provides you with the ability to troubleshoot issues related to your web app by looking at HTTP logs, event logs, process dumps, and more. You can access all this information using our support portal at http://<your-app-name>.scm.azurewebsites.net/Support
The Azure App Service support portal provides you with three separate tabs to support the three steps of a common troubleshooting scenario:
-Observe current behavior
-Analyze by collecting diagnostics information and running the built-in analyzers
-Mitigate
If the issue is happening right now, click Analyze > Diagnostics > Diagnose Now to create a diagnostic session for you, which collects HTTP logs, event viewer logs, memory dumps, PHP error logs, and PHP process report.
Once the data is collected, the support portal runs an analysis on the data and provides you with an HTML report.
If you want to download the data, it is stored by default in the D:\home\data\DaaS folder.
Hope this helps.

Azure Long Response Time bottleneck?

How does one diagnose performance (response time) bottlenecks in Azure?
I've got a .NET Core website on Azure that consists of a web app service and one SQL database.
I've set up load test and deployed it via the cloud to hit the website. Configuration properties for the load test agents are
4 cores
start at 10 simultaneous users, + 10 every 20 seconds, up to 150 users
5 second think time between requests
Web app resource allocation is as follows
2 instances of
4 Core, 7GB RAM (S3 Standard)
The screenshot shows the hardware utilization during the load test (two runs, around 1pm and 1:30pm).
It seems reasonable, except that my response times are, in my opinion, too slow considering the hardware isn't stressed at all. For instance, at 10 users my response time starts at 20ms, but at 150 users (at the end of the test) I'm seeing 5-second response times.
For the last portion of the test, my requests per second were at about 50.
Database performance, at 100 DTUs, doesn't seem to be a factor.
What else can I do to diagnose slow response times? If the web server hardware isn't pegged, and the database isn't even sneezing, what other knobs can I turn on Azure?
A long response time bottleneck can have various causes, for example bandwidth restrictions, resource limits, bad application design, dependencies on tightly coupled components, etc. For more information about how to troubleshoot slow web app performance issues, please refer to the documentation. Below are some snippets from the document.
Enable diagnostics logging for your web app.
WebApp provides diagnostic functionality for logging information from both the web server and the web application.
We can enable Detailed Error Logging, Failed Request Tracing and Web Server Logging for web server diagnostics.
Use the Kudu debug console (https://<your-app-name>.scm.azurewebsites.net/).
Kudu provides environment settings for your application, a log stream, and diagnostic dumps.
We can also use Azure Application Insights to monitor the usage and performance of our app; it gives us more detailed information about requests, exceptions, response times and so on (a minimal setup sketch follows the article links below).
The more detail we have about application exceptions, failed requests, server logs and application logs, the easier it is to diagnose.
There are also some related articles about how to diagnose a Web App and how to use Application Insights:
Enable diagnostics logging for web apps in Azure App Service
Monitor performance in web applications
Diagnose exceptions in your web apps with Application Insights
Using Search in Application Insights
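Since the site in the question is .NET Core, one concrete way to get that missing detail is to add the Application Insights SDK the answer above recommends: it records per-request server time plus the SQL and HTTP dependency calls, so a slow request can be broken down into where the time is actually spent. A minimal sketch, assuming the Microsoft.ApplicationInsights.AspNetCore package and an instrumentation key in configuration (an era-appropriate 2.x-style Startup, not the poster's code):

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Registers request, dependency (SQL/HTTP) and exception telemetry collection.
        services.AddApplicationInsightsTelemetry();
        services.AddMvc();
    }

    public void Configure(IApplicationBuilder app)
    {
        app.UseMvc();
    }
}
```

Once telemetry is flowing, the end-to-end transaction view in the portal should show whether the extra seconds at 150 users sit in SQL calls, in your own code, or in queueing before your code runs.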
Try using the New Relic extension.
It provides great insight into response times and a lot more, even with the free account.
You can also enable Application Insights on the web app. It will provide you with details on response times and more.

Azure WebJobs for Aggregation

I'm trying to figure out a solution for recurring data aggregation of several thousand remote XML and JSON data files, by using Azure queues and WebJobs to fetch the data.
Basically, an input endpoint URL of some sort would be called (with a data URL as a parameter) on an Azure website/app. It should trigger a WebJobs background job (or the job could run continuously and check the queue periodically for new work), fetch the data URL, and then call back an external endpoint URL on completion.
Now the main concern is the volume and its performance/scaling/pricing overhead. There will be around 10,000 URLs to be fetched every 10-60 minutes (most URLs will be fetched once every 60 minutes). With regards to this scenario of recurring high-volume background jobs, I have a couple of questions:
Is Azure WebJobs (or Workers?) the right option for background processing at this volume, and will it be able to scale accordingly?
For this sort of volume, which Azure website tier will be most suitable (comparison at http://azure.microsoft.com/en-us/pricing/details/app-service/)? Or would only a Cloud or VM(s) work at this scale?
Any suggestions or tips are appreciated.
Yes, Azure WebJobs is an ideal solution to this. Azure WebJobs will scale with your Web App (formerly Websites). So, if you increase your web app instances, you will also increase your web job instances. There are ways to prevent this but that's the default behavior. You could also setup autoscale to automatically scale your web app based on CPU or other performance rules you specify.
It is also possible to scale your web job independently of your web front end (WFE) by deploying the web job to a web app separate from the web app where your WFE is deployed. This has the benefit of not taking up machine resources (CPU, RAM) that your WFE is using while giving you flexibility to scale your web job instances to the appropriate level. Not saying this is what you should do. You will have to do some load testing to determine if this strategy is right (or necessary) for your situation.
You should consider at least the Basic tier for your web app. That would allow you to scale out to 3 instances if you needed to and also removes the CPU and Network I/O limits that the Free and Shared plans have.
As for the queue, I would definitely suggest using the WebJobs SDK and let the JobHost (from the SDK) invoke your web job function for you instead of polling the queue. This is a really slick solution and frees you from having to write the infrastructure code to retrieve messages from the queue, manage message visibility, delete the message, etc. For a working example of this and a quick start on building your web job like this, take a look at the sample code the Azure WebJobs SDK Queues template punches out for you.
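For reference, here is a minimal, hypothetical sketch of that shape with the WebJobs SDK (the queue name and fetch logic are placeholders, and the AzureWebJobsStorage/AzureWebJobsDashboard connection strings are assumed to be configured): the JobHost polls the queue, invokes the function once per message, and handles message visibility and deletion for you.

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;

class Program
{
    // Deployed as a continuous WebJob; the JobHost polls the queue and dispatches messages.
    static void Main()
    {
        var host = new JobHost(new JobHostConfiguration());
        host.RunAndBlock();
    }
}

public class Functions
{
    private static readonly HttpClient Http = new HttpClient();

    // Invoked once per queue message; the message body carries the data URL to fetch.
    public static async Task FetchDataUrl(
        [QueueTrigger("fetch-requests")] string dataUrl,
        TextWriter log)
    {
        var payload = await Http.GetStringAsync(dataUrl);
        await log.WriteLineAsync($"Fetched {payload.Length} bytes from {dataUrl}");
        // Placeholder: store the payload and call back the external completion endpoint here.
    }
}
```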

Minimize downtime in Azure

We are experiencing a very serious unscheduled downtime of our Azure application today for what is now coming up to 9 hours. We reported to Azure support and the ops team is actively trying to fix the problem and I do not doubt that. We managed to get our application running on another "test" hosted service that we have and redirected our CNAME to point at the instance so our customers are happy, but the "main" hosted service is still unavailable.
My own "finger in the air" instinct is that the issue is network related within our data center (west europe), and indeed, later on in the day the service dash board has gone red for that region with a message to that effect. (Our application is showing as "Healthy" in the portal, but is unreachable via our cloudapp.net URL. Additionally threads within our application are logging sql connection exceptions into our storage account as it cannot contact the DB)
What is very strange, though, is that the "test" instance I referred to above is also in the same data centre, has no issues contacting the DB, and its external endpoint is fully available.
I would like to ask the community if there is anything that I could have done better to avoid this downtime? I obeyed the guidance with respect to having at least 2 roles instances per role, yet I still got burned. Should I move to a more reliable data centre? Should I deploy my application to multiple data centres? How would I manage the fact that my SQL-Azure DB is in the same datacentre?
Any constructive guidance would be appreciated - being a techie, I've never had a more frustrating day being able to do nothing to help fix the issue.
There was an outage in the European data center today with respect to SQL Azure. Some of our clients got hit and had to move to another data center.
If you are running mission critical applications that cannot be down, I would deploy the application into multiple regions. DNS resolution is obviously a weak link right now in Azure, but can be worked around (if you only run a website it can be done very simply using Response.Redirects or similar)
Now, there is a data synchronization service from Microsoft that will sync up multiple SQL Azure databases. Check here. This way, you can have mirror sites up in different regions and keep them in sync from the SQL Azure perspective.
Also, it would be a good idea to employ a third-party monitoring service that detects problems with your deployed instances externally. AzureWatch can notify you, or even deploy new nodes if you choose, when some of the instances turn "Unresponsive".
Hope this helps
I can offer some guidance based on our experience:
Host your application in multiple data centers, complete with SQL Azure databases. You can connect each application to its data-center-specific SQL Server. You can also cache any external assets (images/JS/CSS) on the data-center-specific Windows Azure machine or leverage Azure Blob Storage. Note: extra costs will be incurred.
Set up one-way SQL replication between your primary SQL Azure DB and the instance in the other data center. If you want to do bi-directional replication, take a look at the MSDN site for guidance.
Leverage Azure Traffic Manager to route traffic to the data center closest to the user. It has geo-detection capabilities, which will also improve the latency of your application: you can map http://myapp.com to the internal URL of your data center, and a user in Europe should automatically get redirected to the European data center and vice versa for the USA. Note: at the time of writing this post, there is no way to automatically detect and fail over to a data center. Manual steps will be involved once a failure is detected, and failover is all-or-nothing (i.e. you will fail over both the Windows Azure AND SQL Azure instances). If you want finer-grained failover, then I suggest putting all your config in the service config file and encrypting the values, so you can edit the connection string to connect instance X to DB Y.
You are all set now. I would create or install a local application to detect the availability of the site. A better solution is to write a diagnostic page or web service that checks the availability of application-specific components, and then poll it from a local computer (see the polling sketch after this answer).
HTH
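For the last point, a minimal, hypothetical sketch of that kind of local poller: it hits whatever diagnostic page or web service you write to exercise the application-specific components and prints (or alerts) when the probe fails. The URL and interval are placeholders.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

class AvailabilityPoller
{
    static async Task Main()
    {
        var http = new HttpClient { Timeout = TimeSpan.FromSeconds(30) };
        var probeUrl = "https://myapp.example.com/diagnostics/health";   // Placeholder diagnostics page.

        while (true)
        {
            try
            {
                var response = await http.GetAsync(probeUrl);
                Console.WriteLine($"{DateTime.UtcNow:O} {(int)response.StatusCode} {response.StatusCode}");
            }
            catch (Exception ex)
            {
                // Unreachable: this is where you would alert (email, SMS, etc.).
                Console.WriteLine($"{DateTime.UtcNow:O} PROBE FAILED: {ex.Message}");
            }

            await Task.Delay(TimeSpan.FromMinutes(1));   // Poll once a minute.
        }
    }
}
```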
As you're deploying to Azure, you don't have much control over how SQL Server is set up. MS have already set it up so that it is highly available.
Having said that, it seems that MS has been having some issues with SQL Azure over the last few days. We've been told that it only affected "a small number of users". At one point the service dashboard had 5 data centres affected by a problem. I had 3 databases in one of those data centres down twice for about an hour each time, but one database in another affected data centre that had no interruption.
If having a database connection is critical to your app, then the only way in the Azure environment to guard against problems that MS haven't prepared for (this latest technical problem, earthquakes, meteor strikes) would be to keep a copy of your SQL data in another data centre. At the moment the most practical way to do this is to use the Sync Framework. There is an ability to copy SQL Azure databases, but this only works within a data centre. With your data located elsewhere, you could then point your app at the new database if the main one becomes unavailable.
While this looks good on paper though, this may not have helped you with the latest problem as it did affect multiple data centres. If you'd just been making database copies on a regular basis, that might have been enough to get you through. Or not.
(I would have posted this answer on server fault, but I couldn't find the question)
This is just about a programming/architecture issue, but you may also want to ask the question on webmasters.stackexchange.com.
You need to find out the root cause before drawing any conclusions.
However, my guess is that one of two things was the problem:
The ISP connectivity differs for the test system and your production system. Either they use different ISPs, or different lines from the same ISP. When I worked at a hosting company we made sure that our IP connectivity went through at least two different ISPs who did not share fibre to our premises (and where we could, they had different physical routes to the building; the homing ability of backhoes when there's a critical piece of fibre to dig up is well proven).
Your data centre had an issue with some shared production infrastructure. This might be edge routers, firewalls, load balancers, intrusion detection systems, traffic shapers, etc. These are typically only installed on production systems. Defences here involve understanding the architecture and making sure the provider has a (tested!) DR plan for restoring SOME service when things go pear-shaped. The neatest hack I saw here was persuading an IPS (intrusion prevention system) that its own management servers were malicious, so it couldn't be reconfigured at all.
Just a thought - your DC doesn't host any of the Wikileaks mirrors, or Paypal/Mastercard/Amazon (who are getting DDOS'd by wikileaks supporters at the moment)?