Within our company we've got a rather large service application running as an Azure cloud service. The service contains a web role and a worker role.
The web role contains an MVC application and the worker role runs in the background. The worker role handles several large processes and a bunch of smaller processes 24/7; it checks for new work every 5 minutes.
I've created an Azure website for this application and wrote a small wrapper class which checks whether configuration values need to be taken from the web.config file or from the cloud configuration files (.cscfg files). I've added the appropriate transformations for some extra settings and published the application to the Azure website.
So far everything works well, but what I already half expected has indeed happened: the worker role isn't working anymore and is throwing errors. The first error I saw was:
Could not load file or assembly 'Microsoft.WindowsAzure.ServiceRuntime, Version=2.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The system cannot find the file specified.
So of course I took the quick solution, went to Properties > Copy Local and set it to true. After publishing this to the Azure website I'm getting the following error:
Could not load file or assembly 'msshrtmi, Version=2.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The system cannot find the file specified.
I can find out where this error is coming from, but it feels like it's just the second in a whole series of errors to come. On several sites I've read that Azure websites simply don't support worker roles (obviously).
This gives me a few options:
Find a solution so I can connect the Azure website to the worker role still running in the cloud service. If this works I can drop the web role and connect multiple instances to one worker role.
Find a solution to convert the worker role to something (no idea what that could possibly be) supported by the Azure website.
Forget the whole idea and stick with the cloud service setup with the web and worker role.
Fragment from WorkerRole.cs. The facade makes a database call to check for any newly added processes:
public override void Run()
{
    // Only process if the web.config says we're allowed to do so.
    while (true)
    {
        var process = Convert.ToBoolean(WebConfigurationManager.GetSetting("Process"));
        try
        {
            if (process)
            {
                var username = WebConfigurationManager.GetSetting("UsernameWorkerRole");
                if (string.IsNullOrEmpty(username))
                {
                    var version = Assembly.Load("Ecare.Productie.WorkerRole").GetName().Version;
                    var versionString = String.Format("{0}.{1}.{2}.{3}", version.Major, version.Minor,
                        version.Build.ToString("000"), version.Revision.ToString("00000"));
                    username = ApplicatieConstanten.WorkerRoleName + " " + versionString;
                }
                IServiceFacade serviceFacade = new ServiceFacade(username);
                serviceFacade.Start();
            }
        }
        catch (Exception ex)
        {
            AuditingLoggingHelper.GetLoggerInstance(ApplicatieConstanten.WorkerRoleName).Error("Exception while starting service", ex);
        }
        Thread.Sleep(10000);
    }
    // ReSharper disable once FunctionNeverReturns
}
The main reason we're doing this is that we have a VS solution with an MVC application (the web role) and the worker role, which we currently publish to a cloud service in Azure. Because of our development process we run separate test, acceptance and production environments. Since it's a heavy process we're running quite expensive machines in Azure, but those are mostly only needed for the worker role; the web part is lightweight. So the idea is mainly an attempt to reduce costs.

The plan is to convert the web role to an Azure website (this part is already working, with just a small modification to read information from the web.config instead of the cloud configuration). But the worker role currently isn't working because we haven't changed anything for it yet. A colleague of mine basically said: "write a wrapper for the config part, publish the Azure website to one or more test environments and point them to the same worker role". But I have my doubts whether this is even possible.
Did anyone else ever run into this sort of situation and find a solution for it? Any help is greatly appreciated!
Find a solution so I can connect the Azure website to the worker role
still running in the cloud service. If this works I can drop the
web role and connect multiple instances to one worker role.
I'm guessing that you're using some kind of queue mechanism (Azure Storage Queues or Service Bus Queues) to facilitate communication between the web and worker role. If that's the case, you can continue to use the same approach: your website pushes messages into a queue, and your worker role polls this queue, fetches messages and works on them.
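For illustration, here is a minimal sketch of that pattern using the classic Azure Storage SDK; the queue name "work-items", the "StorageConnectionString" setting and the class name are placeholders, not names from the question:

using System.Configuration;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

public static class WorkQueue
{
    static CloudQueue GetQueue()
    {
        var account = CloudStorageAccount.Parse(
            ConfigurationManager.AppSettings["StorageConnectionString"]);
        var queue = account.CreateCloudQueueClient().GetQueueReference("work-items");
        queue.CreateIfNotExists();
        return queue;
    }

    // Called from the Azure website: drop a work item in the queue.
    public static void Enqueue(string payload)
    {
        GetQueue().AddMessage(new CloudQueueMessage(payload));
    }

    // Called from the worker role's Run() loop: fetch and complete one item.
    public static string TryDequeue()
    {
        var queue = GetQueue();
        var message = queue.GetMessage();
        if (message == null) return null;
        queue.DeleteMessage(message);
        return message.AsString;
    }
}

Because the queue is the only contract between the two sides, any number of website instances can feed the same worker role this way.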
Find a solution to convert the worker role to something (no idea what
that could possibly be) supported by the Azure website.
Do take a look at Azure WebJobs. In the Web Apps world, they are the counterpart of Worker Roles.
UPDATE
Based on the comments, I think you should be able to port your code to run as Web Jobs. There are two ways by which you can do it:
If you create a Continuous WebJob, you would have to put the 10-second sleep logic in your code itself. The job will be running continuously but will only wake up every 10 seconds, similar to your current Worker Role implementation.
You could also take the 10-second sleep logic out of your code entirely by making your WebJob a Scheduled WebJob that runs every 10 seconds. I would recommend going down this route, as it decouples your scheduling logic (the 10-second sleep) from your application: if you later want to increase the sleep time, you simply change the schedule in the portal without redeploying your code. A sketch of the continuous variant follows below.
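For the continuous variant, a minimal sketch: a WebJob is essentially a console application deployed alongside the site, so the worker role's Run() loop ports almost unchanged (StartServiceFacade is a placeholder for the existing facade logic):

using System;
using System.Threading;

public class Program
{
    // A continuous WebJob is a console app that never exits;
    // App Service restarts it if it does stop.
    public static void Main()
    {
        while (true)
        {
            try
            {
                StartServiceFacade(); // placeholder for the facade call in Run()
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine("Exception while starting service: " + ex);
            }
            Thread.Sleep(TimeSpan.FromSeconds(10));
        }
    }

    static void StartServiceFacade()
    {
        // e.g. new ServiceFacade(username).Start();
    }
}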
As Gaurav pointed out, the equivalent of worker roles in the App Service space is Azure WebJobs.
Regarding this problem:
So far everything works well, but what I already half expected has indeed happened: the worker role isn't working anymore and is throwing errors. The first error I saw was:
Could not load file or assembly 'Microsoft.WindowsAzure.ServiceRuntime, Version=2.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The system cannot find the file specified.
So of course I took the quick solution, went to Properties > Copy Local and set it to true. After publishing this to the Azure website I'm getting the following error:
Could not load file or assembly 'msshrtmi, Version=2.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The system cannot find the file specified.
Microsoft.WindowsAzure.ServiceRuntime is specific to Cloud Services and will not work in Web Apps (that's why you get the msshrtmi error with the web app). If you are still running a worker role, that file is in the instance's GAC and should be in your local machine's GAC as well. That said, Microsoft.WindowsAzure.ServiceRuntime can be referenced in the worker role project, but not in the web app project.
I'm guessing you are using ServiceRuntime to get some configuration setting value using:
var value = RoleEnvironment.GetConfigurationSettingValue(settingName);
You can change it to:
var value = CloudConfigurationManager.GetSetting(settingName);
as this method reads the configuration setting value from the appropriate configuration store (from MSDN).
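As a usage sketch: the wrapper class the question describes collapses to almost nothing once CloudConfigurationManager is used, because it already falls back to the appSettings in web.config/app.config by itself (the class name ConfigHelper is a placeholder):

using Microsoft.Azure; // NuGet package: Microsoft.WindowsAzure.ConfigurationManager

public static class ConfigHelper
{
    // Reads from the .cscfg when running as a Cloud Service role and falls
    // back to appSettings in web.config/app.config otherwise, so the same
    // call works in both hosting models.
    public static string GetSetting(string settingName)
    {
        return CloudConfigurationManager.GetSetting(settingName);
    }
}

// Usage, with the setting name from the question's snippet:
// var username = ConfigHelper.GetSetting("UsernameWorkerRole");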
The best solution here is to convert the Worker Role to a WebJob, as @Gaurav mentioned above.
If you want to connect the Web App to the Worker Role, the way to do it would be to use an Azure queue or other intermediary storage where operations can be dropped off by the Web App and picked up by the Worker Role.
Related
I am attempting to connect my application to multiple IRC channels to read incoming chat messages and send them to my users. New channels may be added or existing channels may be removed at any time during the day, and the application must pick up on this in near real-time. I am currently using Microsoft Azure for my infrastructure and am using App Services for client-facing compute and Azure Functions on the App Service plan for background tasks (not the Consumption billing model).
My current implementation is in C#/.NET Core 3.1 and uses a TcpClient over an SslStream to watch each channel. I then use a StreamReader and await reader.ReadLineAsync() to watch for new messages. The problem I am running into is that neither App Services nor Azure Functions seems to be an appropriate place to host a watcher like this.
At first, I tried hosting it in the Azure Function app, as this clearly seems like a task for a background worker; however, Azure Functions inherently want to be triggered by a specific event, run some code, and then end. In my implementation, the call to await reader.ReadLineAsync() halts processing until a message is received. In other words, the call running the watcher needs to run in perpetuity, which seems to go against the grain of an Azure Function. In my attempt, the Azure Function service eventually crashes, the host unloads, all functions on the service cease, and then everything restarts a few minutes later when the host reloads. I am unable to find any way to tell what is causing the crash. This is clearly not the solution I want. If I could find an IrcMessageTrigger Azure Function trigger, this would probably be the best option.
Theoretically, I could host the watcher in my App Service; however, when I scale out I would run into a problem: multiple servers would connect to each channel at once, new messages would be sent to each server, and my users would receive duplicates. I could probably deal with this, but the workaround would be hacky, and I feel the real answer is to architect it better in the first place.
Anyone have an idea? I am open to changing the code or using a different Azure service (assuming it isn't too expensive) but I will be sticking with C# and .NET Core on Azure infrastructure for this project.
Below is a part of my watcher code to provide some context.
while (client.Connected)
{
    // This line will halt execution until a message is received
    var data = await reader.ReadLineAsync();
    if (data == null)
    {
        continue;
    }
    var dataArray = data.Split(' ');
    if (dataArray[0] == "PING")
    {
        await writer.WriteLineAsync("PONG");
        await writer.FlushAsync();
        continue;
    }
    if (dataArray.Length > 1)
    {
        switch (dataArray[1])
        {
            case "PRIVMSG":
                HandlePrivateMessage(data, dataArray);
                break;
        }
    }
}
Thanks in advance!
Results are preliminary, but it appears that the correct approach is to use Azure WebJobs running continuously to accomplish what I am trying to achieve. I did not consider WebJobs initially because they are older technology than Azure Functions and essentially do the same work at a lower level of abstraction. In this case, however, WebJobs appear to handle a use case that Functions are not intended to support.
To learn more about WebJobs (including continuous WebJobs) and what they are capable of, see the Microsoft documentation.
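To illustrate the approach, here is a minimal sketch of hosting the watcher as a continuous WebJob; the host name, port and handshake are placeholders, and the loop body is the one from the snippet above:

using System;
using System.IO;
using System.Net.Security;
using System.Net.Sockets;
using System.Threading.Tasks;

public class Program
{
    // A continuous WebJob is a console app that App Service keeps running
    // (and restarts on exit), so a perpetual blocking read loop is acceptable here.
    public static async Task Main()
    {
        using (var client = new TcpClient("irc.example.com", 6697)) // placeholder host/port
        using (var ssl = new SslStream(client.GetStream()))
        {
            await ssl.AuthenticateAsClientAsync("irc.example.com");
            var reader = new StreamReader(ssl);
            var writer = new StreamWriter(ssl) { AutoFlush = true };

            // ... NICK/USER/JOIN handshake would go here ...

            while (client.Connected)
            {
                var data = await reader.ReadLineAsync();
                if (data == null) continue;
                // PING/PRIVMSG handling exactly as in the snippet above.
            }
        }
    }
}

To avoid the duplicate-message problem when scaling out, a continuous WebJob can also be marked as a singleton (a settings.job file containing { "is_singleton": true }), so only one instance runs the watcher.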
I have an Azure Function running on a timer every few minutes. After a varying amount of time it begins to fail on every run because of an external API, and hitting the restart button manually in the Azure portal fixes the problem so the job works again.
Is there a way to either get an Azure Function to restart itself, or have something restart it externally via a webhook, an API request, or a timer?
I have tried using Azure's API Management service, which can be used to restart other kinds of app services in Azure, but it turns out there is no functionality in the API to request a restart of an Azure Function. I have also looked into PowerShell, and it seems to have the same problem: you can restart various app services but not Azure Functions.
I have tried working with the REST API:
https://learn.microsoft.com/en-us/rest/api/azure/
Here is an example API request that lists the functions within an Azure Function app:
GET https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Web/sites/{name}/functions?api-version=2016-08-01
but from what I have researched there is no functionality to restart an Azure Function.
Basically I want to restart the Azure Function as if I had hit this button:
(Screenshot: the manual stop/start and restart buttons for an Azure Function app in the Azure portal.)
There is a case where the job gets into a bad state every time it runs because of an external API I have no control over, and hitting restart manually gets the job going again.
Another way to restart your function is by using the "watchDirectories" setting in the host.json file. If your host.json looks like this:
{
    "version": "2.0",
    "watchDirectories": [ "Toggle" ]
}
You can then trigger a restart by using the following statement in a function:
System.IO.File.WriteAllText("D:/home/site/wwwroot/Toggle/restart.conf", DateTime.Now.ToString());
Looking at the logs, the function reloads as it has detected the file change in the directory:
Watched directory change of type 'Changed' detected for 'D:\home\site\wwwroot\Toggle\restart.conf'
Host configuration has changed. Signaling restart
Azure Functions are by their nature invoked by an event. That may be a timer, a trigger, or an invocation such as an HTTP request. They cannot be restarted per se; i.e. if a function throws an exception, you cannot find that specific instance and re-run it using out-of-the-box functionality.
However, you can engineer your way to a more reliable solution:
Replay the event that invoked the function (i.e. kick it off again)
For non-sensitive data, log the payload of the function and create another function that can be called on demand to re-run it, i.e. you create a proxy to "re-invoke" the function.
Harden your code by implementing a retry policy. See Polly (a sketch follows after this list).
Add a service bus to your architecture. Have a simple function write the call payload to a queue message, and have another function pick up the payload and do the more extensive processing (where the unreliable integrations live). That way, if the call fails you can abandon and dead-letter failures for later reprocessing.
Consider using the Durable Functions extension and leveraging the durable patterns; these can help make your function code more robust and manage state.
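For the retry-policy option, here is a minimal Polly sketch, assuming the flaky external dependency is called over HTTP (the URL and class name are placeholders):

using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

public static class FlakyApiClient
{
    static readonly HttpClient Http = new HttpClient();

    public static async Task<string> GetWithRetriesAsync()
    {
        // Retry up to 3 times on HTTP failures, backing off exponentially
        // (2s, 4s, 8s) between attempts.
        var policy = Policy
            .Handle<HttpRequestException>()
            .WaitAndRetryAsync(3, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));

        return await policy.ExecuteAsync(async () =>
        {
            var response = await Http.GetAsync("https://external-api.example.com/data");
            response.EnsureSuccessStatusCode(); // throws HttpRequestException on failure
            return await response.Content.ReadAsStringAsync();
        });
    }
}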
Why don't you try the ARM API below? Since Azure Functions also fall under the App Service category, this may be helpful:
https://learn.microsoft.com/en-us/rest/api/appservice/webapps/restart
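A minimal sketch of calling that restart endpoint, assuming you already have an ARM bearer token (how you acquire the token, and all the names, depend on your setup):

using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class FunctionAppRestarter
{
    // POSTs to the App Service restart endpoint; function apps are
    // Microsoft.Web/sites resources, just like web apps.
    public static async Task RestartAsync(
        string subscriptionId, string resourceGroup, string appName, string armBearerToken)
    {
        var url = "https://management.azure.com/subscriptions/" + subscriptionId +
                  "/resourceGroups/" + resourceGroup +
                  "/providers/Microsoft.Web/sites/" + appName +
                  "/restart?api-version=2016-08-01";

        using (var http = new HttpClient())
        {
            http.DefaultRequestHeaders.Authorization =
                new AuthenticationHeaderValue("Bearer", armBearerToken);
            var response = await http.PostAsync(url, null);
            response.EnsureSuccessStatusCode();
        }
    }
}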
I have two Azure WebJobs. The first takes an incoming message that tells it to grab a PDF and break it into individual page images, and then queues another message for the second WebJob to process the individual pages. It worked fine on our QC instance, but when we tried to move to production I started getting strange errors on the second job, though not consistently. The first job runs and breaks the file into page images; that is working fine. I have confirmed that every page image gets created and every page message gets queued. However, for the second job, only some of the messages are getting processed correctly. The remaining ones show this error in the WebJob diagnostics:
Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: Functions.ProcessBatchPage ---> System.Data.SqlClient.SqlException: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 52 - Unable to locate a Local Database Runtime installation. Verify that SQL Server Express is properly installed and that the Local Database Runtime feature is enabled.) ---> System.ComponentModel.Win32Exception: The system cannot find the file specified
What's weird is that this error mentions the Local Database Runtime and SQL Server Express, and I am not referencing either anywhere in my code. The system points at an Azure SQL DB. The job uses ADO.NET, and I have hardcoded the connection string to try to eliminate any issues with configuration-based connection strings. What's also weird is that it only happens to a certain portion of the messages; the others process perfectly.
Lastly, I ran the job in debug mode locally (still pointing at the real queue and DB on Azure) and got the same problem. The job writes a console line with the job ID as the first line of code. For the jobs that process successfully, I see this WriteLine; for those that fail, I never see anything. It's almost like the job is not really starting up correctly (the failed jobs also have a really short run time, 50-100 ms).
I had the same issue with some jobs, and I came across these articles while looking for a solution:
Transient Fault Handling (Building Real-World Cloud Apps with Azure)
Connection Resiliency / Retry Logic (EF6 onwards)
From these articles:
Causes of transient failures:
In the cloud environment you’ll find that failed and dropped database connections happen periodically. That’s partly because you’re going through more load balancers compared to the on-premises environment where your web server and database server have a direct physical connection. Also, sometimes when you’re dependent on a multi-tenant service you’ll see calls to the service get slower or time out because someone else who uses the service is hitting it heavily. In other cases you might be the user who is hitting the service too frequently, and the service deliberately throttles you – denies connections – in order to prevent you from adversely affecting other tenants of the service.
Use smart retry/back-off logic to mitigate the effect of transient failures:
The Microsoft Patterns & Practices group has a Transient Fault Handling Application Block that does everything for you if you're using ADO.NET for SQL Database access (not through Entity Framework). You just set a policy for retries – how many times to retry a query or command and how long to wait between tries – and wrap your SQL code in a using block:
public void HandleTransients()
{
    var connStr = "some database";
    var _policy = new RetryPolicy<SqlAzureTransientErrorDetectionStrategy>(
        retryCount: 3,
        retryInterval: TimeSpan.FromSeconds(5));
    using (var conn = new ReliableSqlConnection(connStr, _policy))
    {
        // Do SQL stuff here.
    }
}
When you use the Entity Framework you typically aren’t working directly with SQL connections, so you can’t use this Patterns and Practices package, but Entity Framework 6 builds this kind of retry logic right into the framework. In a similar way you specify the retry strategy, and then EF uses that strategy whenever it accesses the database.
To use this feature in the Fix It app, all we have to do is add a class that derives from DbConfiguration and turn on the retry logic.
// EF follows a Code based Configuration model and will look for a class that
// derives from DbConfiguration for executing any Connection Resiliency strategies
public class EFConfiguration : DbConfiguration
{
    public EFConfiguration()
    {
        AddExecutionStrategy(() => new SqlAzureExecutionStrategy());
    }
}
This question is for Continuous Web Jobs.
Main Questions
How can we "VIEW" or programmatically "LOG" the current memory & network status of a VM running a Continuous Web Job?
Background:
Our WebJob is scraping some API and we keep getting 500 errors. We believe that the VM is firing too many threads for API requests, and then, because of network limitations, too many responses come back at the same time, overloading the VM's network capacity.
Side Questions:
How would you use Microsoft Azure to web-scrape while making sure you don't overload (in terms of memory and network) the VM it's running on?
(It seems that for background processing, these VMs are built for CPU calculation - not for Web/API scraping)
I'm still using the Monitoring (Classic) APIs currently. I've not found a "non-classic" version of the API, but I've also not spent much time looking. Since a WebJob runs as part of the web app, you'll need to monitor the web app using the tools provided in the Microsoft.WindowsAzure.Management.Monitoring.Metrics namespace.
I found the API to be somewhat confusing, but spent some time working with the product group to get it right. I've provided some sample code on the MSPFE GitHub page at: https://github.com/mspfe/AzureMetricsAPISampleKit. Running the "tests" in this solution will show you how to use the library.
You first need to identify the web app by getting a list of them:
var webSpaceList = _webSiteClient.WebSpaces.List();
Then collect the available metrics:
foreach (var website in websiteList)
{
    MetricDefinitionListResponse wsMetricListResponse = _metricsClient.MetricDefinitions.List(website.WebsiteResourceId, null, null);
    website.MetricDefinitionsList = wsMetricListResponse.MetricDefinitionCollection;
    website.MetricNamesList = new List<string>();
    foreach (var metric in website.MetricDefinitionsList.Value)
    {
        website.MetricNamesList.Add(metric.Name);
    }
    MetricValueListResponse wsValueResponse = _metricsClient.MetricValues.List(website.WebsiteResourceId, website.MetricNamesList, "",
        _timeGrain, _startDateTime, _endDateTime);
    website.MetricValueList = wsValueResponse.MetricValueSetCollection;
}
From there you should have metric definitions and values. Sorry if this code is a little dated... but it should work.
Azure WebJobs run within your Azure App Service's web app (formerly called Websites). So, your capacity is governed by the size (and quantity) of Web App instances, whether free tier or one of the paid tiers. And you'd measure your utilization against the Web App instances.
Your side question, about how to use Azure to web scrape, is not answerable here: It's an opinion-based question with no right answer.
I'm relatively new to Windows Azure and I need to get a better appreciation of how the Azure platform handles connection string configuration settings.
Assume I have an ASP.Net web project and this has a Web.Config connection string setting like the following:
<add name="MyDb" connectionString="Data Source=NzSqlServer01;Initial Catalog=MyAzureDb;User ID=joe;Password=bloggs;"
providerName="System.Data.SqlClient" />
I use this connection string for local testing and such. Let's assume I have a ServiceConfiguration.Local.cscfg file that holds that same connection information.
Now I'm ready to deploy out to my Azure instance. My ServiceConfiguration.Cloud.cscfg file looks like this:
<Setting name="MyDb"
value="Data Source=tcp:e54wn1clij.database.windows.net;Database=MyAzureDb{0};User ID=joe.bloggs#e54wn1clij;Password=reallysecure;Trusted_Connection=False;Encrypt=True;" />
What I'm trying to get my head around is this: if I have code in my web application that looks for a connection string called "MyDb" (for example, by calling ConfigurationManager.ConnectionStrings["MyDb"].ConnectionString), does Azure automagically know to look for a database called MyAzureDb1 or MyAzureDb2 based on the ServiceConfiguration file's connection string, or will the web application's code simply look for whatever's in Web.config and fail to correctly load-balance the database connections?
You'd need to call RoleEnvironment.GetConfigurationSettingValue(...) to read the one in ServiceConfiguration.Cloud.cscfg.
The advantage to using .cscfg to store settings is that you can change them at runtime without having to deploy new code. Web.config is just another file that's part of your app, so you have to deploy a new package to update it, but the settings in .cscfg can be modified in the portal or by uploading a new .cscfg file without deploying and disturbing the app itself.
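A minimal sketch of the difference, using the "MyDb" setting name from the question (the class name is a placeholder):

using System.Configuration;
using Microsoft.WindowsAzure.ServiceRuntime;

public static class DbConfig
{
    // Reads whatever is in web.config; plain ConfigurationManager never
    // looks at the .cscfg files.
    public static string FromWebConfig()
    {
        return ConfigurationManager.ConnectionStrings["MyDb"].ConnectionString;
    }

    // Reads the value from ServiceConfiguration.*.cscfg when running as a
    // Cloud Service role; this is the one you can change in the portal at runtime.
    public static string FromCscfg()
    {
        return RoleEnvironment.GetConfigurationSettingValue("MyDb");
    }
}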
Intrinsically, unless you code otherwise, all Azure instances are created equal. In your case, this means that the configuration for two or more instances of the same Web Role will be the same.
So, if you've sharded your database and want different instances to read different databases, you'll need to get clever in your startup code and create something that allocates a shard to an instance. You have access to System.Environment.MachineName, which can distinguish between instances in code once they're started.
There are a few ways to do this. One might be to have a central registry in (say) table storage that keeps a log of the last-seen time of an instance for a shard. A background process on the server periodically writes out to this log. Then, on instance start, check the last-seen time for each shard -- if any are "stale" (significantly older than the current time less the write interval), the instance knows it can claim that shard for itself, as the old instance has died. A sketch of this idea follows below.
(There are better ways to shard, however, generally around the data your system uses -- e.g. by the largest table in your system.)
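Here is a minimal sketch of that heartbeat idea using the classic table storage SDK; the table layout, names and interval are placeholders, and a real implementation would guard the claim with ETag-based optimistic concurrency so two instances can't claim the same shard:

using System;
using Microsoft.WindowsAzure.Storage.Table;

// One row per shard; RowKey is the shard id, LastSeenUtc is the heartbeat.
public class ShardLease : TableEntity
{
    public ShardLease() { }
    public ShardLease(string shardId)
    {
        PartitionKey = "shards";
        RowKey = shardId;
    }
    public string Owner { get; set; }
    public DateTime LastSeenUtc { get; set; }
}

public static class ShardRegistry
{
    static readonly TimeSpan StaleAfter = TimeSpan.FromMinutes(5); // placeholder interval

    // On instance start: claim the first shard whose heartbeat has gone stale.
    public static string TryClaimShard(CloudTable table, string[] shardIds)
    {
        foreach (var shardId in shardIds)
        {
            var result = table.Execute(
                TableOperation.Retrieve<ShardLease>("shards", shardId));
            var lease = result.Result as ShardLease;
            if (lease == null || DateTime.UtcNow - lease.LastSeenUtc > StaleAfter)
            {
                table.Execute(TableOperation.InsertOrReplace(new ShardLease(shardId)
                {
                    Owner = Environment.MachineName,
                    LastSeenUtc = DateTime.UtcNow
                }));
                return shardId; // this instance must now refresh the heartbeat periodically
            }
        }
        return null; // no free shard available
    }
}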