I have been trying to create a worker role using PowerShell, the Azure Emulator, and the Azure Node.js SDK, but I run into problems when I start adding modules to my worker process.
These are the steps I have taken:
1) Run Powershell
2) Create a new azure node.js project
new-azureserviceproject
3) Add a webrole
add-azurenodewebrole
4) Add a worker role
add-azurenodeworkerrole
If I run the project at this stage
start-azureemulator -launch
The site runs fine, without any IIS errors. But when I start installing new modules into the worker role and try running it again, I get Windows IIS errors such as "Windows Azure Web Role Entry Point Has Stopped Working", with no further information about why it stopped. Has anybody else encountered these errors? More importantly, does anybody have an example of how to create a worker role that runs a cron job and talks to my Windows Azure table storage? All I want to do is run a cron job every 5 seconds to check table storage for new updates and do something.
Any ideas?
Details of the error:
Problem Event Name: APPCRASH
Application Name: iisexpress.exe
Application Version: 8.0.8298.0
Application Timestamp: 4f620349
Fault Module Name: iiscore.dll
Fault Module Version: 8.0.8298.0
Fault Module Timestamp: 4f63b65c
Exception Code: c0000005
Exception Offset: 00021767
OS Version: 6.1.7601.2.1.0.256.28
Locale ID: 1033
Additional Information 1: f66d
Additional Information 2: f66d807b515d6b2dc6f28f66db769a01
Additional Information 3: 7b2f
Additional Information 4: 7b2f6797d07ebc2c23f2b227e779722e
Update: if I lower the instance count to 1 for both the web role and the worker role, it doesn't crash. Perhaps it's a problem with the Azure emulator?
There are several questions here, so let's start with the first. A decent sample for using a worker role that adds modules (socket.io) can be found here:
https://www.windowsazure.com/en-us/develop/nodejs/tutorials/app-using-socketio/
Next up is of course the conversation about modules on Windows. Some modules with binary dependencies don't run on Windows. That has gotten to be a pretty small number, but it is still a possibility. You should see if you can run your worker role code outside of the emulator to validate this.
Next up, we should consider the process itself. You would typically push changes that require action into a Storage Queue from your web role and pull from that queue in your worker role. If you have a "cron module", pull the top item from the queue when the timer event fires. You can always sleep here, but that kind of blocking is frowned upon in the Node world.
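Here is a minimal sketch of that timer-driven pull, written in Python for brevity. The in-memory queue.Queue is only a stand-in for an Azure Storage Queue, and poll_once, start_polling, handle, and the 5-second default are illustrative names and values, not part of any Azure SDK.

```python
import queue
import threading


def poll_once(work_queue, handle):
    """Pull at most one item from the queue and process it.

    Returns True if an item was handled, False if the queue was empty.
    """
    try:
        item = work_queue.get_nowait()  # never blocks the caller
    except queue.Empty:
        return False
    handle(item)
    return True


def start_polling(work_queue, handle, interval_seconds=5.0):
    """Call poll_once now and then every interval_seconds.

    Returns a threading.Event; set it to stop further polling.
    """
    stop = threading.Event()

    def tick():
        if stop.is_set():
            return
        poll_once(work_queue, handle)
        # Re-arm the timer instead of sleeping, so nothing blocks in between.
        timer = threading.Timer(interval_seconds, tick)
        timer.daemon = True
        timer.start()

    tick()
    return stop
```

In a Node worker the same shape would be a setInterval callback that pulls one message per tick; the point is that the timer fires the pull and no code sleeps in a loop.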
This may not be related but I thought I should mention it. I ran into issues because the default version of NodeJS seemed to be too old to work with the modules I was using. You may need to change the version of NodeJS. To see the list of available versions:
Get-AzureServiceProjectRoleRuntime
Then, apply a specific version (example):
Set-AzureServiceProjectRole [Role_Name] Node 0.10.21
I'm running an Azure function in Azure, the function gets triggered by a file being uploaded to blob storage container. The function detects the new blob (file) but then outputs the following message - Did not find any initialized language workers.
Setup:
Azure function using Python 3.6.8
Running on linux machine
Built and deployed using azure devops (for ci/cd capability)
Blob Trigger Function
I have run the code locally using the same blob storage container, the same configuration values and the local instance of the azure function works as expected.
The function's core purpose is to read the .xml file uploaded to the blob storage container, then parse and transform the XML data so it can be stored as JSON in Cosmos DB.
I expect the process to complete like on my local instance with my documents in cosmos db, but it looks like the function doesn't actually get to process anything due to the following error:
Did not find any initialized language workers
Troy Witthoeft's answer was almost certainly the right one at the time the question was asked, but this error message is very general. I've had this error recently on runtime 3.0.14287.0. I saw the error on many attempted invocations over about 1 hour, but before and after that everything worked fine with no intervention.
I worked with an Azure support engineer who gave some pointers that could be generally useful:
Python versions: if you have function runtime version ~3 set under the Configuration blade, then the platform may choose any of python versions 3.6, 3.7, or 3.8 to run your code. So you should test your code against all three of these versions. Or, as per that link's suggestion, create the function app using the --runtime-version switch to specify a specific python version.
Consumption plans: this error may be related to a consumption-priced app having idled off and taking a little longer to warm back up again. This depends, of course, on the usage pattern of the app. (I infer, though the engineer didn't say this, that if the Azure datacenter my app is in happens to be busy when my app wants to restart, it might just have to wait for resources to become available.) You could address this either by paying for an always-on function app, or by rigging some kind of heartbeat process to stop the app idling for too long. (Easiest with an HTTP trigger: probably just ping it?)
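A rough sketch of such a heartbeat, assuming the app exposes some HTTP-triggered function to hit. The ping helper and its opener parameter are hypothetical names introduced here; how often to call it is a guess you would have to tune, since the idle timeout isn't stated above.

```python
import urllib.error
import urllib.request


def ping(url, timeout_seconds=10, opener=urllib.request.urlopen):
    """Send one GET request.

    Returns the HTTP status code, or None if the request could not be made.
    The opener parameter exists only so the call can be swapped out in tests.
    """
    try:
        with opener(url, timeout=timeout_seconds) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The app answered, just not with a success status.
        return error.code
    except OSError:
        # DNS failure, refused connection, timeout, and similar.
        return None
```

You would call ping() from whatever scheduler you already have (cron, a scheduled task, a monitoring service) and alert if it returns None several times in a row.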
The engineer was able to see a lower-level error message generated by the Azure platform that wasn't available to me in Application Insights: ARM authentication token validation failed. This was raised in Microsoft.Azure.WebJobs.Script.WebHost.Security.Authentication.ArmAuthenticationHandler.HandleAuthenticate() at /src/azure-functions-host/src/WebJobs.Script.WebHost/Security/Authentication/Arm/ArmAuthenticationHandler.cs. There was a long stack trace, with the innermost exception being: System.Security.Cryptography.CryptographicException : Padding is invalid and cannot be removed. Neither of us was able to make complete sense of this, and I'm not clear whether the responsibility for this error lies within the HandleAuthenticate() call or outside it (invalid input token from... where?).
The last of these points may be some obscure bug within the Azure Functions Host codebase, or some other platform problem, or totally misleading and unrelated.
Same error but different technology, environment, and root cause.
Technology: .NET 5, targeting Windows. In my case, I was using dependency injection to add a few services, and I read one parameter from the environment variables inside the .ConfigureServices() section. When I deployed, I forgot to add that variable to the application settings in Azure, which produced this weird error.
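One way to turn this kind of mistake into a readable error is to validate the required settings up front. The sketch below is in Python rather than .NET, and require_settings, MissingSettingsError, and the setting names in the usage line are made up for illustration.

```python
import os


class MissingSettingsError(RuntimeError):
    """Raised when required application settings are absent or empty."""


def require_settings(names, environ=os.environ):
    """Return {name: value} for every required setting.

    Raises MissingSettingsError naming all missing or empty settings at once,
    so the failure shows up in the logs with an actionable message.
    """
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise MissingSettingsError(
            "Missing application settings: " + ", ".join(sorted(missing))
        )
    return {name: environ[name] for name in names}
```

Calling something like require_settings(["COSMOS_ENDPOINT", "QUEUE_NAME"]) as the first step of startup means a forgotten app setting fails with its own name in the message instead of a generic worker error.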
This is due to the SDK version. I would suggest creating a fresh function app in Azure and deploying your code there. Two things to check:
Make sure your local function app SDK version matches the one used by the Azure function app.
Check that the Python version is the same on both sides.
This error is most likely GitHub issue #4384. The bug was identified, and a fix was released in mid-June 2020. Apps running on version 3.0.14063 or greater should be fine. List of versions is here.
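If you want to check a version string against that threshold in a script, compare the parts numerically, not as text. This is a small sketch; parse_version and has_fix are illustrative names, and the only input taken from above is the 3.0.14063 threshold.

```python
FIXED_IN = (3, 0, 14063)  # threshold quoted above


def parse_version(text):
    """Turn '3.0.14287.0' into (3, 0, 14287, 0)."""
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(runtime_version, fixed_in=FIXED_IN):
    """True if runtime_version is at or above the version carrying the fix."""
    return parse_version(runtime_version) >= fixed_in
```

A plain string comparison would rank "3.0.9999.0" above "3.0.14063", which is why the parts are converted to integers first. Passing this check only rules out that one bug; another answer in this thread reports the same message on 3.0.14287.0.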
You can use Azure Application Insights to check your version: run a Kusto query against the logs. In the exception table, the Azure SDK column shows your version.
If you are on a dedicated App Service plan, you may be able to "pull" the latest version from Microsoft by deleting and redeploying your app. If you are on a consumption plan, you may need to wait for the bug fix to roll out to all servers.
Took me a while to find the cause as well, but it was related to me installing a version of protobuf explicitly which conflicted with what was used by Azure Functions. Fair, there was a warning about that in the docs. How I found it: went to <your app name>.scm.azurewebsites.net/api/logstream and looked for any errors I could find.
I just stopped an Application Pool in IIS. When trying to start it, IIS complains that,
The service cannot accept control messages at this time. (Exception from HRESULT: 0x80080425).
What gives? Whence did this error come?
Looking at the Event Viewer > System shows these warnings:
A worker process '1456' serving application pool 'MyAppPool' failed to stop a listener channel for protocol 'http' in the allotted time. The data field contains the error number.
A process serving application pool 'MyAppPool' suffered a fatal communication error with the Windows Process Activation Service. The process id was '10592'. The data field contains the error number.
A process serving application pool 'MyAppPool' exceeded time limits during shut down. The process id was '10516'.
This resolved itself after about 5 minutes, at which point we tried to restart the website and received:
The World Wide Web Publish Service (W3SVC) is stopped. Web sites cannot be started unless the World Wide Web Publishing Service (W3SVC) is running.
So, we started the W3SVC service, and then we could start our website.
This helped me: just wait about a minute or two.
Wait a few minutes, then retry your operation.
Ref: https://msdn.microsoft.com/en-us/library/ms833805.aspx
The error can occur for the following reasons:
The service associated with Credential Manager does not start.
Some files associated with the application have become corrupted.
Please follow the steps mentioned below to resolve the issue:
Method 1:
Click on “Start”.
In the text box that reads “Search Program and Files”, type “Services”.
Right-click on “Services” and select “Run as Administrator”.
In the Services Window, look for Credential Manager Service and “Stop” it.
Restart the computer and “Start” the Credential Manager Service and set it to “Automatic”.
Restart the computer and it should work fine.
Method 2:
Run System File Checker. Refer to the link mentioned below for additional information:
http://support.microsoft.com/kb/929833
In my case, the VS debugger was attached to the w3wp process. After detaching the debugger, I was able to restart the Application Pool
I stopped the IIS Worker Process (in task manager), and then started the IIS again.
It worked.
I killed the related w3wp.exe in Task Manager (on a friend's advice) and it worked.
Note: Use at your own risk. Be careful picking which one to kill.
Restarting the machine worked for me but not every time.
If you are really stuck on this, follow the steps below:
Open Task Manager.
In the window that opens, click on the Details tab.
Find the process you want to restart or stop.
Select the process, right-click it, and choose the End task option.
A confirmation dialog box will appear. Click the End process button.
Now try to restart your service from the Services.msc window.
I forgot I had mine attached to the Visual Studio debugger. Be sure to detach from there, and then wait a moment. Otherwise, killing the process will work too; you can find its PID in the Worker Processes feature of IIS Manager.
Restarting the IIS windows service (World Wide Web Publishing Service) and then starting the application pool has worked for me. However, as the top answer suggests it may have just been the waiting that caused it to subsequently work.
I had this issue recently,
Problem statement:
Mine was a windows service that I run locally by attaching VS debugger. When I stop debugging and try to restart/stop the service (under services.msc) I used to get the mentioned error.
Solution:
Open up Task manager.
Search for the service (based on the exe name and not service name, for those that are different).
Kill the service.
On doing the above the service is stopped.
Being impatient, I created a new App Pool with the same settings and used that.
I kept having this problem whenever I tried to start an app pool more than once. Rather than rebooting, I simply run the Application Information Service. (Note: This service is set to run manually on my system, which may be the reason for the problem.) From its description, it seems obvious that it is somehow involved:
Facilitates the running of interactive applications with additional administrative privileges. If this service is stopped, users will be unable to launch applications with the additional administrative privileges they may require to perform desired user tasks.
Presumably, IIS Manager (like most other processes running as an administrator) does not maintain admin privileges throughout the life of the process, but instead requests admin rights from the Application Information service on a case-by-case basis.
Source: social.technet.microsoft.com
I am trying to deploy a Cloud Service with 1 Web Role to Azure.
When I do so, I get this message:
Your role instances have recycled a number of times during an update or upgrade operation. This indicates that the new version of your service or the configuration settings you provided when configuring the service prevent the role instances from running. Verify your code does not throw unhandled exceptions and that your configuration settings are correct and then start another update or upgrade operation.
The project runs just fine locally, and I'm having a hard time figuring out how to start debugging this issue. Are there any common problems that cause this message or steps to figure out what is causing it?
See https://learn.microsoft.com/en-us/archive/blogs/kwill/windows-azure-paas-compute-diagnostics-data. This will walk through all of the diagnostic data available as well as how to troubleshoot the most common issues.
We also had this annoying problem and in our case:
We used local storage, but it wasn't defined in the service definition (or in the worker role's properties).
Our worker role project referenced a service project, which in turn referenced a data layer project, but the worker role project had no direct reference to the data layer project. As soon as we added that reference to the worker role project, it deployed successfully.
Problem #1 is easy to notice if you first run the project on your local machine: an exception will be thrown.
Problem #2, however, is harder to find, mainly because everything runs fine on the local machine. It took us 5 days of troubleshooting to find it. So check all references, and try adding direct references to the projects that your referenced projects depend on.
We had a similar problem, and it was due to some DLLs failing to load (because their versions differed from the ones Microsoft has deployed to the VM).
Try to set CopyLocal to "true" for all the References in the project, and re-deploy.
I would either remote desktop to the cloud instance and review the Windows Event Logs for exceptions, or redeploy with IntelliTrace enabled. If you choose the latter, you can download the IntelliTrace logs from Visual Studio and debug.
http://msdn.microsoft.com/en-us/library/windowsazure/ff683671.aspx
One way to find out the actual error is to click on "1 instance" at the top of the Dashboard after trying to deploy your web role. It will show you the status of the role instance, which should include more information about the type of error blocking your deployment.
It depends on what your case is. For me, the status reported an unhandled security exception. After some investigation, it turned out that in my role's OnStart() I tried to create an event source, but the Azure service doesn't have permission to create one.
For more possible issues, check http://blogs.msdn.com/b/kwill/archive/2013/09/06/troubleshooting-scenario-3-role-stuck-in-busy.aspx
For me, the issue was with my SQL Azure DB firewall rules. My Azure SQL Database servers are not set to "Allow Access to Azure services", so I have to explicitly list IPs that are allowed.
I discovered this after wrapping my code in a try/catch that swallowed all exceptions, refactoring my OnStart() and RunAsync() methods, and setting all my references to Copy Local = True. None of that worked. Then I saw that I had this line in my RunAsync() method:
log4net.Config.XmlConfigurator.Configure();
I am using the AdoNetAdapter for log4net and connecting to an Azure SQL DB for logging, so that led me to check the firewall rules.
For me, I had some differing version of nuget packages in my various projects. Once I consolidated everything to the same version(s), it worked fine.
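If you have many projects, a short script can flag packages that are referenced at more than one version. The sketch below assumes the projects use packages.config files (not PackageReference items); find_version_conflicts is a hypothetical helper, and the project and package names in the usage example are made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_version_conflicts(packages_by_project):
    """Find packages referenced at more than one version.

    packages_by_project maps a project name to the text of its packages.config.
    Returns {package_id: {version: [project, ...]}} for conflicting ids only.
    """
    versions = defaultdict(lambda: defaultdict(list))
    for project, xml_text in packages_by_project.items():
        for package in ET.fromstring(xml_text).iter("package"):
            versions[package.get("id")][package.get("version")].append(project)
    return {
        package_id: {version: sorted(projects)
                     for version, projects in by_version.items()}
        for package_id, by_version in versions.items()
        if len(by_version) > 1
    }


web = '<packages><package id="Newtonsoft.Json" version="5.0.8" /></packages>'
worker = '<packages><package id="Newtonsoft.Json" version="4.5.11" /></packages>'
conflicts = find_version_conflicts({"WebRole": web, "WorkerRole": worker})
```

Here conflicts maps Newtonsoft.Json to both versions and the project that uses each one, which tells you where to consolidate.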
With the release of Windows Azure SDK version 2.2 for Visual Studio 2012 and 2013, you can now remote debug cloud resources from within Visual Studio.
Once your cloud service is published and running live in the cloud, you can simply set a breakpoint in your local source code. This may help you in digging out what's going wrong!
I have two servers sitting behind a loadbalancer in my service tier. Both of them should be identical - IIS setup the same, AppFabric (to keep two services warmed up), app pools running under either a service account or the app pool identity. On one server, everything works. On the other server, three of my app pools (the two that AppFabric is warming up, under the service accounts, and one that's just a standard app pool with no changes made from default settings) stop running almost as soon as I start them up (sometimes on the first request).
I get the following error five times in the Application log each time I try to start one of the app pools:
There was an error during processing of the managed application service auto-start for configuration path: 'MACHINE/WEBROOT/APPHOST/Site/App'. The error message returned is: ''. The worker process will be marked unhealthy and be shutdown. The data field contains the error code.
The error code referenced is 80070005.
This is actually for the same Site/App regardless of the app pool being started (though it may change after recreating the app pools).
In the System log, I get the following warning five times before it errors (Application pool 'AppPool' is being automatically disabled due to a series of failures in the process(es) serving that application pool.):
A process serving application pool 'AppPool' reported a failure during application preloading or service loading. The process id was '2396'. Please ensure that all application preload or service settings in the application pool are configured properly. The data field contains the error number.
The error code referenced is 80004005.
The AppPool here is the one being started.
I've tried recreating; I've tried uninstalling AppFabric (but we need it, so reinstalled and still no go). I'm out of ideas. Any suggestions?
EDIT: I tried copying the applicationHost.config over from the working server, but that didn't work either.
EDIT2: One of the app pools works when running under a real user account but doesn't when running under the ApplicationPoolIdentity.
(Also, we had an issue where the site was running under 2.0 and the apps were running under 4.0. That may have resolved the ones that are running as the service accounts.)
I was just wrestling with this same problem for a few hours and found a different culprit.
I had added a new configuration section to my Web.config in a recent commit. I also added this section to a separate ERB file used by Puppet to generate a custom Web.config at the point of deployment. In this template file, I added the new section but forgot to include its declaration in <configSections>.
Once I added the declaration to the template, our app's test VMs were able to start up again and this error went away.
While the app pools for the applications were 4.0, the app pool for the site itself was 2.0, causing some of the issues. We also had inetpub on a different drive, and we had to grant access to SERVER\Users.
I'm currently working on an Azure project that works 100% locally with emulator resources. I'm now trying to deploy a worker role, but I'm running into an issue that I'm not sure how to troubleshoot.
Upon deploying the worker role in my Azure portal, the two instances continually loop through "recycling".
I can try to RDP into the role, but I only have about a minute to look around before the connection closes, I'm assuming due to the recycling.
After some searching it doesn't seem like this is a super common problem. Is there something trivial I'm overlooking that could be causing this issue? How would you go about troubleshooting this? Thank you for your time :)
If a reference is missing, you can troubleshoot the issue as follows:
Unzip your CSPKG file, then unzip the .CSSX file inside it (just rename CSSX to zip), and verify that all references and static content are there. This lets you compare the package against what is on the VM. Also, in the 2-minute window you get over RDP, look in the Application event log for an exception and capture it, because that will be the key to finding the root cause.
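The unzip-and-compare step can be scripted. This sketch relies only on what is said above, namely that the CSPKG and the nested CSSX can both be opened as zip archives. list_package_contents and missing_files are hypothetical helper names, and the layout inside the archives is not assumed beyond file names.

```python
import io
import zipfile


def list_package_contents(cspkg):
    """Return {outer_entry: [inner entries]} for a .cspkg path or file object.

    Nested .cssx archives are opened in memory; other entries map to [].
    """
    contents = {}
    with zipfile.ZipFile(cspkg) as outer:
        for name in outer.namelist():
            if name.lower().endswith(".cssx"):
                with zipfile.ZipFile(io.BytesIO(outer.read(name))) as inner:
                    contents[name] = sorted(inner.namelist())
            else:
                contents[name] = []
    return contents


def missing_files(contents, expected):
    """Return the expected file names (e.g. DLLs) found in no nested archive."""
    present = {
        entry.replace("\\", "/").rsplit("/", 1)[-1].lower()
        for inner in contents.values()
        for entry in inner
    }
    return sorted(name for name in expected if name.lower() not in present)
```

Pass missing_files the list of assemblies your role needs at runtime; anything it returns never made it into the package.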
If you can see the exception in the event log, you can find out where it was generated. You can also use IntelliTrace, which might require you to redeploy the app.
You can also copy WinDBG to the VM and attach it to the specific process to debug it. I'm not sure how much experience you have with WinDBG or how much time you want to spend, but simply copying WinDBG to the VM and using it there would be enough.
Also been pestered by this role recycle issue numerous times. Here is the sequence of steps to debug persistent role recycles:
Debugging Azure Role Recycles
Enable Remote Access to your role - RDP login
Check eventvwr.msc (Windows Logs -> Application, App & Service Logs->Windows Azure)
Review the Azure text file logs across both C:\logs and c:\resources
Review custom logs in the Volume E: or F: for any custom role startup logging
Run AzureTools and attach to startup processes (download WinDBG, use Utils->Attach Debugger, select process - WaWorkerHost/WaIISHost, etc), use G to continue and watch debugger output for assemblies failing to load.
Installing Azure Debugging Tools via Powershell
PS> md c:\tools; Import-Module bitstransfer; Start-BitsTransfer http://dsazure.blob.core.windows.net/azuretools/AzureTools.exe c:\tools\AzureTools.exe; c:\tools\AzureTools.exe
If all the items above fail, try the other tools in the AzureTools treasure trove, such as fusion logging. The approach above will work!
WinDBG Sample Output - Failing to Locate Assembly (WaIISHost)
The most likely cause is that you have a missing assembly. One tactic to catch this is to wrap any startup processing in a master try/catch that manually logs the error to Azure storage.
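The shape of that master try/catch, sketched in Python rather than .NET. run_startup and log_sink are hypothetical names; log_sink stands for whatever writes a record to Azure storage (a table insert, a blob append), which is deliberately left out here.

```python
import traceback
from datetime import datetime, timezone


def run_startup(startup, log_sink):
    """Run startup(); on any exception, hand the details to log_sink and re-raise."""
    try:
        startup()
    except Exception as error:
        log_sink({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "error_type": type(error).__name__,
            "message": str(error),
            "stack_trace": traceback.format_exc(),
        })
        # Re-raise so the role still fails visibly instead of running half-started.
        raise
```

The record is written before the process dies, so it survives the recycle even if you never get an RDP session open in time.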
If you added any references, check that they're set to copylocal=true and that any external assets your service needs are also set to be included in the service package.
From Avkash above:
Yes. This means some issue in your worker role code is causing your worker role host process to crash. If you look at the fault stack, you should see the function or line in your code that generated this fault. If you need help, open a free Azure support incident with the Windows Azure Support team and they will help you.
Just a suggestion: also check that the installables (if any) and any other references you use are 64-bit. Azure VMs run a 64-bit OS. I was once stuck on this kind of problem because of 32/64-bit issues.
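For native DLLs you can read the target architecture straight from the PE header. This is a sketch with an assumption worth stating loudly: as far as I know, managed assemblies built as AnyCPU usually carry the x86 machine value too, so treat the result as meaningful mainly for native binaries. pe_machine is a made-up helper name.

```python
import struct

# Machine field values from the PE/COFF file header.
MACHINE_TYPES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}


def pe_machine(data):
    """Return a label for the Machine field of a PE image, given its raw bytes."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 2-byte Machine field follows the 4-byte signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_TYPES.get(machine, "unknown (0x%04X)" % machine)
```

Read the whole file in binary mode and pass the bytes in; run it over every DLL in your package and look for anything reported as 32-bit.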
Are your worker roles exiting their work loop? A local recycle is very fast and you might not notice it, but spin-up time in the cloud can be long.
If the issue is caused by a startup batch file, I have stopped the loop by editing the batch file on the instance to include "exit /b 0" at the beginning. This will tell Azure that the startup was successful and you then have all the time you need to diagnose issues without the VM getting killed.