Over the weekend, our ADF solution stopped validating.
Error message at validation:
DF_Postcode Could not load resource 'DF_Postcode'. Please ensure no
mistakes in the JSON and that referenced resources exist. Status:
UnknownError, Possible reason: undefined
This includes triggers, pipelines, and dataflows.
We did not do any deployments between Friday and this morning. Any thoughts?
-- Update --
Possibly related: starting a data flow debug session also fails.
-- Update 2 --
Multiple pop-ups appear when doing a shift+F5 refresh of the page. The error message itself is not very helpful.
It does appear a few changes were pushed to ADF over the weekend. However, as the error suggests, could you check whether the referenced resources are intact and whether any properties or values got reset, just to rule out a user configuration issue?
Check in ADF Studio that all the resources referenced in the error actually exist.
If you are using PowerShell modules at any point, make sure you are on the latest version.
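If the Studio UI itself is what's misbehaving, a quick way to confirm the resources are intact is to query them outside the portal. A rough sketch using the Az.DataFactory PowerShell module (the resource group, factory, and object names are placeholders; the data flow cmdlet assumes a reasonably recent module version):

    # Requires the Az.DataFactory module (Install-Module Az.DataFactory)
    # and an authenticated session (Connect-AzAccount).
    $rg      = "my-resource-group"    # placeholder
    $factory = "my-data-factory"      # placeholder

    # List the factory's data flows and check that 'DF_Postcode' is still present
    Get-AzDataFactoryV2DataFlow -ResourceGroupName $rg -DataFactoryName $factory | Select-Object Name

    # Triggers and pipelines can be checked the same way
    Get-AzDataFactoryV2Trigger  -ResourceGroupName $rg -DataFactoryName $factory | Select-Object Name
    Get-AzDataFactoryV2Pipeline -ResourceGroupName $rg -DataFactoryName $factory | Select-Object Name

If those calls come back clean, the JSON definitions are fine and the problem is on the portal side.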
Also, for a quick check, you can raise an issue here to get an official response.
Looking at the Service Health blade in the Azure Portal, I found an emerging issue listed.
Starting at 09:00 UTC, customers may experience errors using Azure
Data Factory in West Europe, using the Azure Portal UX. We are aware
of the issue and are investigating. Updates to follow in 60 minutes or
as events warrant. Workaround: Customers can manage Data Factory using
Azure Data Studio, Azure CLI or Powershell.
https://aka.ms/azuredatastudio
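For anyone who needs to keep pipelines moving while the portal UX is down, a minimal sketch of the PowerShell route suggested in the workaround (Az.DataFactory module; resource group, factory, and pipeline names are placeholders):

    # Kick off a pipeline run without the portal UX
    $rg      = "my-resource-group"    # placeholder
    $factory = "my-data-factory"      # placeholder

    # Placeholder pipeline name
    $runId = Invoke-AzDataFactoryV2Pipeline -ResourceGroupName $rg -DataFactoryName $factory -PipelineName "PL_Load_Postcodes"

    # Check how the run is doing
    Get-AzDataFactoryV2PipelineRun -ResourceGroupName $rg -DataFactoryName $factory -PipelineRunId $runId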
-- Update from Microsoft --
Summary of impact: Between approximately 06:30 UTC and 12:30 UTC on 13 Dec 2021, you were identified as a
customer using Data Factory V2 in West Europe who may have experienced intermittent errors when accessing
resources in this region.
Preliminary Root Cause: We determined that a backend service responsible for processing API requests became
unhealthy. This led to intermittently failing API calls for Azure Data Factory resources.
Mitigation: We restarted the backend service which mitigated the issue.
Related
A user reported a failure of one of our Blazor Server apps an hour or so ago. When I investigated, it seemed the Azure SignalR service was responding with "502 Bad Gateway" to the initial OPTIONS request of the SignalR hub negotiation (the SignalR service is separate from the web app that hosts the site).
In the Azure portal, the SignalR service shows as being in a failed state.
Restarting it does not succeed, and clicking "view activity logs" in the "the resource is in a failed state" banner simply brings up a "Code: 'invalidRG'" message.
The only significant recent event on this subscription was its conversion from Free Trial to Pay-As-You-Go, and there were some issues with the transition (the upgrade was done after the subscription had been disabled for lack of a payment method, and it took some time to get it reactivated), but everything then seemed to work well for a day.
There are many other services in the same resource group, apparently working fine - it's just SignalR. The "Azure status" page shows that all SignalR services are in "Good" condition.
Where does one go from here to diagnose and fix this? Is it a "pay for support from MS and ask them"?
Even though it wasn't a billing issue, I added a note to the end of the billing support ticket I'd raised to get a payment method problem sorted out during the subscription upgrade. Support wrote back acknowledging a problem with the Azure SignalR service that was actively being worked on; they claimed it was already resolved by the time they read my ticket update.
I don't believe the status dashboard ever showed Azure SignalR as anything other than healthy, so it might make sense to sign up for at least the Developer support level so there is a route for reporting these things. Either that, or (depending on one's moral compass) raise them as billing requests (which are free) if one feels that service availability is a billing-related thing (and I suppose it should be; they can't reasonably charge you for services they aren't providing, even if it is only a few cents).
RCA in progress
Azure Signal R - Service availability/management operation failures - Mitigated
Resolved: An Azure service issue (Tracking ID 1L_L-NZG) impacted resources in your subscription.
Summary of impact: Between 06:00 and 14:00 UTC on 21 Jul 2021, you were identified as a customer using Azure SignalR Service who may have received failure notifications when attempting to connect or access resources. Additionally, failures may have been seen when attempting to perform service management operations - such as create, update, delete.
I have had an Azure SQL DB point-in-time restore running for two days. I want to cancel it as I think there is an issue. I can see the DB restoring in SSMS but can't find the deployment in the Azure Portal. Does anyone know how to cancel it? I have tried using the Azure CLI but I can't see the resource.
It's called an Azure Hiccup; it happened to me yesterday in the Switzerland West region between 10:20 and 10:40.
I re-ran it and everything was fixed.
If I check the Activity Log I can see the error, but if I browse Service Health it says everything was good.
What to do in case of Azure Hiccups:
FIX: Re-run the task; hopefully that will fix the issue, like when you hit an old TV with your fist.
PREVENT: You can try to create an Activity Log alert, but once again it will be based on Service Health (which says everything is good) and not on the actual Activity Log, so you will probably miss issues like this and discover the problem 24 hours later.
POST-MORTEM: You can take a screenshot of the failed task/service in the Activity Log, show it to Microsoft, and ask for a refund if possible. For the future, you can check the current status of Azure on the official Status page, subscribe to the RSS feed, and browse the Azure Status History. But as I said, neither of the last two reports these Azure Hiccups, so the screenshot of the Activity Log is still the only proof that a tree fell in the forest yesterday.
Since the Microsoft SLA says that availability for Azure SQL Database and SQL Managed Instance is 99.99% of the year, you can start collecting those screenshots and opening tickets with their support.
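As a rough sketch, you can also pull the failed entries straight out of the Activity Log with PowerShell, which makes useful evidence for a support ticket (the resource group and time window below are placeholders):

    # Az.Monitor: list failed Activity Log entries for a given window
    $start = (Get-Date).AddDays(-1)    # placeholder window: the last 24 hours
    $end   = Get-Date

    Get-AzActivityLog -ResourceGroupName "my-resource-group" -StartTime $start -EndTime $end -Status "Failed" |
        Select-Object EventTimestamp, OperationName, Status, SubStatus |
        Format-Table -AutoSize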
I dropped the database this morning, and the operation status showed as unsuccessful; the restore was finally canceled 8 hours after the drop attempt.
Found a solution: just create a new database with the same name. The restoring database will be replaced by the newly created one, and then you can delete it.
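A minimal PowerShell sketch of that workaround, assuming the Az.Sql module and placeholder server/database names (the database name must match the one the stuck restore is targeting):

    # Create a database with the same name as the stuck restore target...
    $rg     = "my-resource-group"   # placeholder
    $server = "my-sql-server"       # placeholder
    $db     = "MyRestoredDb"        # placeholder: name used by the stuck restore

    New-AzSqlDatabase -ResourceGroupName $rg -ServerName $server -DatabaseName $db -Edition Basic

    # ...then delete it once it has displaced the restoring copy
    Remove-AzSqlDatabase -ResourceGroupName $rg -ServerName $server -DatabaseName $db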
While trying to test a LUIS app, it throws a 403 error with this message: "Out of call volume quota for LUIS.Authoring F0 pricing tier. Please retry after 9 days. To increase your call volume switch to a paid tier."
I am using an Azure authoring resource key on the F0 tier, which has a limit of 1 million calls/month and 5/second. Across all apps, we have made only 2,310 API calls this month, and we have not made more than 5 per second. I am not using the prediction resource key, as it has a monthly limit of 10 thousand.
I got a similar error a couple of days ago, when the message said to try again after 11 days, but it started working again for a day and then the error returned, now saying to try after 9 days. It makes me wonder whether there is another limit on a daily or weekly basis, or whether this is due to some other issue.
I read similar posts here but couldn't figure out a resolution. It would be great if anyone can share any insights on how to resolve this issue.
The 1 million/month is for authoring transactions only. These would be the programmatic calls to get intent lists, add applications, train applications, etc. This doesn't apply to actually testing the application through in-portal testing. The limit for testing predictions with the authoring key is only 1,000/month. You can just create a free tier prediction resource and associate it with your LUIS app, which will upgrade you to 10,000/month.
Microsoft has good documentation on LUIS Azure resources if you need additional information.
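If scripting it is easier, a rough sketch of creating the free prediction resource with the Az.CognitiveServices module (resource group, name, and region are placeholders; associating it with the LUIS app is still done in the LUIS portal under Manage > Azure Resources):

    # Create a free-tier (F0) LUIS prediction resource, separate from the authoring resource
    New-AzCognitiveServicesAccount -ResourceGroupName "my-resource-group" -Name "my-luis-prediction" -Type "LUIS" -SkuName "F0" -Location "westeurope"

    # Grab its key for prediction calls
    Get-AzCognitiveServicesAccountKey -ResourceGroupName "my-resource-group" -Name "my-luis-prediction"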
Thanks @billoverton for the response. I was using Power Automate to test, and the Power Automate connector for LUIS was only accepting the authoring key, which is a known issue. So instead I have called the API directly with the prediction key, and that has fixed the issue.
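For reference, this is roughly what the direct call looks like with the prediction key, following the LUIS v3 prediction endpoint shape; the endpoint, app ID, and key below are placeholders:

    # Query the LUIS v3 prediction endpoint directly using the prediction resource key
    $predictionKey      = "<prediction-resource-key>"                               # placeholder
    $predictionEndpoint = "https://my-luis-prediction.cognitiveservices.azure.com"  # placeholder
    $appId              = "<luis-app-id>"                                           # placeholder
    $query              = "book me a flight to Paris"

    $uri = "$predictionEndpoint/luis/prediction/v3.0/apps/$appId/slots/production/predict" +
           "?subscription-key=$predictionKey&query=$([uri]::EscapeDataString($query))"

    Invoke-RestMethod -Uri $uri -Method Get | ConvertTo-Json -Depth 10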
We are running scheduled Databricks jobs on a daily basis in Azure Databricks, and they run successfully every day. But today (29 Sept 2020), the job is failing within a few seconds with an internal error. The error message is given below:
Error while fetching notebook snapshot: HTTP request failed with status: HTTP/1.1 403 Forbidden
Has anyone else faced this issue and knows how to solve this?
We were able to identify and fix the issue. The jobs were set up under the user ID of a person who left the organization last weekend. Since that ID was no longer active, it didn't have access to run the jobs, so they were failing. After changing the job owner to another user ID, everything ran fine.
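In case it helps someone doing the same cleanup, the owner can also be reassigned through the Databricks Permissions REST API rather than the UI; a rough sketch (workspace URL, token, job ID, and user are placeholders, and note that PUT replaces the job's whole access control list, so include any other grants you want to keep):

    # Reassign the owner of a Databricks job via the Permissions API
    $workspaceUrl = "https://adb-1234567890123456.7.azuredatabricks.net"   # placeholder
    $token        = "<personal-access-token>"                              # placeholder
    $jobId        = 123                                                    # placeholder
    $newOwner     = "active.user@example.com"                              # placeholder

    $body = @{
        access_control_list = @(
            @{ user_name = $newOwner; permission_level = "IS_OWNER" }
        )
    } | ConvertTo-Json -Depth 5

    Invoke-RestMethod -Method Put -Uri "$workspaceUrl/api/2.0/permissions/jobs/$jobId" `
        -Headers @{ Authorization = "Bearer $token" } -Body $body -ContentType "application/json"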
This was due to a service disruption in Azure Databricks (started: September 29, 2020 00:04 UTC; resolved: September 29, 2020 04:56 UTC).
Here are the details from the status notification on the Azure Databricks Status Page:
One of the affected infrastructure components: Authentication
We are investigating an issue affecting user login.
Users may observe intermittent or consistent log in failures.
Users may notice increased latency in jobs/notebooks.
The Azure Databricks Status Page provides an overview of all core Azure Databricks services. You can view the status of a specific service there, and you can optionally subscribe to status updates on individual service components, which sends an alert whenever the status of a component you are subscribed to changes.
Reference: Azure Databricks Status page
About 10 days ago I created my first Azure SQL Database. I chose the Basic plan (4.21 €/month). This database is used only for testing purposes. Today I received an email from Microsoft Azure.
Subject of the mail: Your services were disabled because you reached your spending limit
Body of the mail: Keep building in Azure by adjusting your spending limit. Your services were disabled on May 7, 2020 because you’ve reached the monthly Azure spending limit provided by your Visual Studio subscription benefit. To keep using Azure, either:
1. Wait for your monthly spending limit to reset at the start of next month, or
2. Adjust your monthly limit for a specific month or for the life of your subscription—you only pay for the extra amount you use each month.
Why did Azure change the pricing plan of my database without notifying me? Can some action have caused this?
I know that I did an Export Data-tier Application from SQL Server Management Studio while connected to my Azure database (I made a backup from there), but I doubt that explains it.
UPDATE
As suggested by NillsF, I checked the deployment history and I can confirm I chose the Basic plan when I created the database (see below). So I still have no clue what's happening with my database.
You can check the activity log on your subscription to see who initiated the switch from Basic to vCore. It seems strange that MSFT would have done this on your behalf.
You can also check the deployment history on your resource group to verify the tier you picked when you created the resource itself.
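A quick sketch of both checks in PowerShell (the resource group name and look-back window are placeholders):

    # Who did what on this resource group recently? (Az.Monitor)
    Get-AzActivityLog -ResourceGroupName "my-resource-group" -StartTime (Get-Date).AddDays(-14) |
        Select-Object EventTimestamp, Caller, OperationName, Status |
        Format-Table -AutoSize

    # What was deployed, and with which parameters (e.g. the SKU/tier)?
    Get-AzResourceGroupDeployment -ResourceGroupName "my-resource-group" |
        Select-Object DeploymentName, Timestamp, ProvisioningState, Parameters

The Caller column on the activity log entry should show whether the tier change was initiated by a user, a service principal, or the platform itself.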