Does an AzureML webservice overwrite reset the Data Collection Dataset?

If we have an AzureML web service endpoint that is collecting data (for Data Drift Monitoring), does overwriting the web service endpoint with a new version of the model break the link with the Dataset registered for collecting data?
The relative path to this dataset is:
<Subscription-ID>/<Resource-Group>/<Workspace>/<Webservice-Name>/<model-name>/<version>/inputs/**/inputs*.csv
If we redeploy a new version using az ml model deploy ..... --overwrite, will we need to register a new Dataset for detecting Data Drift?
If we use az ml service update .., will the Dataset reference be kept intact?

The Dataset asset is simply a reference to a location in a Datastore. Assuming the model version and service name do not change, the Dataset reference will not change either. If, however, the model version changes with every service update, then registering a Dataset with the relative path:
<Subscription-ID>/<Resource-Group>/<Workspace>/<Webservice-Name>/<model-name>/*/inputs/**/inputs*.csv
will solve the problem, because the wildcard matches every model version. Since Data Drift is just another service referencing this Dataset asset, it will keep working as expected.
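For reference, this is roughly what registering such a wildcard Dataset looks like with the azureml-core Python SDK; a minimal sketch, assuming the default blob datastore is the collection target and the dataset name is arbitrary:
from azureml.core import Workspace, Datastore, Dataset

ws = Workspace.from_config()

# The model data collector writes to the workspace's default blob datastore (assumed here).
datastore = Datastore.get(ws, "workspaceblobstore")

# Wildcard over the model version so redeployments keep matching.
path_pattern = ("<Subscription-ID>/<Resource-Group>/<Workspace>/"
                "<Webservice-Name>/<model-name>/*/inputs/**/inputs*.csv")

inputs_ds = Dataset.Tabular.from_delimited_files(path=(datastore, path_pattern))

# Register (or version) the dataset; the Data Drift monitor keeps pointing at this name.
inputs_ds.register(workspace=ws, name="webservice-inputs", create_new_version=True)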

Additional column throwing validation issue with Azure SQL data sink in Azure Data Factory

Validation Error
I've got this weird issue where validation fails on 'additional columns' for my data sink to Azure SQL coming from a blob storage source in the Azure Data Factory GUI. No matter how many times we recreate the dataset (or point to a different, newly created dataset), we can't get past this validation issue.
The irony is that we deploy these pipelines from code, and when we run them we get no errors at all. This issue has made further pipeline development really difficult because we have to do everything in code; we can't use the pipeline publish option.
Here are some screen grabs of the pipeline so you can see the flow: the pipeline overview, inside copyCustomer, and its source, mapping, and sink.
Any ideas on how to fix this validation would be greatly appreciated.
For what it's worth, we have recreated the dataset multiple times (clone and new) to avoid any issue with the dataset model not being the latest, as per what's documented here: https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-overview#add-additional-columns-during-copy
Sometimes setting the sink table to auto-create appears to 'fix' the validation, but it errors out again when we go to publish.
This is expected behavior when your Azure SQL dataset was created a long time ago and still uses an outdated dataset model that does not support additional columns.
As per the official Microsoft documentation, to resolve this issue you can simply follow the error message: create a new Azure SQL dataset and use it as the copy sink.
I followed the error message, created a new dataset, and it is working fine for me.
Screenshots: source, mapping, sink, and output.
I suspect your sink dataset type is incorrect here. I reproduced the same setup at my end and it is working fine. Kindly make sure you create the sink dataset with the Azure SQL Database connector only.
Please check the screenshots from my implementation below.
If it still doesn't help, feel free to share your sink dataset connector details along with screenshots.

How to store an ONNX machine learning model as a hexadecimal string for Azure Synapse?

I've already created a Model.onnx file; how can I convert it to hex format to make predictions with a SQL script in Azure Synapse?
I read this document but still can't understand how to do it. Can anyone explain?
Train the model in Synapse Studio, which includes machine learning libraries and Apache Spark. The main requirement here is that the model is created, registered, and used in an ONNX-compatible format.
You can deploy the ONNX model to a table in a dedicated SQL pool from Synapse Studio, which means the complete ONNX deployment can be done without leaving that environment, using a notebook inside it. Please check out the repository.
For more information, refer to this link.
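To make the conversion step concrete, here is a minimal Python sketch, assuming the model file is named Model.onnx and that the target table and column names (dbo.Models, ModelName, ModelData) are hypothetical: it reads the model bytes, renders them as a T-SQL hex literal, and builds the INSERT you would run against a VARBINARY(MAX) column before calling PREDICT from SQL.
# Minimal sketch: turn Model.onnx into a hex literal for a VARBINARY(MAX) column.
# File, table, and column names are assumptions for illustration.
with open("Model.onnx", "rb") as f:
    model_bytes = f.read()

# T-SQL binary literals are written as 0x followed by the hex digits.
hex_literal = "0x" + model_bytes.hex().upper()

insert_sql = (
    "INSERT INTO dbo.Models (ModelName, ModelData) "
    f"VALUES ('my-onnx-model', {hex_literal});"
)

print(insert_sql[:120] + "...")  # the literal itself can be several megabytes long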

Releasing new Core Data Schema to iCloud production

I have an app out in the App Store, and I am working on a lightweight migration (adding new attributes and new entities, not deleting anything). From extensive research, I know that I need to add a new version of my current Core Data Model for the local version of the data model. Anyone who updates their app and only uses the local data will automatically be migrated over.
However, I can not find anything about what happens when I update the iCloud schema (from icloud.developer.apple.com). Mainly, I'm concerned about users who are on older versions of the app and are using iCloud. When I update the schema in the iCloud website, will users on an older version of the app lose their current data or not be able to sync their data since their local schema will be different from the iCloud one?
Also, I'm using an NSPersistentCloudKitContainer for syncing the Core Data with CloudKit.
Any help is greatly appreciated as I do not want to mess up anyone's data!
No, their data will still be on iCloud and they can continue to use your app.
Once your schema is deployed to the Production environment, you cannot change or delete existing record types, so all your changes are purely additive to the current schema and do not affect users who have not updated the app yet.
I had a similar question previously and was quite anxious about updating my app's schema, but everything went well: no problems for users and no data was lost.
Do not forget to initialize your new schema from the app and deploy the changes to Production in the iCloud dashboard.
You can initialize the schema in your AppDelegate, where you set up your NSPersistentCloudKitContainer, with the following code:
// Pushes the current Core Data model to the CloudKit schema (development environment).
let options = NSPersistentCloudKitContainerSchemaInitializationOptions()
try? container.initializeCloudKitSchema(options: options)
After that you can comment these lines out until the next change to the Core Data model.
You can check that all changes were uploaded in the iCloud dashboard by clicking Deploy Schema Changes: a confirmation window lists all the model changes that will be deployed.
It is also possible to change your schema directly in the iCloud dashboard, but that is less convenient (unless you only need to add a single record type).
Since schema changes do not affect existing users, I usually promote them to Production before submitting the app for review, once all testing related to the new record types is done and I am not planning to change anything else there.

Cosmos DB - Microsoft.Azure.Documents.AddressResolver.EnsureRoutingMapPresent

I've been getting some odd issues with Cosmos DB as part of a data migration. The migration consisted of deleting and recreating our production collection and then using the Azure Cosmos DB migration tool to copy documents from our Development collection.
I wanted a full purge of the data already in the production collection rather than copying the new documents on top, so to achieve this I did the following process…
Deleted the production collection, named “Production_Products”
Recreated the Production collection with the same name and partition key
Using the Azure Cosmos DB Data Migration Tool, I copied the documents from our development collection into the newly created and empty production collection “Production_Products”
Once the migration was complete we tested the website and we kept getting the following error…
Microsoft.Azure.Documents.NotFoundException: at
Microsoft.Azure.Documents.AddressResolver.EnsureRoutingMapPresent
This was very confusing as we could query the data from Azure no problem. After multiple application restarts and checking the config we created a new collection “Production_Products_Test” and repeated the migration steps.
This worked fine. Later in the day we reverted our changes by recreating a new collection with the original name “Production_Products” and that failed. We had to revert back to using the “_Test” collection.
Can anyone offer any insight into why this is happening?
Based on the comments.
The DocumentClient maintains address caches. If you delete and recreate the collection externally (not through the DocumentClient, or at least not through that particular DocumentClient instance, since you describe having many services), the issue that can arise is that the address cache held by that instance becomes invalid. Newer versions of the SDK contain fixes that detect this and refresh the cache (see the change log here: https://learn.microsoft.com/azure/cosmos-db/sql-api-sdk-dotnet).
SDK 2.1.3 is rather old (more than two years), and the recommendation is to update it (2.10.3 is the latest at this point).
The reason for the invalidation of those caches is that when you delete and recreate, the new collection has a different ResourceId.
Having said that, there is a scenario that won't be easily fixed, and that is when your code uses ResourceIds (for example, SelfLinks) instead of names/ids to perform operations. In that case, if you are caching or holding a reference to the ResourceId of the previous collection, those requests will fail after the delete and recreate. Instead, you would need to use the names/ids through UriFactory.
Normally in these cases knowing the full stack trace of the exception (not just the name of the type) helps understand what is going on exactly.
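As a side illustration of that name-based addressing principle (not the .NET UriFactory fix itself), here is a minimal sketch with the current Python SDK (azure-cosmos); the endpoint, key, database name, item id, and partition key below are hypothetical:
from azure.cosmos import CosmosClient

# Hypothetical account endpoint and key, for illustration only.
client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")

# Address the database and container by name/id rather than by a cached
# ResourceId/SelfLink, so a deleted-and-recreated container still resolves.
database = client.get_database_client("ProductsDb")
container = database.get_container_client("Production_Products")

item = container.read_item(item="<item-id>", partition_key="<partition-key-value>")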

How to create a new Table using Table Storage in Azure

I have tried to use the samples that Roger Jennings recommended in his book, "Cloud Computing with Windows Azure", but he is using version 1. I'm using v1.2, and there are a lot of differences. Firstly, I had to recompile the StorageClient DLL with the corrected namespace and other changes. Then, when I use his code to create a table at application start, I get an "out of range index".
Has anyone managed to successfully create a Table at application startup? If so, how? Also, if there are any tutorials/samples that use version 1.2, I'd greatly appreciate them too.
You no longer have to rebuild the sample storage client library. v1.2 will automagically add three DLL references to your role:
Microsoft.WindowsAzure.Diagnostics
Microsoft.WindowsAzure.ServiceRuntime
Microsoft.WindowsAzure.StorageClient
To create a table, you'll first need to set up your table model:
Create a class deriving from TableServiceEntity (say, "MyEntity").
Derive a context class from TableServiceContext (say, "MyEntityDataServiceContext"). In that class, create a property of type DataServiceQuery<MyEntity> that returns CreateQuery<MyEntity>("MyEntities").
Once you've done that, create the table with code like this:
// Point at the local development storage account and create a table for
// each queryable entity property defined on the service context type.
var account = CloudStorageAccount.DevelopmentStorageAccount;
CloudTableClient.CreateTablesFromModel(typeof(MyEntityDataServiceContext), account.TableEndpoint.AbsoluteUri, account.Credentials);
For a much more detailed look at this, download the Azure Platform Training Kit. There's a lab called "Exploring Windows Azure Storage" that covers all this.
