I am trying to retrain the model connected to an Azure SQL database. The sample code only deals with uploading CSV files.
It appears to post the data to a web service, make another call to pull back a job ID, and then make a call to tell that job to start. I assume the job is created when the CSV is uploaded, because when I call that service directly and skip the first part, I get back an empty array. Calling the job-start endpoint just gives me a 500 error with no job ID. I basically think I need to tell it to refresh from the SQL server and then retrain, but I cannot find any code related to that.
This was difficult to find in the documentation, but I eventually found this link, which let me successfully run a batch execution without uploading a blob: https://learn.microsoft.com/en-us/azure/machine-learning/studio/web-services-that-use-import-export-modules
The key part is:
Copy and paste the C# sample code into your Program.cs file, and
remove all references to the blob storage (NOTE: you probably still want to have the blob storage for your output results, but not for uploading data)
Locate the request declaration and update the values of Web Service
Parameters that are passed to the Import Data and Export Data
modules. In this case, you use the original query, but define a new
table name.
To configure the Web Service Parameters for the import query and the destination table:
In the properties pane for the Import Data module, click the icon at the top right of the Database query field and select Set as web service parameter.
In the properties pane for the Export Data module, click the icon at the top right of the Data table name field and select Set as web service parameter.
At the bottom of the Export Data module properties pane, in the Web Service Parameters section, click Database query and rename it Query.
Click Data table name and rename it Table.
var request = new BatchExecutionRequest()
{
    GlobalParameters = new Dictionary<string, string>() {
        { "Query", @"select [age], [workclass], [fnlwgt], [education], [education-num], [marital-status], [occupation], [relationship], [race], [sex], [capital-gain], [capital-loss], [hours-per-week], [native-country], [income] from dbo.censusdata" },
        { "Table", "dbo.ScoredTable2" },
    }
};
Once you have the database as the source for your Import Data module, you do not need a web service input on the training experiment. You can also set a database query as a web service parameter. Once you run the batch execution job to retrain the models, you can store them in Azure Blob storage and have your predictive model load them from there at runtime using Load Trained Model modules rather than Trained Model modules. See this link for that procedure:
https://blogs.technet.microsoft.com/machinelearning/2017/06/19/loading-a-trained-model-dynamically-in-an-azure-ml-web-service/
So in short:
Use your SQL database as the source for your Import Data module
Run the batch execution process to retrain the model at whatever interval you want (a sketch of that call follows after this list)
Save the retrained models (.ilearner files) to blob storage, or at an HTTP address that is accessible
Use the Load Trained Model module in your predictive experiment rather than Trained Model
Put the path to your blob or URL in the parameters of the Load Trained Model module(s)
Run, publish, and test the predictive experiment with the dynamically loaded models
Note that this approach can also be used if you have multiple models in your experiment.
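For reference, here is a minimal hedged sketch of what the resulting call can look like once the blob upload is removed, following the submit-then-start pattern of the Studio-generated C# sample. The endpoint URL, the API key, and the exact request shape are placeholders you would take from your retraining web service's API help page:

// Hedged sketch: submit and start a Batch Execution Service (BES) retraining job using
// only GlobalParameters (the Query/Table web service parameters), with no blob upload.
// BaseUrl and ApiKey are placeholders copied from your service's API help page.
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Net.Http.Json;
using System.Threading.Tasks;

class RetrainJob
{
    const string BaseUrl = "https://<region>.services.azureml.net/workspaces/<workspace-id>/services/<service-id>/jobs";
    const string ApiKey  = "<your-api-key>";

    static async Task Main()
    {
        using var client = new HttpClient();
        client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", ApiKey);

        // Only the two web service parameters are sent; Import Data pulls from SQL itself.
        var request = new
        {
            GlobalParameters = new Dictionary<string, string>
            {
                { "Query", "select * from dbo.censusdata" },  // your import query
                { "Table", "dbo.ScoredTable2" }               // your destination table
            }
        };

        // 1. Submit the job; the response body is the job id.
        HttpResponseMessage submit = await client.PostAsJsonAsync(BaseUrl + "?api-version=2.0", request);
        string jobId = (await submit.Content.ReadAsStringAsync()).Trim('"');

        // 2. Start the job.
        await client.PostAsync($"{BaseUrl}/{jobId}/start?api-version=2.0", new StringContent(string.Empty));

        Console.WriteLine($"Started retraining job {jobId}");
    }
}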
I am creating an extensive Data Factory workflow that will create and fill a data warehouse for multiple customers automatically; however, I'm running into an error. I am going to post the questions first, since the remaining info is a bit long. Keep in mind I'm new to Data Factory and JSON coding.
Questions & comments
How do I correctly pass the parameter through to an Execute Pipeline activity?
How do I add said parameter to an Azure Function activity?
The issue may lie with correctly passing the parameter through, or it may lie in picking it up; I can't seem to determine which one. If you spot an error with the current setup, don't hesitate to let me know; all help is appreciated.
The Error
{
"errorCode": "BadRequest",
"message": "Operation on target FetchEntries failed: Call to provided Azure function
'' failed with status-'BadRequest' and message -
'{\"Message\":\"Please pass 'customerId' on the query string or in the request body\"}'.",
"failureType": "UserError",
"target": "ExecuteFullLoad"
}
The Setup:
The whole setup starts with a function call to get new customers from an online economic platform. It then writes them to a SQL table, from which they are processed and loaded into the final table, after which a new pipeline is executed. This process works perfectly. From there the following pipeline is executed:
As you can see, it all works well until the ForEach loop tries to execute another pipeline that contains an Azure Function, which calls a .NET scripted function that fills said warehouse (complex, I know). This Azure Function needs a customerId to retrieve tokens and load the data into the warehouse. I'm trying to pass those customer IDs from the InternalCustomerID lookup through the ForEach into the pipeline and on into the function. The ForEach actually works, but fails "because an inner activity failed".
The Execute Pipeline task contains the following settings, where I'm trying to pass the parameter through, which comes from the ForEach loop. This part of the process also works, since it executes twice (as it should in this test phase):
I don't know whether it fails to pass the parameter through successfully, or fails when adding it to the body of the Azure Function call.
The child pipeline (FullLoad) contains the following parameters. I'm not sure if I should set a default value to be overwritten or how that actually works. The guides I've looked at online haven't had a default value.
Finally, here are the settings for the Azure Function. I'm not sure what I need to write in order to correctly capture the parameter, or what to fill in (the header or the body) given the error message. I know a POST cannot be executed without a body.
If I run this specific function by hand (using the Function App section of portal.azure.com) it works fine, using the following settings:
I have reviewed all of your detailed question, and I think the key to the issue is the format of the Azure Function request body.
I'm afraid it is incorrect. Please see my steps below, based on your description:
Work Flow:
Inside ForEach Activity, only one Azure Function Activity:
The preview data of LookUp Activity:
Then the configuration of the ForEach Activity: @activity('Lookup1').output.value
The configuration of the Azure Function Activity: @json(concat('{"name":"',item().name,'"}'))
From the Azure Function, I simply output the input data. Sample output is below:
Tip: I saw that your setup executes the Azure Function in another pipeline via an Execute Pipeline activity (I don't know why you have to follow such steps), but I think it doesn't matter, because you only need to focus on the body format: if the acceptable format is JSON, you could use @json(...); if the acceptable format is a string, you could use @concat(...). Besides, you could check the sample from the ADF UI portal, which uses pipeline().parameters.
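Since the error text ("Please pass 'customerId' on the query string or in the request body") is the function's own validation message, it may also help to see the function side. The following is only a hedged C# sketch modelled on the standard HttpTrigger template, not your actual function; with a function like this, an Azure Function activity Body such as @json(concat('{"customerId":"', item().InternalCustomerID, '"}')) (where InternalCustomerID is assumed to be the column coming out of your lookup, routed through the child pipeline's parameter) would satisfy it.

// Illustrative sketch only (modelled on the standard C# HttpTrigger template, not your
// actual function): accept 'customerId' from the query string or from a JSON request body.
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;
using Newtonsoft.Json;

public static class FullLoadFunction
{
    [FunctionName("FullLoadFunction")]
    public static async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req,
        ILogger log)
    {
        // 1. Query string: ...&customerId=123
        string customerId = req.Query["customerId"];

        // 2. JSON body: {"customerId":"123"} - this is what the ADF Body setting should produce.
        string requestBody = await new StreamReader(req.Body).ReadToEndAsync();
        dynamic data = JsonConvert.DeserializeObject(requestBody);
        customerId = customerId ?? data?.customerId;

        if (string.IsNullOrEmpty(customerId))
            return new BadRequestObjectResult(
                "Please pass 'customerId' on the query string or in the request body");

        log.LogInformation($"Starting full load for customer {customerId}");
        // ... kick off the warehouse load here ...
        return new OkObjectResult(new { customerId });
    }
}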
I am deploying 50 NLP models on Azure Container Instances via the Azure Machine Learning service. All 50 models are quite similar and have the same input/output format with just the model implementation changing slightly.
I want to write a generic score.py entry file and pass in the model name as a parameter. The interface method signature does not allow a parameter in the init() method of score.py, so I moved the model loading into the run method. I am assuming the init() method runs once, whereas run(data) gets executed on every invocation, so this is probably not ideal (the models are 1 GB in size).
So how can I pass in some value to the init() method of my container to tell it what model to load?
Here is my current, working code:
from azureml.core.model import Model
import fasttext

def init():
    pass  # nothing loaded up front; the model is loaded per request in run()

def loadModel(model_name):
    model_path = Model.get_model_path(model_name)
    return fasttext.load_model(model_path)

def run(raw_data):
    # extract model_name from raw_data omitted...
    model = loadModel(model_name)
    ...
but this is what I would like to do (which breaks the interface)
def init(model_name):
    model = loadModel(model_name)

def loadModel(model_name):
    model_path = Model.get_model_path(model_name)
    return fasttext.load_model(model_path)

def run(raw_data):
    ...
If you're looking to use the same deployed container and switch models between requests, that's not the preferred design choice for the Azure Machine Learning service; we need to specify the model name to load during build/deploy.
Ideally, each deployed web-service endpoint should allow inference of one model only, with the model name defined before the container image starts building/deploying.
It is mandatory that the entry script has both init() and run(raw_data), with those exact signatures.
At the moment, we can't change the signature of the init() method to take a parameter, as in init(model_name).
The only dynamic user input you ever get to pass into this web service is via the run(raw_data) method. As you have found, given the size of your models, passing one in via run is not feasible.
init() is run first, and only once, after your web service is deployed. Even if init() took a model_name parameter, there isn't a straightforward way to call this method directly and pass your desired model name.
But, one possible solution is:
You can create a params file like the one below and store the file in Azure Blob storage.
Example runtime parameters generation script:
import pickle

params = {'model_name': 'YOUR_MODEL_NAME_TO_USE'}

with open('runtime_params.pkl', 'wb') as file:
    pickle.dump(params, file)
You'll need to use the Azure Storage Python SDK to write code that reads from your blob storage account. This is also mentioned in the official docs here.
Then you can access this from init() function in your score script.
Example score.py script:
from azure.storage.blob import BlockBlobService
import pickle

def init():
    global model
    block_blob_service = BlockBlobService(connection_string='your_connection_string')
    blob_item = block_blob_service.get_blob_to_bytes('your-container-name', 'runtime_params.pkl')
    # blob_item.content is a bytes object, so use pickle.loads here
    params = pickle.loads(blob_item.content)
    model = loadModel(params['model_name'])
You can store connection strings in Azure Key Vault for secure access. Azure ML workspaces come with built-in Key Vault integration. More info here.
With this approach, you're abstracting the runtime params config to another cloud location rather than keeping it in the container itself, so you wouldn't need to rebuild the image or redeploy the web service. Simply restarting the container will work.
If you're looking to simply re-use score.py (without changing code) for multiple model deployments in multiple containers, then here's another possible solution.
You can define the model name each web service should use in a params file and read it in score.py. You'll need to pass this file as a dependency when setting up the image config.
This would, however, need a separate params file for each container deployment.
Passing 'runtime_params.pkl' in the dependencies of your image config (a more detailed example is here):
image_config = ContainerImage.image_configuration(execution_script="score.py",
                                                  runtime="python",
                                                  conda_file="myenv.yml",
                                                  dependencies=["runtime_params.pkl"],
                                                  docker_file="Dockerfile")
Reading this in your score.py init() function:
def init():
    global model
    with open('runtime_params.pkl', 'rb') as file:
        params = pickle.load(file)
    model = loadModel(params['model_name'])
Since you're creating a new image config with this approach, you'll need to rebuild the image and redeploy the service.
I'm using the DocumentDB Data Migration Tool to migrate a DocumentDB database to a newly created DocumentDB database. Verifying the connection strings says they are OK.
It doesn't work: no data is transferred (0), but no failure is written in the log file either (Failed = 0).
Here is what has been done:
I've tried many things, such as:
migrate / transfer a collection to a JSON file
migrate to a partitioned / non-partitioned DocumentDB database
for the target indexing policy, I've used the source indexing policy (the JSON taken from Azure, in the DocumentDB collection settings)
...
Actually nothing is working, but I have no error logs. Maybe it's a problem with the DocumentDB version?
Thanks in advance for your help.
After debugging the solution from the tool's repo, I figured out that the tool fails silently if you mistype the database name, as I did.
DocumentDBClient just returns an empty async enumerator.
var database = await TryGetDatabase(databaseName, cancellation);
if (database == null)
    return EmptyAsyncEnumerator<IReadOnlyDictionary<string, object>>.Instance;
I can import from an Azure Cosmos DB DocumentDB API collection using the DocumentDB Data Migration Tool.
Besides, based on my test, if the collection name that we specify for the source DocumentDB does not exist, no data will be transferred and no error logs are written.
Import result
Please make sure the source collection that you specified exists. If possible, you can also try to create a new collection, import data from this new collection, and check whether data can be transferred.
I faced the same problem, and after some investigation found that the internal document structure had changed. Therefore, after migration with the tool, the documents are present but can't be found with the Data Explorer (although with the Query Explorer, using select *, they are visible).
I migrated the collection through the Mongo API using MongoChef instead.
@fguigui: To help troubleshoot this, could you please re-run the same data migration operation using the command-line option? Just launch dt.exe from the same folder as the Data Migration Tool to see the required syntax. Then, after you launch it with the required parameters, please paste the output here and I'll take a look at what's broken.
I modified the sample CloudReco code for my own project. I created a cloud database and got the access keys, then copied those keys into the CloudReco.cpp file. What should I use for metadata? I didn't understand this. Then, while reading the sample code, I saw this line: private static final String mServerURL = "https://ar.qualcomm.at/samples/cloudreco/json/". How do I get my metadata URL?
The Vuforia Cloud Recognition Service enables new types of applications in retail and publishing. An application using Cloud Recognition will be able to query a Cloud Database with camera images (actual recognition happens in the cloud), and then handle the matching results returned from the cloud to perform local detection and tracking.
Also, every Cloud Image Target can optionally have associated metadata; a target's metadata is essentially nothing more than a custom, user-defined blob of data that can be associated with a target and filled with custom information, as long as the data size does not exceed the allowed limits (up to 1 MB per target).
Therefore, you can use the metadata as a way to store additional content that relates to a specific target, that your application will be able to process using some custom logic.
For example, your application may use the metadata to store:
a simple text message that you want your app to display on the screen of your device when the target is detected, for example:
“Hello, I am your cloud image target XYZ, you have detected me :-) !”
a simple URL string (for instance “http://my_server/my_3d_models/my_model_01.obj”) pointing to a custom network location where you have stored some other content, like a 3D model, a video, an image, or any other custom data, so that for each different image target, your application may use such URL to download the specific content;
more generally, some custom string that your application is able to process and use to perform specific actions
a full 3D model (not just the URL pointing to a model on a server, but the model itself), for example the metadata itself could embed an .OBJ 3D model, provided that the size does not exceed the allowed limits (up to 1MB)
and more ...
How do I create/store metadata for a Cloud target?
Metadata can be uploaded together with an image target at the time you create the target itself in your Cloud Database; or you can also update the metadata of an existing target, at a later time; in either case, you can use the online TargetManager, as explained here:
https://developer.vuforia.com/resources/dev-guide/managing-targets-cloud-database-using-target-manager
or you can proceed programmatically using the VWS API, as explained here:
https://developer.vuforia.com/resources/dev-guide/managing-targets-cloud-database-using-developer-api
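For the programmatic route, the call is a signed POST to the VWS targets endpoint with both the image and the metadata base64-encoded. Below is a hedged C# sketch of that request; the JSON field names and the HMAC-SHA1 signing scheme follow the VWS documentation linked above as I understand it, and the keys and payload values are placeholders, so verify the details against that page before relying on this:

// Hedged sketch: create a cloud target with attached metadata via the VWS REST API.
// Field names and the signing scheme follow the linked VWS docs as I recall them;
// the access/secret keys and payload values are placeholders - verify before use.
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Security.Cryptography;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json;

static class VwsTargetUploader
{
    const string AccessKey = "<server-access-key>";
    const string SecretKey = "<server-secret-key>";

    public static async Task CreateTargetAsync(string name, byte[] imageBytes, string metadataText)
    {
        string body = JsonConvert.SerializeObject(new
        {
            name,
            width = 1.0,
            image = Convert.ToBase64String(imageBytes),
            application_metadata = Convert.ToBase64String(Encoding.UTF8.GetBytes(metadataText)),
            active_flag = true
        });

        DateTimeOffset now = DateTimeOffset.UtcNow;
        string date = now.UtcDateTime.ToString("r");   // RFC 1123, GMT
        string contentType = "application/json";

        string contentMd5;
        using (var md5 = MD5.Create())
            contentMd5 = BitConverter.ToString(md5.ComputeHash(Encoding.UTF8.GetBytes(body)))
                                     .Replace("-", "").ToLowerInvariant();

        // Signature over "VERB\nContent-MD5\nContent-Type\nDate\nRequest-Path".
        string stringToSign = $"POST\n{contentMd5}\n{contentType}\n{date}\n/targets";
        string signature;
        using (var hmac = new HMACSHA1(Encoding.UTF8.GetBytes(SecretKey)))
            signature = Convert.ToBase64String(hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign)));

        var request = new HttpRequestMessage(HttpMethod.Post, "https://vws.vuforia.com/targets");
        request.Headers.TryAddWithoutValidation("Authorization", $"VWS {AccessKey}:{signature}");
        request.Headers.Date = now;
        request.Content = new StringContent(body, Encoding.UTF8);
        request.Content.Headers.ContentType = new MediaTypeHeaderValue(contentType);

        using var client = new HttpClient();
        HttpResponseMessage response = await client.SendAsync(request);
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}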
How can I get the metadata of a Cloud target when it is recognized?
The Vuforia SDK offers a dedicated API to retrieve the metadata of a target in your mobile application. When a Cloud target is detected (recognized), a new TargetSearchResult is reported to the application, and the metadata can be obtained using one of these methods:
Vuforia Native SDK - C++ API: TargetSearchResult::getMetaData() - const char*
Vuforia Native SDK - Java API: TargetSearchResult.getMetaData() - String
Vuforia Unity Extension - C# API: TargetSearchResult.Metadata - string
See also the API reference pages:
https://developer.vuforia.com/resources/api/classcom_1_1qualcomm_1_1vuforia_1_1_target_search_result
https://developer.vuforia.com/resources/api/unity/struct_target_finder_1_1_target_search_result
Sample code:
For a reference sample code in native Android, see the code in the Books.java in the "Books-2-x-y" sample project.
For a reference sample code in native iOS, see the code in the BooksEAGLView.mm file in the "Books-2-x-y" sample project.
For a reference sample code in Unity, see the CloudRecoEventHandler.cs script (attached to the CloudRecognition prefab) in the Books sample; in particular, the OnNewSearchResult method shows how to get a targetSearchResult object (from which you can then get the metadata, as shown in the example code and in the sketch below).
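To make the Unity path concrete, here is a minimal hedged sketch in the spirit of the Books sample's CloudRecoEventHandler.cs; the interface, class, and property names follow that sample and the API list above, but they may differ slightly between Vuforia SDK versions (for example the metadata property appears as MetaData or Metadata depending on the version), so check against your SDK.

// Hedged Unity sketch modelled on the Books sample's CloudRecoEventHandler.cs.
// Names may differ between Vuforia SDK versions; treat this as illustrative only.
using UnityEngine;
using Vuforia;

public class MyCloudRecoHandler : MonoBehaviour, ICloudRecoEventHandler
{
    private CloudRecoBehaviour mCloudRecoBehaviour;

    void Start()
    {
        // Register this handler with the CloudRecognition behaviour in the scene.
        mCloudRecoBehaviour = GetComponent<CloudRecoBehaviour>();
        if (mCloudRecoBehaviour != null)
            mCloudRecoBehaviour.RegisterEventHandler(this);
    }

    public void OnInitialized() { }
    public void OnInitError(TargetFinder.InitState initError) { }
    public void OnUpdateError(TargetFinder.UpdateState updateError) { }
    public void OnStateChanged(bool scanning) { }

    // Called once for each cloud target that is recognized.
    public void OnNewSearchResult(TargetFinder.TargetSearchResult targetSearchResult)
    {
        // The metadata is whatever blob you uploaded with the target
        // (plain text, a URL, JSON, ...); here it is simply logged.
        string metadata = targetSearchResult.MetaData;
        Debug.Log("Recognized " + targetSearchResult.TargetName + ", metadata: " + metadata);
    }
}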
EDIT: this is in response to the first part of your question, "What should I use for metadata?" (not the second part about how to find the URL).
Based on their documentation (https://developer.vuforia.com/resources/dev-guide/cloud-targets):
The metadata is passed to the application whenever the Cloud Reco
target is recognized. It is up to the developer to determine the
content of this metadata – Vuforia treats it as a blob and just passes
it along to the application. The maximum size of the uploadable
metadata is 150kByte.
I added some debugging in their CloudRecognition app and saw that the payload (presumably the meta-data) they return when "recognizing" an image is:
{
"thumburl": "https://developer.vuforia.com/samples/cloudreco/thumbs/01_thumbnail.png",
"author": "Karina Borland",
"your price": "43.15",
"title": "Cloud Recognition in Vuforia",
"average rating": "4",
"# of ratings": "41",
"targetid": "a47d2ea6b762459bb0aed1ae9dbbe405",
"bookurl": "https://developer.vuforia.com/samples/cloudreco/book1.php",
"list price": "43.99"
}
The MetaData, uploaded along with your image-target in the CloudReco database, is a .txt-file, containing whatever you want.
What pherris linked, as payload from the sample-application, is in fact the contents of a .json-file that the given image-target's metadata links to.
In the sample application, the structure is as follows:
The application activates the camera and recognizes an image-target
The application then requests that specific image-target's metadata
In this case, the metadata in question is a .txt-file with the following content:
http://www.link-to-a-specific-json-file.com/randomname.json
The application then requests the contents of that specific .json file (a minimal sketch of this step follows below)
The specific .json file looks like the copy-pasted text data that pherris linked
The application uses the text data from the .json file to fill out the actual content of the sample application
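If your metadata follows the same pattern (a plain URL in a .txt metadata blob pointing at a JSON file), the application-side handling can be as small as the following hedged C# sketch. The JSON field names are the ones from the sample payload pherris posted above; HttpClient and Newtonsoft.Json are used purely for illustration (in a Unity app you would more likely do the download in a coroutine):

// Hedged sketch of the "metadata is just a URL" flow: download the .json file that the
// recognized target's metadata points to and read a few fields from the sample payload.
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

static class BookMetadataLoader
{
    static readonly HttpClient http = new HttpClient();

    public static async Task LoadAsync(string metadata)
    {
        // The target's metadata is assumed to contain only a URL, e.g.
        // "http://www.link-to-a-specific-json-file.com/randomname.json"
        string jsonUrl = metadata.Trim();
        string json = await http.GetStringAsync(jsonUrl);

        JObject book = JObject.Parse(json);
        Console.WriteLine($"Title:  {book["title"]}");
        Console.WriteLine($"Author: {book["author"]}");
        Console.WriteLine($"Thumb:  {book["thumburl"]}");
    }
}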
I'm trying to deploy an application on Azure but I'm facing some problems.
On my dev box all works fine, but I have a problem when I try to use the application once it is deployed.
On the dev box, I have an action that I run manually which creates the test tables in my local SQL Server Express.
But I do not know how to create the tables on the server, so when I run my website application, it says TableNotFound.
Can someone guide me through this final step? Do I need to do something additional?
Thanks in advance
The table storage client provides a method to create the schema in the cloud storage; I forget the name (will look it up in a second); call that when you initialise whatever you're using as your data service layer.
Edit: The following snippet is what I use:
StorageAccountInfo info = StorageAccountInfo.GetDefaultTableStorageAccountFromConfiguration();
TableStorage.CreateTablesFromModel( typeof( <Context> ), info );
where <Context> is your data context object.
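For reference, if you are on the later Microsoft.WindowsAzure.Storage client library rather than the old StorageClient sample library that snippet comes from, the equivalent step is to create each table explicitly at startup. This is a hedged sketch; the table name is a placeholder for whatever table your data context expects, and newer SDK versions may only expose the async variant (CreateIfNotExistsAsync):

// Hedged sketch using the Microsoft.WindowsAzure.Storage client library: ensure the
// tables exist in cloud table storage before the application starts using them.
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

static class TableBootstrap
{
    public static void EnsureTablesExist(string connectionString)
    {
        CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
        CloudTableClient tableClient = account.CreateCloudTableClient();

        // Repeat for every table your application reads or writes.
        CloudTable table = tableClient.GetTableReference("Tests");
        table.CreateIfNotExists();
    }
}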