Underlying pipeline differences between Batch Endpoints & direct pipeline runs - azure-machine-learning-service

I am looking at executing some inference jobs based on a published model, but I'm not completely clear on the differences between using a pipeline directly (Python v1 SDK) and using batch endpoints (Python v2 SDK or HTTP REST). I do not have a requirement for an endpoint to be online at all times, as all jobs will execute and shut down immediately. Most inference jobs run against a single dataset at a time.
I've investigated batch inference jobs, and the underlying pipeline logs/execution appear to match direct pipelines.
Are there any other runtime differences? Cost? Are direct pipelines being phased out?
Edit: I have the ability to interact with AML via Python & HTTP REST, so both options are in play. I've found that both pipeline endpoints and batch endpoints produce HTTP endpoints that can be consumed (a rough sketch of invoking each is below).
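For reference, a minimal sketch of invoking each option from Python. The endpoint, dataset, and workspace names are placeholders, and the exact parameter names differ slightly between azure-ai-ml versions (older releases of the v2 SDK take a single input= argument rather than an inputs dict):

# v1 SDK: submit a run against a published pipeline endpoint (names are placeholders)
from azureml.core import Workspace
from azureml.pipeline.core import PipelineEndpoint

ws = Workspace.from_config()
pipeline_endpoint = PipelineEndpoint.get(workspace=ws, name="my-pipeline-endpoint")
run = pipeline_endpoint.submit("batch-scoring")  # returns a PipelineRun

# v2 SDK: invoke a batch endpoint, which (as observed above) runs a pipeline job underneath
from azure.ai.ml import MLClient, Input
from azure.identity import DefaultAzureCredential

ml_client = MLClient(DefaultAzureCredential(), "<subscription-id>", "<resource-group>", "<workspace-name>")
job = ml_client.batch_endpoints.invoke(
    endpoint_name="my-batch-endpoint",
    inputs={"input_data": Input(type="uri_folder", path="azureml:my-dataset:1")},
)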

Related

long-running job on GCP cloud run

I am reading 10 million records from BigQuery, doing some transformation, and creating a .csv file; the same .csv stream data I am uploading to an SFTP server using Node.js.
This job takes approximately 5 to 6 hours to complete locally.
The solution has been deployed on GCP Cloud Run, but after 2 to 3 seconds Cloud Run closes the container with a 503 error.
Please find below configuration of GCP Cloud Run.
Autoscaling: Up to 1 container instance
CPU allocated: default
Memory allocated: 2Gi
Concurrency: 10
Request timeout: 900 seconds
Is GCP Cloud Run a good option for a long-running background process?
You can use a VM instance with your container deployed and perform your job on it. At the end, kill or stop your VM.
But personally, I prefer a serverless solution and approach, like Cloud Run. However, long-running jobs on Cloud Run will come one day! Until then, you have to deal with the 60-minute limit or use another service.
As a workaround, I propose you use Cloud Build. Yes, Cloud Build for running any container in it. I wrote an article on this: I ran a Terraform container on Cloud Build, but in reality you can run any container.
Set the timeout correctly, take care of the default service account and its assigned roles, and, something not yet available on Cloud Run, choose the number of CPUs (1, 8 or 32) for the processing to speed up your process.
Want a bonus? You have 120 minutes free per day per billing account (be careful, it's not per project!)
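A rough sketch of that Cloud Build workaround using the Python client library. The project ID, image name, arguments, and machine type are placeholders, and the exact client API may vary by library version:

# Sketch: run an arbitrary container as a Cloud Build job with a long timeout
# and a bigger machine type. All names and values below are placeholders.
from google.cloud.devtools import cloudbuild_v1

client = cloudbuild_v1.CloudBuildClient()
build = cloudbuild_v1.Build(
    steps=[
        cloudbuild_v1.BuildStep(
            name="gcr.io/my-project/my-batch-container",  # your container image
            args=["--input", "bq://my-project.my_dataset.my_table"],
        )
    ],
    timeout={"seconds": 6 * 3600},  # Cloud Build allows much longer timeouts than Cloud Run
    options=cloudbuild_v1.BuildOptions(machine_type="E2_HIGHCPU_32"),
)
operation = client.create_build(project_id="my-project", build=build)
result = operation.result()  # blocks until the build (i.e. your job) finishes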
Update: 2021-Oct
Cloud Run supports background activities.
Configure CPU to be always-allocated if you use background activities
Background activity is anything that happens after your HTTP response has been delivered. To determine whether there is background activity in your service that is not readily apparent, check your logs for anything that is logged after the entry for the HTTP request.
Configure CPU to be always-allocated
If you want to support background activities in your Cloud Run service, set your Cloud Run service CPU to be always allocated so you can run background activities outside of requests and still have CPU access.
Is GCP Cloud Run a good option for a long-running background process?
Not a good option, because your container is 'brought to life' by an incoming HTTP request, and as soon as the container responds (e.g. sends something back), Google assumes the processing of the request is finished and cuts the CPU off.
Which may explain this:
The solution has been deployed on GCP Cloud Run, but after 2 to 3 seconds Cloud Run closes the container with a 503 error.
You can try using an Apache Beam pipeline deployed via Cloud Dataflow. Using Python, you can perform the task with the following steps:
Stage 1. Read the data from BigQuery table.
beam.io.Read(beam.io.BigQuerySource(query=your_query, use_standard_sql=True))
Stage 2. Upload Stage 1 result into a CSV file on a GCS bucket.
beam.io.WriteToText(file_path_prefix="",  # GCS path prefix for the output file(s)
                    file_name_suffix='.csv',
                    header='list of csv file headers')
Stage 3. Call a ParDo function which will then take the CSV file created in Stage 2 and upload it to the SFTP server. You can refer to this link.
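Putting the three stages together, a rough end-to-end sketch. The query, bucket, and SFTP details are placeholders; newer Beam releases expose ReadFromBigQuery in place of the older BigQuerySource shown above, and paramiko is just one possible SFTP client:

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

FIELDS = ["id", "name", "amount"]          # placeholder column list
OUTPUT = "gs://my-bucket/export/result"    # placeholder GCS prefix

options = PipelineOptions(runner="DataflowRunner", project="my-project",
                          region="us-central1", temp_location="gs://my-bucket/tmp")

# Stage 1 + 2: read from BigQuery and write a single CSV shard to GCS.
with beam.Pipeline(options=options) as p:
    (p
     | "ReadBQ" >> beam.io.ReadFromBigQuery(
           query="SELECT id, name, amount FROM `my-project.my_dataset.my_table`",
           use_standard_sql=True)
     | "ToCsv" >> beam.Map(lambda row: ",".join(str(row[f]) for f in FIELDS))
     | "WriteCsv" >> beam.io.WriteToText(
           file_path_prefix=OUTPUT,
           file_name_suffix=".csv",
           num_shards=1,                    # deterministic single output file
           header=",".join(FIELDS)))

# Stage 3: once the pipeline finishes, push the CSV to the SFTP server
# (this could equally live inside a ParDo; all credentials are placeholders).
from google.cloud import storage
import paramiko

local_path = "/tmp/result.csv"
storage.Client().bucket("my-bucket").blob(
    "export/result-00000-of-00001.csv").download_to_filename(local_path)

transport = paramiko.Transport(("sftp.example.com", 22))
transport.connect(username="user", password="secret")
paramiko.SFTPClient.from_transport(transport).put(local_path, "/upload/result.csv")
transport.close()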
You may consider a serverless, event-driven approach:
configure a Google Storage trigger for a Cloud Function running the transformation (see the sketch below)
extract/export BigQuery to the CF trigger bucket - this is the fastest way to get BigQuery data out
Sometimes data exported that way may be too large to be suitable in that form for Cloud Function processing, due to restrictions like the max execution time (currently 9 minutes) or the 2 GB memory limitation.
In that case, you can split the original data file into smaller pieces and/or push them to Pub/Sub with a storage mirror.
All that said, we've used CF to process a billion records, from building Bloom filters to publishing data to Aerospike, in under a few minutes end to end.
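A minimal sketch of such a function: a first-generation, background-triggered Python Cloud Function fired when an export lands in the trigger bucket. The bucket names and the transformation itself are placeholders:

# main.py - deployed with a google.storage.object.finalize trigger on the export bucket, e.g.:
# gcloud functions deploy process_export --runtime python39 \
#     --trigger-resource my-export-bucket --trigger-event google.storage.object.finalize
from google.cloud import storage

def process_export(event, context):
    """Triggered when a BigQuery export file lands in the bucket."""
    bucket_name = event["bucket"]
    blob_name = event["name"]
    data = storage.Client().bucket(bucket_name).blob(blob_name).download_as_bytes()
    transformed = data.upper()  # stand-in for the real transformation
    storage.Client().bucket("my-output-bucket").blob(blob_name).upload_from_string(transformed)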
I will try to use Dataflow to create the .csv file from BigQuery and will upload that file to GCS.

Azure Pipelines: How to block pipeline A if pipeline B is running

I have two pipelines (also called "build definitions") in Azure Pipelines, one executing system tests and one executing performance tests. Both use the same test environment. I have to make sure that the performance pipeline is not triggered while the system test pipeline is running, and vice versa.
What I've tried so far: I can access the Azure DevOps REST API to check whether a build is running for a certain definition. So it would be possible for me to implement a job that executes a script before the actual pipeline runs. The script then just checks the build status of the other pipeline by polling the REST API every second and times out after e.g. 1 hour.
However, this seems quite hacky to me. Is there a better way to block a build pipeline while another one is running?
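A sketch of the polling approach described above, using the Azure DevOps Builds REST API from Python. The organization, project, definition ID, and personal access token are placeholders:

# Wait until the other pipeline (build definition) has no builds queued or in
# progress, or give up after an hour. All identifiers below are placeholders.
import time
import requests

ORG, PROJECT, OTHER_DEFINITION_ID, PAT = "my-org", "my-project", 42, "<personal-access-token>"
url = (f"https://dev.azure.com/{ORG}/{PROJECT}/_apis/build/builds"
       f"?definitions={OTHER_DEFINITION_ID}&statusFilter=inProgress,notStarted&api-version=6.0")

deadline = time.time() + 3600
while time.time() < deadline:
    resp = requests.get(url, auth=("", PAT))
    resp.raise_for_status()
    if resp.json()["count"] == 0:
        break  # environment is free, let this pipeline continue
    time.sleep(15)
else:
    raise TimeoutError("Other pipeline was still running after 1 hour")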
If your project is private, the Microsoft-hosted CI/CD parallel job limit is one free parallel job that can run for up to 60 minutes each time, until you've used 1,800 minutes (30 hours) per month.
The self-hosted CI/CD parallel job limit is one self-hosted parallel job. Additionally, for each active Visual Studio Enterprise subscriber who is a member of your organization, you get one additional self-hosted parallel job.
And currently, there isn't a setting to control the parallel job limit per agent pool. But there is a similar problem on the community forum, and an answer has been marked. I recommend checking whether that answer is helpful for you. Here is the link.

Advice needed - Running Python code on GOOGLE CLOUD PLATFORM serverless

I have Python code which reads data from one cloud system via REST API using the requests module and then writes data back to another cloud system via REST API. This code runs anywhere from 1 to 4 hours every week. Is there a place in Google Cloud Platform where I can execute this code on a periodic basis, sort of like a scheduled batch job? Is there a serverless option to do this in App Engine? I know about the App Engine cron service, but it seems to be only for calling a URL regularly. Any thoughts? Appreciate your help.
Google Cloud Scheduler could be the tool you are looking for. As it is mentioned in its documentation:
Cloud Scheduler is a fully managed enterprise-grade cron job scheduler. It allows you to schedule virtually any job, including batch, big data jobs, cloud infrastructure operations, and more. You can automate everything, including retries in case of failure to reduce manual toil and intervention.
Here you have the quickstart for Cloud Scheduler, and also another tutorial for Cron jobs.
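A sketch of creating such a schedule with the Cloud Scheduler Python client. The project, region, target URL, and cron expression are placeholders; the same job could be created in the console or with gcloud:

# Create a weekly Cloud Scheduler job that POSTs to whatever endpoint kicks
# off the sync code (all identifiers below are placeholders).
from google.cloud import scheduler_v1

client = scheduler_v1.CloudSchedulerClient()
parent = "projects/my-project/locations/us-central1"

job = scheduler_v1.Job(
    name=f"{parent}/jobs/weekly-sync",
    schedule="0 6 * * 1",            # every Monday at 06:00
    time_zone="Etc/UTC",
    http_target=scheduler_v1.HttpTarget(
        uri="https://example.com/run-sync",
        http_method=scheduler_v1.HttpMethod.POST,
    ),
)
client.create_job(parent=parent, job=job)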
You can use the Google Genomics API pipelines.run endpoint to run a long-running job on a Google Compute Engine virtual machine; it will destroy the machine when it's done. If your job will run for less than 24 hours and can handle a failure, you can use a Preemptible VM to save cost.
Pipelines: Run
https://cloud.google.com/genomics/reference/rest/v2alpha1/pipelines/run
Preemptible Virtual Machines
https://cloud.google.com/preemptible-vms/
You could use Cloud Scheduler to kick off the job.
Pipelines may be preferable to the serverless technologies, because the latter don't tend to handle long-running jobs as well.
You can use AI Platform Training to run any arbitrary Python package — it doesn’t have to be a machine learning job.

Scheduling Azure container instances on demand

I have tasks running on a VM and the following sequence of events. For scaling purposes, I need to be able to run operations on demand and possibly in parallel.
A simple sequence of events
1. Execute task
2. Task creates dataset file.
3. Start up container instance (Linux)
4. In container, execute operations on the dataset
5. Write updated dataset
6. VM consumes dataset
Environment is Azure.
Azure Files for exchanging the dataset (steps 2 and 5).
PowerShell for creating and starting the container (step 3); a Python alternative is sketched below.
PowerShell could also be used for step 4.
I do not wish to use platform-specific event handlers, as it may be necessary to port to other runtime environments. This is a simple use case which I guess many have touched on before. Does anyone have an idea whether HashiCorp Nomad could bring value? Any tips for other tooling that can bring added value?
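As referenced above, a rough sketch of step 3 using the Azure Container Instances Python SDK instead of PowerShell (the Az module's New-AzContainerGroup cmdlet is the rough equivalent). All names, images, and keys are placeholders, and newer SDK versions use begin_create_or_update as shown:

# Step 3 sketch: start a Linux container instance that mounts the Azure Files
# share used for the dataset exchange. Everything below is a placeholder.
from azure.identity import DefaultAzureCredential
from azure.mgmt.containerinstance import ContainerInstanceManagementClient
from azure.mgmt.containerinstance.models import (
    AzureFileVolume, Container, ContainerGroup, ResourceRequests,
    ResourceRequirements, Volume, VolumeMount)

client = ContainerInstanceManagementClient(DefaultAzureCredential(), "<subscription-id>")

container = Container(
    name="dataset-worker",
    image="myregistry.azurecr.io/dataset-worker:latest",
    resources=ResourceRequirements(requests=ResourceRequests(cpu=1.0, memory_in_gb=2.0)),
    volume_mounts=[VolumeMount(name="data", mount_path="/data")],
)

group = ContainerGroup(
    location="westeurope",
    os_type="Linux",
    restart_policy="Never",              # run once, then stop
    containers=[container],
    volumes=[Volume(name="data", azure_file=AzureFileVolume(
        share_name="datasets",
        storage_account_name="<storage-account>",
        storage_account_key="<storage-key>"))],
)

client.container_groups.begin_create_or_update("my-resource-group", "dataset-job", group)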

Running Locust in distributed mode on Azure functions

I am building a small utility that packages Locust, a performance testing tool (https://locust.io/), and deploys it on Azure Functions. Just a fun side project to get some hands-on experience with the serverless craze.
Here's the git repo: https://github.com/amanvirmundra/locust-serverless.
Now I am thinking that it would be great to run the Locust test in distributed mode on serverless architecture (Azure Functions consumption plan). Locust supports distributed mode, but it needs the slaves to communicate with the master using its IP. That's the problem!!
I can provision multiple functions, but I am not quite sure how I can make them talk to each other on the fly (without manual intervention).
Thinking out loud:
Somehow get the IP of the master function and pass it on to the slave functions. Not sure if that's possible in Azure Functions, but some people have figured out a way to get the IP of an Azure Function using .NET libraries. Mine is a Python version, but I am sure that if it can be done using .NET then there would be a Python way as well.
Create some sort of a VPN and map a function to a private IP. Not sure if this sort of mapping is possible in Azure.
Someone has done this using AWS Lambdas (https://github.com/FutureSharks/invokust). Ask that person or try to understand the code.
Need advice in figuring out what's possible at the same time keeping things serverless. Open to ideas and/or code contributions :)
Update
This is the current setup:
The performance test session is triggered by an HTTP request, which takes in the number of requests to make, the base URL, and the number of concurrent users to simulate.
The Locustfile defines the test setup and orchestration.
Run.py triggers the tests.
What I want to do now, is to have master/slave setup (cluster) for a massive scale perf test.
I would imagine that the master function is triggered by an HTTP request, with a similar payload.
The master will in turn trigger slaves.
When the slaves join the cluster, the performance session would start.
What you describe doesn't sound like a good use case for Azure Functions.
Functions are supposed to be:
Triggered by an event
Short running (max 10 minutes)
Stateless and ephemeral
That said, Functions are good for load testing, but the setup should be different:
You define a trigger for your Function (e.g. HTTP, or Event Hub)
Each function execution makes a given amount of requests, in parallel or sequentially, and then quits
There is an orchestrator somewhere (e.g. just a console app), who sends "commands" (HTTP call or Event) to trigger the Function
So, Functions are "multiplying" the load as per schedule defined by the orchestrator. You rely on Consumption Plan scalability to make sure that enough executions are provisioned at any given time.
The biggest difference is that function executions don't talk to each other, so they don't need IPs.
I think the mentioned example based on AWS Lambda is just calling Lambdas too; it does not set up master/client Lambdas talking to each other.
I guess my point is that you might not need the Locust framework at all, and can instead leverage the built-in capabilities of an autoscaled FaaS (a rough sketch follows).
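A sketch of the "multiplying" function this answer describes: one HTTP-triggered Python Azure Function that fires a batch of requests and quits, with an external orchestrator calling it as many times as needed. The payload fields and names are placeholders, and this sidesteps Locust entirely:

# __init__.py of an HTTP-triggered Python Azure Function. Each invocation
# fires requests_per_call GETs at base_url and returns the status-code counts.
import collections
import concurrent.futures
import json

import azure.functions as func
import requests

def main(req: func.HttpRequest) -> func.HttpResponse:
    body = req.get_json()
    base_url = body["base_url"]                  # placeholder payload field
    n = int(body.get("requests_per_call", 100))  # placeholder payload field
    workers = int(body.get("concurrency", 10))   # placeholder payload field

    def hit(_):
        return requests.get(base_url, timeout=30).status_code

    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        counts = collections.Counter(pool.map(hit, range(n)))

    return func.HttpResponse(json.dumps(dict(counts)), mimetype="application/json")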
