Ansible to update HKEY on Azure Batch node

As part of an Ansible workflow, I am looking to update the Windows images of an Azure Batch pool at runtime with Ansible in order to disable Windows Update.
I have created an Azure Batch account:
- name: Create Batch Account
  azure_rm_batchaccount:
    resource_group: MyResGroup
    name: mybatchaccount
    location: eastus
    auto_storage_account:
      name: mystorageaccountname
    pool_allocation_mode: batch_service
I know for a fact that I can use a start task on the Azure Batch pool and execute a cmd command to change the registry key so that NoAutoUpdate = 1.
I have an Ansible snippet ready:
- name: "Ensure 'Configure Automatic Updates' is set to 'Disabled'"
win_regedit:
path: HKLM:\Software\Policies\Microsoft\Windows\Windowsupdate\Au
name: "NoAutoUpdate"
data: "1"
type: dword
I would like to execute it at runtime on the Azure Batch pool.
Does anyone know how this can be achieved with Ansible?

To run something on boot in a Batch pool, include it as part of your start task (https://learn.microsoft.com/en-us/rest/api/batchservice/pool/add#starttask).
In this instance, however, you should probably just use the built-in Azure functionality to turn off automatic updates: https://learn.microsoft.com/en-us/rest/api/batchservice/pool/add#windowsconfiguration
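For illustration only, here is a minimal Python sketch (using the azure-batch SDK rather than Ansible, since azure_rm_batchaccount only creates the account, not the pool) showing both approaches: windowsConfiguration with automatic updates disabled, and a start task that writes the NoAutoUpdate value. The pool id, VM size, account URL, key, image reference, and node agent SKU below are placeholders/assumptions; in practice you would pick one of the two approaches.

# create_pool.py -- illustrative sketch, not part of the original answer.
from azure.batch import BatchServiceClient
from azure.batch.batch_auth import SharedKeyCredentials
import azure.batch.models as batchmodels

# Assumed account name, key and endpoint -- replace with your own.
credentials = SharedKeyCredentials("mybatchaccount", "<batch-account-key>")
batch_client = BatchServiceClient(
    credentials, "https://mybatchaccount.eastus.batch.azure.com"
)

pool = batchmodels.PoolAddParameter(
    id="mypool",                      # assumed pool id
    vm_size="standard_d2s_v3",        # assumed VM size
    virtual_machine_configuration=batchmodels.VirtualMachineConfiguration(
        image_reference=batchmodels.ImageReference(
            publisher="MicrosoftWindowsServer",
            offer="WindowsServer",
            sku="2019-datacenter-core",
            version="latest",
        ),
        node_agent_sku_id="batch.node.windows amd64",
        # Approach 1: let Batch disable automatic updates for you.
        windows_configuration=batchmodels.WindowsConfiguration(
            enable_automatic_updates=False
        ),
    ),
    # Approach 2: a start task that sets the registry value yourself.
    start_task=batchmodels.StartTask(
        command_line=(
            'cmd /c reg add '
            'HKLM\\Software\\Policies\\Microsoft\\Windows\\WindowsUpdate\\AU '
            '/v NoAutoUpdate /t REG_DWORD /d 1 /f'
        ),
        user_identity=batchmodels.UserIdentity(
            auto_user=batchmodels.AutoUserSpecification(
                elevation_level=batchmodels.ElevationLevel.admin
            )
        ),
        wait_for_success=True,
    ),
    target_dedicated_nodes=1,
)
batch_client.pool.add(pool)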

Related

Retrieving current job for Azure ML v2

Using the v2 Azure ML Python SDK (azure-ai-ml) how do I get an instance of the currently running job?
In v1 (azureml-core) I would do:
from azureml.core import Run

run = Run.get_context()
if isinstance(run, Run):
    print("Running on compute...")
What is the equivalent on the v2 SDK?
This is a little more involved in v2 than it was in v1. The reason is that v2 makes a clear distinction between the control plane (where you start/stop your job, deploy compute, etc.) and the data plane (where you run your data science code, load data from storage, etc.).
Jobs can do control plane operations, but they need to do that with a proper identity that was explicitly assigned to the job by the user.
Let me show you the code for this first. This script creates an MLClient and then uses that client to retrieve the job's metadata, from which it extracts the name of the user that submitted the job:
# control_plane.py
import os

from azure.ai.ml import MLClient
from azure.ai.ml.identity import AzureMLOnBehalfOfCredential


def get_ml_client():
    uri = os.environ["MLFLOW_TRACKING_URI"]
    uri_segments = uri.split("/")
    subscription_id = uri_segments[uri_segments.index("subscriptions") + 1]
    resource_group_name = uri_segments[uri_segments.index("resourceGroups") + 1]
    workspace_name = uri_segments[uri_segments.index("workspaces") + 1]
    credential = AzureMLOnBehalfOfCredential()
    client = MLClient(
        credential=credential,
        subscription_id=subscription_id,
        resource_group_name=resource_group_name,
        workspace_name=workspace_name,
    )
    return client


ml_client = get_ml_client()
this_job = ml_client.jobs.get(os.environ["MLFLOW_RUN_ID"])
print("This job was created by:", this_job.creation_context.created_by)
As you can see, the code uses a special AzureMLOnBehalfOfCredential to create the MLClient. Options that you would use locally (AzureCliCredential or InteractiveBrowserCredential) won't work for a remote job since you are not authenticated through az login or through the browser prompt on that remote run. For your credentials to be available on the remote job, you need to run the job with user_identity. And you need to retrieve the corresponding credential from the environment by using the AzureMLOnBehalfOfCredential class.
So, how do you run a job with user_identity? Below is the yaml that will achieve it:
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
type: command
command: |
  pip install azure-ai-ml
  python control_plane.py
code: code
environment:
  image: library/python:latest
compute: azureml:cpu-cluster
identity:
  type: user_identity
Note the identity section at the bottom. Also note that I am lazy and install the azure-ai-ml SDK as part of the job. In a real setting, I would of course create an environment with the package installed.
These are the valid settings for the identity type:
- aml_token: this is the default, which will not allow you to access the control plane.
- managed or managed_identity: the job will be run under the given managed identity (aka compute identity). This would be accessed in your job via azure.identity.ManagedIdentityCredential. Of course, you need to grant the chosen compute identity access to the workspace so it can read job information.
- user_identity: this will run the job under the submitting user's identity. It is to be used with the azure.ai.ml.identity.AzureMLOnBehalfOfCredential credential, as shown above.
So, for your use case, you have 2 options:
- Run the job with user_identity and use the AzureMLOnBehalfOfCredential class to create the MLClient.
- Create the compute with a managed identity, give that identity access to the workspace, and then run the job with managed_identity and use the ManagedIdentityCredential class to create the MLClient (see the sketch below).
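For the second option, a minimal sketch (not from the original answer) of building the MLClient with the compute's managed identity might look like this; it reuses the MLFLOW_TRACKING_URI parsing from control_plane.py above and assumes the identity has already been granted access to the workspace:

# managed_identity_client.py -- hypothetical sketch for option 2.
import os

from azure.ai.ml import MLClient
from azure.identity import ManagedIdentityCredential

# Parse workspace coordinates from the tracking URI, as in control_plane.py above.
uri_segments = os.environ["MLFLOW_TRACKING_URI"].split("/")
subscription_id = uri_segments[uri_segments.index("subscriptions") + 1]
resource_group_name = uri_segments[uri_segments.index("resourceGroups") + 1]
workspace_name = uri_segments[uri_segments.index("workspaces") + 1]

# The compute's managed identity; pass client_id=... if the compute uses a
# user-assigned identity rather than a system-assigned one.
credential = ManagedIdentityCredential()

ml_client = MLClient(
    credential=credential,
    subscription_id=subscription_id,
    resource_group_name=resource_group_name,
    workspace_name=workspace_name,
)
this_job = ml_client.jobs.get(os.environ["MLFLOW_RUN_ID"])
print("This job was created by:", this_job.creation_context.created_by)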

prefect.io kubernetes agent and task execution

While reading the Kubernetes agent documentation, I am getting confused by the line below:
"Configure a flow-run to run as a Kubernetes Job."
Does it mean that the process which is in charge of submitting the flow and communicating with the API server will run as a Kubernetes job?
On the other hand, the use case which I am trying to solve is:
1. Set up the backend server
2. Execute a flow composed of 2 tasks
3. If k8s infra is available, the tasks should be executed as Kubernetes jobs
4. If only Docker infra is available, the tasks should be executed as Docker containers
Can somebody suggest how to solve the above scenario in prefect.io?
That's exactly right. When you use KubernetesAgent, Prefect deploys your flow runs as Kubernetes jobs.
For #1 - you can do that in your agent YAML file as follows:
env:
  - name: PREFECT__CLOUD__AGENT__AUTH_TOKEN
    value: ''
  - name: PREFECT__CLOUD__API
    value: "http://some_ip:4200/graphql"  # paste your GraphQL Server endpoint here
  - name: PREFECT__BACKEND
    value: server
#2 - write your flow
#3 and #4 - this is more challenging to do in Prefect, as there is currently no load balancing mechanism aware of your infrastructure. There are some hacky solutions that you may try, but there is no first-class way to handle this in Prefect.
One hack would be: you build a parent flow that checks your infrastructure resources and depending on the outcome, it spins up your flow run with either DockerRun or KubernetesRun run config.
from prefect import Flow, task, case
from prefect.tasks.prefect import create_flow_run, wait_for_flow_run
from prefect.run_configs import DockerRun, KubernetesRun


@task
def check_the_infrastructure():
    return "kubernetes"


with Flow("parent_flow") as flow:
    infra = check_the_infrastructure()
    with case(infra, "kubernetes"):
        child_flow_run_id = create_flow_run(
            flow_name="child_flow_name", run_config=KubernetesRun()
        )
        k8_child_flowrunview = wait_for_flow_run(
            child_flow_run_id, raise_final_state=True, stream_logs=True
        )
    with case(infra, "docker"):
        child_flow_run_id = create_flow_run(
            flow_name="child_flow_name", run_config=DockerRun()
        )
        docker_child_flowrunview = wait_for_flow_run(
            child_flow_run_id, raise_final_state=True, stream_logs=True
        )
But note that this would require you to have two agents running at all times: a Kubernetes agent and a Docker agent.

Spawn containers on ACI using @azure/arm-containerinstance

I am working on a data-processing microservice. I have this microservice dockerized and now I want to deploy it. To achieve this, I am trying to manage containers in Azure Container Instances using an Azure Function written in node.js.
The first thing I wanted to test is spawning containers within a group. My idea was:
const oldConfig = await client.containerGroups.get(
  'resourceGroup',
  'resourceName'
);
const response = await client.containerGroups.createOrUpdate(
  'resourceGroup',
  'resourceName',
  {
    osType: oldConfig.osType,
    containers: [
      ...oldConfig.containers,
      {
        name: 'test',
        image: 'hello-world',
        resources: {
          requests: {
            memoryInGB: 1,
            cpu: 1,
          },
        },
      },
    ],
  }
);
I've added osType because the docs and the interface say it's required, but when I do this I receive the error 'to update osType you need to remove and create group containers'. When I remove osType, the request is successful, but the container group does not change. I cannot recreate the whole group for every new container, because I want the containers to process jobs and terminate by themselves.
Not all properties can be updated. See the details below:
Not all container group properties can be updated. For example, to change the restart policy of a container, you must first delete the container group, then create it again.
Changes to these properties require container group deletion prior to redeployment:
- OS type
- CPU, memory, or GPU resources
- Restart policy
- Network profile
So the container group will not change after you update the osType. You need to delete the container group and recreate it with the changes. See the Update documentation for more details.
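As an illustration of that delete-then-recreate flow, here is a rough sketch using the Python management SDK (azure-mgmt-containerinstance, track-2 versions with begin_* methods) rather than the JavaScript SDK from the question; the subscription, group names, and the restart policy are assumptions:

# delete_and_recreate.py -- illustrative sketch only.
from azure.identity import DefaultAzureCredential
from azure.mgmt.containerinstance import ContainerInstanceManagementClient
from azure.mgmt.containerinstance.models import (
    Container,
    ContainerGroup,
    ResourceRequests,
    ResourceRequirements,
)

subscription_id = "<subscription-id>"                     # assumed placeholder
resource_group, group_name = "resourceGroup", "resourceName"

client = ContainerInstanceManagementClient(DefaultAzureCredential(), subscription_id)

# Read the existing group, then delete it -- osType (and the other listed
# properties) can only be changed by recreating the group.
old = client.container_groups.get(resource_group, group_name)
client.container_groups.begin_delete(resource_group, group_name).result()

new_container = Container(
    name="test",
    image="hello-world",
    resources=ResourceRequirements(
        requests=ResourceRequests(memory_in_gb=1.0, cpu=1.0)
    ),
)

new_group = ContainerGroup(
    location=old.location,
    os_type=old.os_type,
    containers=list(old.containers) + [new_container],
    restart_policy="Never",  # assumption: let job containers terminate on their own
)
client.container_groups.begin_create_or_update(
    resource_group, group_name, new_group
).result()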

ansible handler only runs once when notified from parameterized role

I have an Ansible playbook for some init services that are broadly similar, with a few tweaks. In the top-level playbook, I include the role twice, like this:
roles:
  - {role: "my-service", service: webserver}
  - {role: "my-service", service: scheduler}
The my-service role has tasks, which write init scripts, and handlers, which (re)start the service. tasks/main.yml looks like this:
- name: setup init scripts
  template: src=../../service-common/templates/my-service.conf dest=/etc/init/my-{{ service }}.conf
  notify:
    - restart my service
and handlers/main.yml has this content:
- name: restart my services
  service: name=my-{{ service }} state=restarted
But after the playbook runs, we're left with only the webserver service running, and the scheduler is stop/waiting. How can I make the handler see these as two separate notifications to be handled?
The Ansible documentation states:
Handlers are lists of tasks, not really any different from regular tasks, that are referenced by a globally unique name.
So it doesn't make use of any parameters, variables, etc. when determining when/how to invoke a handler. Only the name is used.

AWS CodeDeploy with Bamboo

We develop a NodeJS application and want to launch it in the Amazon cloud.
We have integrated Bamboo with our other Atlassian applications. Bamboo transfers the build files to an Amazon S3 bucket.
The problem is: how can I move and start the application from S3 onto the EC2 instances?
You can find my appspec.yml in the attachments; my build directory contains the following files:
- client | files like index.html etc
- server | files like the server.js and socketio.js
- appspec.yml
- readme
Does anyone have an idea? I hope this contains all the important information you need.
Thank you :D
Attachments
version: 1.0
os: linux
files:
  - source: /
    destination: /
Update
I just realized that your appspec.yml seems to lack a crucial part for the deployment of a Node.js application (and most others for that matter), namely the hooks section. As outlined in AWS CodeDeploy Application Specification Files, the AppSpec file is used to manage each deployment as a series of deployment lifecycle events:
During the deployment steps, the AWS CodeDeploy Agent will look up the current event's name in the AppSpec file's hooks section. [...] If the event is found in the hooks section, the AWS CodeDeploy Agent will retrieve the list of scripts to execute for the current step. [...]
See for example the provided AppSpec file Example (purely for illustration, you'll need to craft a custom one appropriate for your app):
os: linux
files:
  - source: Config/config.txt
    destination: webapps/Config
  - source: source
    destination: /webapps/myApp
hooks:
  BeforeInstall:
    - location: Scripts/UnzipResourceBundle.sh
    - location: Scripts/UnzipDataBundle.sh
  AfterInstall:
    - location: Scripts/RunResourceTests.sh
      timeout: 180
  ApplicationStart:
    - location: Scripts/RunFunctionalTests.sh
      timeout: 3600
  ValidateService:
    - location: Scripts/MonitorService.sh
      timeout: 3600
      runas: codedeployuser
Without such an ApplicationStart command, AWS CodeDeploy does not have any instructions for what to do with your app (remember that CodeDeploy is technology agnostic and thus needs to be told, for example, how to start the app server).
Initial Answer
The section Overview of a Deployment within What Is AWS CodeDeploy? illustrates the flow of a typical AWS CodeDeploy deployment.
The key aspect regarding your question is step 4:
Finally, the AWS CodeDeploy Agent on each participating instance pulls the revision from the specified Amazon S3 bucket or GitHub repository and starts deploying the contents to that instance, following the instructions in the AppSpec file that's provided. [emphasis mine]
That is, once you have started an AWS CodeDeploy deployment, everything should work automatically - accordingly, something seems to be configured not quite right, with the most common issue being that the deployment group does not actually contain any running instances yet. Have you verified that you can deploy to your EC2 instance from CodeDeploy via the AWS Management Console?
What do you see if you log into the Deployments list of AWS CodeDeploy console?
https://console.aws.amazon.com/codedeploy/home?region=us-east-1#/deployments
(change the region accordingly)
Also, the code will be downloaded to /opt/codedeploy-agent/deployment-root/<agent-id?>/<deployment-id>/deployment-archive,
and the logs are in /opt/codedeploy-agent/deployment-root/<agent-id?>/<deployment-id>/logs/scripts.logs.
Make sure that the agent has connectivity and permissions to download the release from the S3 bucket. That means having internet connectivity and/or using a proxy in the instance (setting http_proxy so that code_deploy uses it), and setting an IAM profile in the instance with permissions to read the S3 bucket.
Check the logs of the codedeploy agent to see if it's connecting successfully or not : /var/log/aws/codedeploy-agent/codedeploy-agent.log
You need to create a deployment in CodeDeploy and then deploy a new revision using the drop-down arrow in CodeDeploy and your S3 bucket URL. However, the bundle needs to be a zip/tar.gz/tar.
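If you would rather trigger the deployment from a script (for example as a final Bamboo task) instead of the console drop-down, a minimal boto3 sketch could look like the following; the application, deployment group, bucket, and key names are placeholders:

# trigger_deployment.py -- illustrative sketch; all names are placeholders.
import boto3

codedeploy = boto3.client("codedeploy", region_name="us-east-1")

response = codedeploy.create_deployment(
    applicationName="my-node-app",            # assumed CodeDeploy application name
    deploymentGroupName="my-node-app-group",   # assumed deployment group name
    revision={
        "revisionType": "S3",
        "s3Location": {
            "bucket": "my-bamboo-artifacts",   # the bucket Bamboo uploads to
            "key": "my-node-app.zip",          # must be a zip/tar/tar.gz bundle
            "bundleType": "zip",
        },
    },
    description="Deployment triggered from Bamboo",
)
print("Started deployment:", response["deploymentId"])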
