AWS - Neptune restore from snapshot using SDK - python-3.x

I'm trying to test restoring Neptune instances from a snapshot using python (boto3). Long story short, we want to spin up and delete the Dev instance daily using automation.
When restoring, my restore seems to only create the cluster without creating the attached instance. I have also tried creating an instance once the cluster is up and add to the cluster, but that doesn't work either. (ref: client.create_db_instance)
My code does as follows, get the most current snapshot. Use that variable to create the cluster so the most recent data is there.
import boto3
client = boto3.client('neptune')
response = client.describe_db_cluster_snapshots(
DBClusterIdentifier='neptune',
MaxRecords=100,
IncludeShared=False,
IncludePublic=False
)
snaps = response['DBClusterSnapshots']
snaps.sort(key=lambda c: c['SnapshotCreateTime'], reverse=True)
latest_snapshot = snaps[0]
snapshot_ID = latest_snapshot['DBClusterSnapshotIdentifier']
print("Latest snapshot: " + snapshot_ID)
db_response = client.restore_db_cluster_from_snapshot(
AvailabilityZones=['us-east-1c'],
DBClusterIdentifier='neptune-test',
SnapshotIdentifier=snapshot_ID,
Engine='neptune',
Port=8182,
VpcSecurityGroupIds=['sg-randomString'],
DBSubnetGroupName='default-vpc-groupID'
)
time.sleep(60)
db_instance_response = client.create_db_instance(
DBName='neptune',
DBInstanceIdentifier='brillium-neptune',
DBInstanceClass='db.r4.large',
Engine='neptune',
DBSecurityGroups=[
'sg-string',
],
AvailabilityZone='us-east-1c',
DBSubnetGroupName='default-vpc-string',
BackupRetentionPeriod=7,
Port=8182,
MultiAZ=False,
AutoMinorVersionUpgrade=True,
PubliclyAccessible=False,
DBClusterIdentifier='neptune-test',
StorageEncrypted=True
)
The documentation doesn't help much at all. It's very good at providing the variables needed for basic creation, but not the actual instance. If I attempt to create an instance using the same Cluster Name, it either errors out or creates a new cluster with the same name appended with '-1'.

If you want to programmatically do a restore from snapshot, then you need to:
Create the cluster snapshot using create-db-cluster-snapshot
Restore cluster from snapshot using restore-db-cluster-from-snapshot
Create an instance in the new cluster using create-db-instance
You mentioned that you did do a create-db-instance call in the end, but your example snippet does not have it. If that call did succeed, then you should see an instance provisioned inside that cluster.
When you do a restore from Snapshot using the Neptune Console, it does steps #2 and #3 for you.
It seems like you did the following:
Create the snapshot via CLI
Create the cluster via CLI
Create an instance in the cluster, via Console
Today, we recommend restoring the snapshot entirely via the Console or entirely using the CLI.

Related

Retrieving current job for Azure ML v2

Using the v2 Azure ML Python SDK (azure-ai-ml) how do I get an instance of the currently running job?
In v1 (azureml-core) I would do:
from azureml.core import Run
run = Run.get_context()
if isinstance(run, Run):
print("Running on compute...")
What is the equivalent on the v2 SDK?
This is a little more involved in v2 than in was in v1. The reason is that v2 makes a clear distinction between the control plane (where you start/stop your job, deploy compute, etc.) and the data plane (where you run your data science code, load data from storage, etc.).
Jobs can do control plane operations, but they need to do that with a proper identity that was explicitly assigned to the job by the user.
Let me show you the code how to do this first. This script creates an MLClient and then connects to the service using that client in order to retrieve the job's metadata from which it extracts the name of the user that submitted the job:
# control_plane.py
from azure.ai.ml import MLClient
from azure.ai.ml.identity import AzureMLOnBehalfOfCredential
import os
def get_ml_client():
uri = os.environ["MLFLOW_TRACKING_URI"]
uri_segments = uri.split("/")
subscription_id = uri_segments[uri_segments.index("subscriptions") + 1]
resource_group_name = uri_segments[uri_segments.index("resourceGroups") + 1]
workspace_name = uri_segments[uri_segments.index("workspaces") + 1]
credential = AzureMLOnBehalfOfCredential()
client = MLClient(
credential=credential,
subscription_id=subscription_id,
resource_group_name=resource_group_name,
workspace_name=workspace_name,
)
return client
ml_client = get_ml_client()
this_job = ml_client.jobs.get(os.environ["MLFLOW_RUN_ID"])
print("This job was created by:", this_job.creation_context.created_by)
As you can see, the code uses a special AzureMLOnBehalfOfCredential to create the MLClient. Options that you would use locally (AzureCliCredential or InteractiveBrowserCredential) won't work for a remote job since you are not authenticated through az login or through the browser prompt on that remote run. For your credentials to be available on the remote job, you need to run the job with user_identity. And you need to retrieve the corresponding credential from the environment by using the AzureMLOnBehalfOfCredential class.
So, how do you run a job with user_identity? Below is the yaml that will achieve it:
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
type: command
command: |
pip install azure-ai-ml
python control_plane.py
code: code
environment:
image: library/python:latest
compute: azureml:cpu-cluster
identity:
type: user_identity
Note the identity section at the bottom. Also note that I am lazy and install the azureml-ai-ml sdk as part of the job. In a real setting, I would of course create an environment with the package installed.
These are the valid settings for the identity type:
aml_token: this is the default which will not allow you to access the control plane
managed or managed_identity: this means the job will be run under the given managed identity (aka compute identity). This would be accessed in your job via azure.identity.ManagedIdentityCredential. Of course, you need to provide the chosen compute identity with access to the workspace to be able to read job information.
user_identity: this will run the job under the submitting user's identity. It is to be used with the azure.ai.ml.identity.AzureMLOnBehalfOfCredential credentials as shown above.
So, for your use case, you have 2 options:
You could run the job with user_identity and use the AzureMLOnBehalfOfCredential class to create the MLClient
You could create the compute with a managed identity which you give access to the workspace and then run the job with managed_identity and use the ManagedIdentityCredential class to create the MLClient

Azure Machine Learning compute cluster - avoid using docker?

I would like to use an Azure Machine Learning Compute Cluster as a compute target but do not want it to containerize my project. Is there a way to deactivate this "feature" ?
The main reasons behind this request is that :
I already set up a docker-compose file that is used to specify 3 containers for Apache Airflow and want to avoid a Docker-in-Docker situation. Especially that I already tried to do so but failed so far (here's the link my other related SO question).
I prefer not to use a Compute Instance as it is tied to an Azure account which is not ideal for automation purposes.
Thanks in advance !
Use the provisioning_configuration method of the AmlCompute class to specify configuration parameters.
In the following example, a persistent compute target provisioned by AmlCompute is created. The provisioning_configuration parameter in this example is of type AmlComputeProvisioningConfiguration, which is a child class of ComputeTargetProvisioningConfiguration.
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
# Choose a name for your CPU cluster
cpu_cluster_name = "cpu-cluster"
# Verify that cluster does not exist already
try:
cpu_cluster = ComputeTarget(workspace=ws, name=cpu_cluster_name)
print('Found existing cluster, use it.')
except ComputeTargetException:
compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2',
max_nodes=4)
cpu_cluster = ComputeTarget.create(ws, cpu_cluster_name, compute_config)
cpu_cluster.wait_for_completion(show_output=True)
Refer - https://learn.microsoft.com/en-us/python/api/azureml-core/azureml.core.compute.amlcompute(class)?view=azure-ml-py

Node red instance in Kubernetes with custom settings.js and other files

I am building a service which creates on demand node red instance on Kubernetes. This service needs to have custom authentication, and some other service specific data in a JSON file.
Every instance of node red will have a Persistent Volume associated with it, so one way I though of doing this was to attach the PVC with a pod and copy the files into the PV, and then start the node red deployment over the modified PVC.
I use following script to accomplish this
def paste_file_into_pod(self, src_path, dest_path):
dir_name= path.dirname(src_path)
bname = path.basename(src_path)
exec_command = ['/bin/sh', '-c', 'cd {src}; tar cf - {base}'.format(src=dir_name, base=bname)]
with tempfile.TemporaryFile() as tar_buffer:
resp = stream(self.k8_client.connect_get_namespaced_pod_exec, self.kube_methods.component_name, self.kube_methods.namespace,
command=exec_command,
stderr=True, stdin=True,
stdout=True, tty=False,
_preload_content=False)
print(resp)
while resp.is_open():
resp.update(timeout=1)
if resp.peek_stdout():
out = resp.read_stdout()
tar_buffer.write(out.encode('utf-8'))
if resp.peek_stderr():
print('STDERR: {0}'.format(resp.read_stderr()))
resp.close()
tar_buffer.flush()
tar_buffer.seek(0)
with tarfile.open(fileobj=tar_buffer, mode='r:') as tar:
subdir_and_files = [tarinfo for tarinfo in tar.getmembers()]
tar.extractall(path=dest_path, members=subdir_and_files)
This seems like a very messy way to do this. Can someone suggest a quick and easy way to start node red in Kubernetes with custom settings.js and some additional files for config?
The better approach is not to use a PV for flow storage, but to use a Storage Plugin to save flows in a central database. There are several already in existence using DBs like MongoDB
You can extend the existing Node-RED container to include a modified settings.js in /data that includes the details for the storage and authentication plugins and uses environment variables to set the instance specific at start up.
Examples here: https://www.hardill.me.uk/wordpress/tag/multi-tenant/

Is it possible to update only part of a Glue Job using AWS CLI?

I am trying to include in my CI/CD development the update of the script_location and only this parameter. AWS is asking me to include the required parameters such as RoleArn. How can I only update the part of the job configuration I want to change ?
This is what I am trying to use
aws glue update-job --job-name <job_name> --job-update Command="{ScriptLocation=s3://<s3_path_to_script>}
This is what happens :
An error occurred (InvalidInputException) when calling the UpdateJob operation: Command name should not be null or empty.
If I add the default Command Name glueetl, this is what happens :
An error occurred (InvalidInputException) when calling the UpdateJob operation: Role should not be null or empty.
An easy way to update via CLI a glue-job or a glue-trigger is using --cli-input-json option. In order to use correct json you could use aws glue update-job --generate-cli-skeleton what returns a complete structure to insert your changes.
EX:
{"JobName":"","JobUpdate":{"Description":"","LogUri":"","Role":"","ExecutionProperty":{"MaxConcurrentRuns":0},"Command":{"Name":"","ScriptLocation":"","PythonVersion":""},"DefaultArguments":{"KeyName":""},"NonOverridableArguments":{"KeyName":""},"Connections":{"Connections":[""]},"MaxRetries":0,"AllocatedCapacity":0,"Timeout":0,"MaxCapacity":null,"WorkerType":"G.1X","NumberOfWorkers":0,"SecurityConfiguration":"","NotificationProperty":{"NotifyDelayAfter":0},"GlueVersion":""}}
Well here just fill the name of the job and change the options.
After this you have to transform your json into a one-line json and send into the command using ' '
aws glue update-job --cli-input-json '<one-line-json>'
I hope help someone with this problem too.
Ref:
https://docs.aws.amazon.com/cli/latest/reference/glue/update-job.html
https://w3percentagecalculator.com/json-to-one-line-converter/
I don't know whether you've solved this problem, but I managed using this command:
aws glue update-job --job-name <gluejobname> --job-update Role=myRoleNameBB,Command="{Name=<someupdatename>,ScriptLocation=<local_filename.py>}"
You don't need the the ARN of the role, rather the role name. The example above assumes that you have a role with the name myRoleNameBB and it has access to AWS Glue.
Note: I used a local file on my laptop. Also, the "Name" in "Command" part is also compulsory.
When I run it I go this output:
{
"JobName": "<gluejobname>"
}
Based on what I have found, there is no way to update just part of the job using the update-job API.
I ran into the same issue and I provided the role to get past this error. The command worked but the update-job API actually resets other parameters to defaults such as Type of application, Job Language,Class, Timeout, Max Capacity, etc.
So if your pre-existing job is a Spark Application in scala, it will fail as AWS defaults to Python Shell and python as job language as part of the update-job API. And this API provides no way to set job Language type to scala and set a main class (required in case of scala). It provides a way to set the application type to Spark application.
If you do not want to specify the Role to the update-job API. One approach is to copy the new script with the same name and same location that your pre-existing ETL job uses and then trigger your ETL using start-job API as part of the CI process.
Second approach is to run your ETL directly and force it to use the latest script in the start-job API call:
aws glue start-job-run --job-name <job-name> --arguments=scriptLocation="<path to your latest script>"
The only caveat with the second approach is when you look in the console the ETL job will still be referencing the old script Location. The above command just forces this run of the job to use the latest script which you can confirm by looking in the History tab on the Glue ETL console.

submit job to remote hazelcast cluster

I'm new to Hazelcast Jet and have a very basic question. I have a 3-node JET cluster set up. I have a sample code to read from Kafka and drain to an IMap. When I run it from command-line (using jet-submit.sh and use JetBootstrap.getInstance() to acquire JET client instance) it works perfectly fine. When I run the same code (using Jet.newJetClient() to acquire the instance and Run As -> Java application on Eclipse), I get:
java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field com.hazelcast.jet.core.ProcessorMetaSupplier.
Could you please let me know where am I going wrong?
One of your lambda functions captures an outside variable, probably defined at class level, and that class is not Serializable or not added to the Job config when submitting from client. This is done automatically when submitting via the script.
Please see http://docs.hazelcast.org/docs/jet/0.6.1/manual/#remember-that-a-jet-job-is-distributed
When you use a client instance to submit the job, you have to add all classes that contain the code called by the job to the JobConfig:
JobConfig config = new JobConfig();
config.addClass(...);
config.addJar(...);
...
client.newJob(pipeline, config);
For example, if you use a lambda for stage.map(), the class containing the lambda has to be added.
The jet-submit.sh script makes this easier by automatically adding the entire submitted .jar file.

Resources