How to get list of running VMs from AzureML - azure-machine-learning-service

I am a beginner with Python and with AzureML.
My current task is to list all running VMs (or Compute Instances) with their status and, if running, for how long they have been running.
I managed to connect to AzureML and list Subscriptions, Resource Groups and Workspaces, but I'm stuck on how to list the running VMs.
Here's the code that I have currently:
# NOTE: the imports and the two formatting constants below are assumptions,
# they were not part of the original snippet
from azure.mgmt.resource import SubscriptionClient, ResourceManagementClient
from azureml.core import Workspace
from azureml.core.compute import ComputeTarget
from azureml.exceptions import ComputeTargetException

column_width = 40   # assumed value
separator = '-' * 60  # assumed value

# get subscriptions list using credentials (credentials is assumed to be set up earlier)
subscription_client = SubscriptionClient(credentials)
sub_list = subscription_client.subscriptions.list()
print("Subscription ID".ljust(column_width) + "Display name")
print(separator)
for sub in list(sub_list):
    print(f'{sub.subscription_id:<{column_width}}{sub.display_name}')
    subscription_id = sub.subscription_id
    resource_client = ResourceManagementClient(credentials, subscription_id)
    group_list = resource_client.resource_groups.list()
    print("  Resource Groups:")
    for group in list(group_list):
        print(f"    {group.name} {group.location}")
        print("    Workspaces:")
        my_ml_client = Workspace.list(subscription_id, credentials, group.name)
        for ws in list(my_ml_client):
            try:
                print(f"      {ws}")
                if ws:
                    compute = ComputeTarget(workspace=ws, name=group.name)
                    print('Found existing compute: ' + group.name)
            except ComputeTargetException:
                pass  # no compute target named after the resource group
Please note that this is more or less a learning exercise and not the final shape of the code; I will refactor once I get it to work.
Edit: I found an easy way to do this:
workspace = Workspace(
    subscription_id=subscription_id,
    resource_group=group.name,
    workspace_name=ws,
)
print(workspace.compute_targets)
Edit 2: If anyone stumbles on this question while just beginning with Python + Azure like I did: all of this information is in the official documentation (which is just hard to follow as a beginner).
The result of workspace.compute_targets will contain both Compute Instances and other AML compute targets (such as AmlCompute clusters).
If you need to retrieve only the VMs (like I do), you need an extra step to filter the result, like this:
if isinstance(compute_list[vm], ComputeInstance):
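Putting that together, here is a minimal sketch of the filtering step, assuming the workspace object from the edit above. get_status().state is what current azureml-sdk versions expose for a ComputeInstance, but verify it against your SDK version; note that the running duration is not returned directly:
from azureml.core.compute import ComputeInstance

compute_list = workspace.compute_targets
for name, target in compute_list.items():
    if isinstance(target, ComputeInstance):
        # State is e.g. 'Running' or 'Stopped'; how long it has been running
        # is not exposed here and would need Azure Monitor / activity logs.
        print(f'{name}: {target.get_status().state}')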

Related

I want to list all resources in an organisation using a Python function in Google Cloud

I went through the official Google Cloud docs, but I don't have an idea how to use them to list the resources of a specific organization by providing the organization ID.
organizations = CloudResourceManager.Organizations.Search()
projects = emptyList()
parentsToList = queueOf(organizations)
while (parent = parentsToList.pop()) {
    // NOTE: Don't forget to iterate over paginated results.
    // TODO: handle PERMISSION_DENIED appropriately.
    projects.addAll(CloudResourceManager.Projects.List(
        "parent.type:" + parent.type + " parent.id:" + parent.id))
    parentsToList.addAll(CloudResourceManager.Folders.List(parent))
}
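Since the question asks for Python, here is a rough translation of that pseudocode, assuming the google-cloud-resource-manager v3 client library (client and method names are from that library; the organization ID is a placeholder, and pagination is handled by the returned iterators):
from google.cloud import resourcemanager_v3

projects_client = resourcemanager_v3.ProjectsClient()
folders_client = resourcemanager_v3.FoldersClient()

org = 'organizations/123456789012'  # placeholder organization ID

# Breadth-first walk over the folder tree, collecting projects under each parent.
projects = []
parents_to_list = [org]
while parents_to_list:
    parent = parents_to_list.pop()
    projects.extend(projects_client.list_projects(parent=parent))
    for folder in folders_client.list_folders(parent=parent):
        parents_to_list.append(folder.name)  # e.g. 'folders/456'

for project in projects:
    print(project.project_id, project.state)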
You can use Cloud Asset Inventory for this. I wrote this code to perform a sink into BigQuery.
import os
from google.cloud import asset_v1
from google.cloud.asset_v1.proto import asset_service_pb2

def asset_to_bq(request):
    client = asset_v1.AssetServiceClient()
    parent = 'organizations/{}'.format(os.getenv('ORGANIZATION_ID'))
    output_config = asset_service_pb2.OutputConfig()
    output_config.bigquery_destination.dataset = 'projects/{}/datasets/{}'.format(os.getenv('PROJECT_ID'),
                                                                                  os.getenv('DATASET'))
    output_config.bigquery_destination.table = 'asset_export'
    output_config.bigquery_destination.force = True
    response = client.export_assets(parent, output_config)
    # For waiting for the export to finish
    # response.result()
    # Do stuff after export
    return "done", 200

if __name__ == "__main__":
    asset_to_bq('')
Be careful if you use it: the sink must be done into an empty/not yet existing table, or you must set force to true.
In my case, some minutes after the Cloud Scheduler job that triggers my function and extracts the data to BigQuery, I have a Scheduled Query in BigQuery that copies the data to another table, to keep the history.
Note: it's also possible to configure an extract to Cloud Storage if you prefer.
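For reference, a minimal sketch of that Cloud Storage variant, reusing the client and parent from the function above (the bucket name is a placeholder):
output_config = asset_service_pb2.OutputConfig()
output_config.gcs_destination.uri = 'gs://my-asset-export-bucket/asset_export.json'  # placeholder bucket
response = client.export_assets(parent, output_config)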
I hope this is a starting point for you to achieve what you want to do.
I am able to list the projects, but I also want to list the folders and the resources under each folder, along with folder.name and tags, and I also want to specify the organization ID so I get resource information from a specific organization.
import os
from google.cloud import resource_manager

def export_resource(organizations):
    client = resource_manager.Client()
    for project in client.list_projects():
        print("%s, %s" % (project.project_id, project.status))
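If you specifically need folders, labels/tags and a single organization as the scope, a hedged alternative is Cloud Asset Inventory's ListAssets call, which takes the organization as the parent; the asset type strings and the ContentType enum should be checked against your installed google-cloud-asset version:
from google.cloud import asset_v1

client = asset_v1.AssetServiceClient()
response = client.list_assets(request={
    'parent': 'organizations/123456789012',  # placeholder organization ID
    'asset_types': ['cloudresourcemanager.googleapis.com/Folder',
                    'cloudresourcemanager.googleapis.com/Project'],
    # RESOURCE content includes the resource body, where labels live.
    'content_type': asset_v1.ContentType.RESOURCE,
})
for asset in response:
    print(asset.name, asset.asset_type)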

How to check EMR spot instance price history with boto

I'd like to create an EMR cluster programmatically using spot pricing to achieve some cost savings. To do this, I am trying to retrieve EMR spot instance pricing from AWS using boto3, but the only API available that I'm aware of in boto3 is the EC2 client's describe_spot_price_history call - https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ec2.html#EC2.Client.describe_spot_price_history
The prices from EC2 are not indicative of the pricing for EMR as seen here - https://aws.amazon.com/emr/pricing/. The values are almost double those of EMR.
Is there a way that I can see the spot price history for EMR, similar to EC2? I have checked https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/emr.html and several other pages of AWS documentation about this and have found nothing.
Here's a code snippet that I use to check the approximate price I can bid on EMR instances.
from datetime import datetime

import boto3

# The EC2 client was assumed in the original snippet; the AZ filters below imply us-east-1
ec2 = boto3.client('ec2', region_name='us-east-1')

max_bid_price = 0.140
min_bid_price = max_bid_price
az_choice = ''
response = ec2.describe_spot_price_history(
    Filters=[{
        'Name': 'availability-zone',
        'Values': ['us-east-1a', 'us-east-1c', 'us-east-1d']
    },
    {
        'Name': 'product-description',
        'Values': ['Linux/UNIX (Amazon VPC)']
    }],
    InstanceTypes=['r5.2xlarge'],
    EndTime=datetime.now(),
    StartTime=datetime.now()
)
# TODO: Add more Subnets in other AZ's if picking from our existing 3 is an issue
# 'us-east-1b', 'us-east-1e', 'us-east-1f'
for spot_price_history in response['SpotPriceHistory']:
    print(spot_price_history)
    if float(spot_price_history['SpotPrice']) <= min_bid_price:
        min_bid_price = float(spot_price_history['SpotPrice'])
        az_choice = spot_price_history['AvailabilityZone']
The above fails since the prices for EC2 spot instances are a bit higher than what Amazon would charge as the normal hourly on-demand amount for EMR instances (e.g. on-demand for a cluster of that size only costs $0.126/hour, but on-demand for EC2 is $0.504/hour and spot instances go for about $0.20/hour).
There's no such thing as EMR spot pricing, as already mentioned in the comments. Spot pricing applies to the underlying EC2 instances; the EMR charge is a separate per-instance surcharge billed on top, which is why the EC2 numbers look so different from the EMR pricing page. You can look at the AWS spot advisor page to find out which instance categories have a lower interruption rate, and choose based on that.
Since 2017, AWS has changed the algorithm for spot pricing, "where prices adjust more gradually, based on longer-term trends in supply and demand", so you probably don't need to look at the historical spot prices. More details about that can be found here.
Nowadays, you're most likely going to be fine using the last price (+ delta) for that instance. This can be achieved using the following code snippet:
# Imports added for completeness; they were not part of the original snippet
from collections import namedtuple
from datetime import datetime, timedelta

import boto3

def get_bid_price(instancetype, aws_region):
    instance_types = [instancetype]
    start = datetime.now() - timedelta(days=1)
    ec2_client = boto3.client('ec2', aws_region)
    price_dict = ec2_client.describe_spot_price_history(StartTime=start,
                                                        InstanceTypes=instance_types,
                                                        ProductDescriptions=['Linux/UNIX (Amazon VPC)'])
    if len(price_dict.get('SpotPriceHistory')) > 0:
        PriceHistory = namedtuple('PriceHistory', 'price timestamp')
        price_list = [PriceHistory(round(float(item.get('SpotPrice')), 3), item.get('Timestamp'))
                      for item in price_dict.get('SpotPriceHistory')]
        price_list.sort(key=lambda tup: tup.timestamp, reverse=True)
        # Add a small delta (here one cent) to the most recent spot price
        bid_price = round(float(price_list[0][0] + .01), 3)
        return bid_price
    else:
        raise ValueError('Invalid instance type: {} provided. '
                         'Please provide correct instance type.'.format(instancetype))
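Once you have a bid, the cluster itself can be created with spot capacity through run_job_flow; a minimal, hedged sketch (cluster name, release label, roles and instance counts are placeholders, and BidPrice can be omitted to accept the current spot price):
import boto3

emr = boto3.client('emr', region_name='us-east-1')
bid = str(get_bid_price('r5.2xlarge', 'us-east-1'))

emr.run_job_flow(
    Name='spot-example',                # placeholder
    ReleaseLabel='emr-6.3.0',           # placeholder
    Instances={
        'InstanceGroups': [
            {'Name': 'master', 'InstanceRole': 'MASTER',
             'InstanceType': 'r5.2xlarge', 'InstanceCount': 1,
             'Market': 'SPOT', 'BidPrice': bid},
            {'Name': 'core', 'InstanceRole': 'CORE',
             'InstanceType': 'r5.2xlarge', 'InstanceCount': 2,
             'Market': 'SPOT', 'BidPrice': bid},
        ],
        'KeepJobFlowAliveWhenNoSteps': True,
    },
    JobFlowRole='EMR_EC2_DefaultRole',  # default EMR roles, assumed to exist
    ServiceRole='EMR_DefaultRole',
)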

Easiest way to match security groups to their instances in Boto 3?

I have a Python list of security groups and I need to find to which EC2/RDS instances or ELBs they are associated. What's the easiest way to do this in boto3?
Also, to my best understanding, one security group can be attached to several instances, and one EC2 instance can have several security groups attached, so I need to find a way to identify those relationships to better clean things up. What I have right now is a Python list of security group objects.
This is my current code:
import boto3
import json

# regions = ["us-east-1","ap-southeast-1","ap-southeast-2","ap-northeast-1","eu-central-1","eu-west-1"]
regions = ["us-east-1"]
uncompliant_security_groups = []
for region in regions:
    ec2 = boto3.resource('ec2', region_name=region)
    sgs = list(ec2.security_groups.all())
    for sg in sgs:
        for rule in sg.ip_permissions:
            # Check if the list of IpRanges is not empty and the source IP meets the conditions
            if len(rule.get('IpRanges')) > 0 and rule.get('IpRanges')[0]['CidrIp'] == '0.0.0.0/0':
                if rule.get('FromPort') is None:
                    uncompliant_security_groups.append(sg)
                if rule.get('FromPort') is not None and rule.get('FromPort') < 1024 and rule.get('FromPort') != 80 and rule.get('FromPort') != 443:
                    uncompliant_security_groups.append(sg)
print(uncompliant_security_groups)
print(len(uncompliant_security_groups))
for sec_group in uncompliant_security_groups:
    pass  # this is where each group needs to be matched to its instances (see the sketch below)
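One way to fill in that last loop without AWS Config is to query the network interfaces each group is attached to; every attachment of a security group goes through an ENI, and the ENI's attachment and description tell you whether it belongs to an EC2 instance, an ELB or something else (RDS usually shows up via the ENI description). A hedged sketch:
ec2_client = boto3.client('ec2', region_name='us-east-1')
for sec_group in uncompliant_security_groups:
    enis = ec2_client.describe_network_interfaces(
        Filters=[{'Name': 'group-id', 'Values': [sec_group.id]}]
    )['NetworkInterfaces']
    for eni in enis:
        attachment = eni.get('Attachment', {})
        instance_id = attachment.get('InstanceId')  # set for EC2 instances
        print(sec_group.id, instance_id or eni.get('Description'))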
If you enable an AWS Config aggregator in the account (granted, you have to pay for it):
import boto3

CONFIG_CLIENT = boto3.client('config')  # assumed; the original snippet did not show client setup

account_id = '0123456789'
region = 'us-east-2'
sg_id = 'sg-0123456789'
relationship_data = CONFIG_CLIENT.get_aggregate_resource_config(
    ConfigurationAggregatorName='agg_name',
    ResourceIdentifier={
        'SourceAccountId': account_id,
        'SourceRegion': region,
        'ResourceId': sg_id,
        'ResourceType': 'AWS::EC2::SecurityGroup'
    }
)
relationship_data = relationship_data['ConfigurationItem']['relationships']
print(relationship_data)
Which should return some data like:
[
{'resourceType': 'AWS::EC2::NetworkInterface', 'resourceId': 'eni-0123456789', 'relationshipName': 'Is associated with NetworkInterface'},
{'resourceType': 'AWS::EC2::Instance', 'resourceId': 'i-0123456789', 'relationshipName': 'Is associated with Instance'},
{'resourceType': 'AWS::EC2::VPC', 'resourceId': 'vpc-0123456789', 'relationshipName': 'Is contained in Vpc'}
]
NOTE: This appears to ONLY work with AWS Config aggregators! I have no idea why that is, or whether the same data can be obtained from AWS Config by itself. However, my org uses AWS Config, so this makes that type of data available to me.
Boto3 config docs:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/config.html

How to save the output of Azure-cli commands in a variable

When using azure-cli in Python 3.5 and calling the commands from a script, I have no control over the output in the console.
When a command is executed, it prints the result to the console, but I'm struggling to just take the result and put it in a variable to analyze it.
from azure.cli.core import get_default_cli

class AzureCmd():
    def __init__(self, username, password):
        self.username = username
        self.password = password

    def login(self, tenant):
        login_successfull = get_default_cli().invoke(['login',
                                                      '--tenant', tenant,
                                                      '--username', self.username,
                                                      '--password', self.password]) == 0
        return login_successfull

    def list_vm(self, tenant):
        list_vm = get_default_cli().invoke(['vm', 'list', '--output', 'json'])
        print(list_vm)

tenant = 'mytenant.onmicrosoft.com'
cmd = AzureCmd('login', 'mypassword')
cmd.login(tenant)
cmd.list_vm(tenant)
Here is my script attempt.
What I want to achieve: not getting any output in the console when cmd.login(tenant) is executed.
Instead of getting 0 (success) or 1 (failure) in my variables login_successfull and list_vm, I want to save the output of get_default_cli().invoke() in them.
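For what it's worth, a hedged sketch of one direct approach: knack's CLI.invoke (which get_default_cli() returns) accepts an out_file parameter in recent versions, so you may be able to redirect the command's output into a buffer instead of the console (verify against your azure-cli-core version):
import io
import json
from azure.cli.core import get_default_cli

def run_az(args):
    cli = get_default_cli()
    buf = io.StringIO()
    exit_code = cli.invoke(args, out_file=buf)  # output goes to buf, not stdout
    output = buf.getvalue()
    return exit_code, json.loads(output) if output else None

code, vms = run_az(['vm', 'list', '--output', 'json'])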
I ran into the same problem and found a solution. Many people offered the standard solution that normally works in most cases, but they didn't verify it works for this scenario, and it turns out az cli is an edge case.
I think the issue has something to do with the fact that az cli is based on Python.
Win10CommandPrompt:\> where az
C:\Program Files (x86)\Microsoft SDKs\Azure\CLI2\wbin\az.cmd
If you look in that file you'll see something like this and discover that Azure CLI is just python:
python.exe -IBm azure.cli
So to do what you want to do, try this (it works for me):
import subprocess
out = subprocess.run(['python', '-IBm', 'azure.cli', '-h'], stdout=subprocess.PIPE).stdout.decode('utf-8')
print(out)
# this is equivalent to "az -h"
The above syntax won't work unless every single argument is passed as a separate string in the list. I found a syntax I like a lot more after reading how to pass multiple args with Python's subprocess:
import subprocess

# SPName and subscriptionid are assumed to be defined earlier in the script
azcmd = "az ad sp create-for-rbac --name " + SPName + " --scopes /subscriptions/" + subscriptionid
out = subprocess.run(azcmd, shell=True, stdout=subprocess.PIPE).stdout.decode('utf-8')
print(out)
I faced the same problem while trying to save the log of an Azure Container Instance. None of the above solutions worked exactly as they are. After debugging the azure cli Python code
(file: \Python39\Lib\site-packages\azure\cli\command_modules\container\custom.py, function: container_logs()), I saw that the container logs are just printed to the console but not returned. If you want to save the logs to a variable, add the return line (not exactly a great solution, but it works for now). Hoping MS Azure updates their azure cli in upcoming versions.
def container_logs(cmd, resource_group_name, name, container_name=None, follow=False):
    """Tail a container instance log. """
    container_client = cf_container(cmd.cli_ctx)
    container_group_client = cf_container_groups(cmd.cli_ctx)
    container_group = container_group_client.get(resource_group_name, name)
    # If container name is not present, use the first container.
    if container_name is None:
        container_name = container_group.containers[0].name
    if not follow:
        log = container_client.list_logs(resource_group_name, name, container_name)
        print(log.content)
        # Return the log
        return log.content
    else:
        _start_streaming(
            terminate_condition=_is_container_terminated,
            terminate_condition_args=(container_group_client, resource_group_name, name, container_name),
            shutdown_grace_period=5,
            stream_target=_stream_logs,
            stream_args=(container_client, resource_group_name, name, container_name, container_group.restart_policy))
With this modification, along with the solutions given above (using get_default_cli), we can store the log of the Azure Container Instance in a variable.
from azure.cli.core import get_default_cli

def az_cli(args_str):
    args = args_str.split()
    cli = get_default_cli()
    res = cli.invoke(args)
    if cli.result.result:
        jsondata = cli.result.result
        return jsondata
    elif cli.result.error:
        print(cli.result.error)
I think you can use subprocess and call the az cli to get the output instead of using get_default_cli.
import subprocess
import json
process = subprocess.Popen(['az','network', 'ddos-protection', 'list'], stdout=subprocess.PIPE)
out, err = process.communicate()
d = json.loads(out)
print(d)
Well, we can execute Azure CLI commands in Python as shown below.
Here, the res variable usually stores an integer (the exit code), so we might not be able to access the JSON response through it; to store the response in a variable, we need to use cli.result.result.
from azure.cli.core import get_default_cli

def az_cli(args_str):
    args = args_str.split()
    cli = get_default_cli()
    res = cli.invoke(args)
    if cli.result.result:
        jsondata = cli.result.result
        return jsondata
    elif cli.result.error:
        print(cli.result.error)
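A possible usage example for that helper, with the return line added so the parsed result actually comes back (the resource group name is a placeholder):
vms = az_cli('vm list --resource-group my-rg --output json')
if vms:
    for vm in vms:
        print(vm['name'])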

How do I get instance id/name from public ip address of VM in azure via python sdk

def get_instance_id_from_pip(self, pip):
    subscription_id = "69ff3a41-a66a-4d31-8c7d-9a1ef44595c3"
    compute_client = ComputeManagementClient(self.credentials, subscription_id)
    network_client = NetworkManagementClient(self.credentials, subscription_id)
    print("Get all public IP")
    for public_ip in network_client.public_ip_addresses.list_all():
        if public_ip.ip_address == pip:
            print(public_ip)
            # Get id
            pip_id = public_ip.id.split('/')
            print("pip id : {}".format(pip_id))
            rg_from_pip = pip_id[4].lower()
            print("RG : {}".format(rg_from_pip))
            pip_name = pip_id[-1]
            print("pip name : {}".format(pip_name))
            for vm in compute_client.virtual_machines.list_all():
                vm_id = vm.id.split('/')
                # print("vm ref id: {}".format(vm_id))
                rg_from_vm = vm_id[4].lower()
                if rg_from_pip == rg_from_vm:
                    # this is the VM in the same rg as pip
                    for ni_reference in vm.network_profile.network_interfaces:
                        ni_reference = ni_reference.id.split('/')
                        ni_name = ni_reference[8]
                        print("ni reference: {}".format(ni_reference))
                        net_interface = network_client.network_interfaces.get(rg_from_pip, ni_name)
                        print("net interface ref {}".format(net_interface))
                        public_ip_reference = net_interface.ip_configurations[0].public_ip_address
                        if public_ip_reference:
                            public_ip_reference = public_ip_reference.id.split('/')
                            ip_group = public_ip_reference[4]
                            ip_name = public_ip_reference[8]
                            print("IP group {}, IP name {}".format(ip_group, ip_name))
                            if ip_name == pip_name:
                                print("Thank god. Finallly !!!!")
                                print("VM ID :-> {}".format(vm.id))
                                return vm.id
I have the above code to get the VM instance ID from a public IP, but it's not working. What is really surprising is that, for all instances, I am getting x.public_ip_address.ip_address as a None value.
I had multiple readthedocs references for the Python SDK of Azure, but somehow none of the links are working. Good job, Azure :)
Some of them:
https://azure-sdk-for-python.readthedocs.io/en/v1.0.3/resourcemanagementcomputenetwork.html
https://azure-storage.readthedocs.io/en/latest/ref/azure.storage.file.fileservice.html
Second edit:
I got the answer to this, and the above code will return the VM ID given the public IP address. Though, as you can see, it is not an absolutely optimized answer, so better answers are welcome. Thanks!
Docs have moved here:
https://learn.microsoft.com/python/azure/
We made some redirections, but unfortunately it's not possible to do a global redirection on RTD, so some pages are 404 :/
About your trouble, I would try to use the public IP operation group directly:
https://learn.microsoft.com/en-us/python/api/azure.mgmt.network.v2017_11_01.operations.publicipaddressesoperations?view=azure-python
You get this one on the Network client, as client.public_ip_addresses
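A hedged sketch of that suggestion: look the address up directly, then walk back to the NIC, which carries a reference to its VM. Attribute names (ip_configuration on the public IP, virtual_machine on the NIC) are from azure-mgmt-network; verify them against your SDK version:
from azure.mgmt.network import NetworkManagementClient

def vm_id_from_public_ip(network_client, pip):
    for public_ip in network_client.public_ip_addresses.list_all():
        if public_ip.ip_address != pip or not public_ip.ip_configuration:
            continue
        # id looks like /subscriptions/../resourceGroups/<rg>/../networkInterfaces/<nic>/..
        parts = public_ip.ip_configuration.id.split('/')
        rg, nic_name = parts[4], parts[8]
        nic = network_client.network_interfaces.get(rg, nic_name)
        if nic.virtual_machine:  # SubResource holding the VM's id
            return nic.virtual_machine.id
    return None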
