azure.mgmt.containerservice.ContainerServiceClient import fails with "No module named 'azure.mgmt'"

I am working on automating certain tasks related to Azure Kubernetes Service.
For this, I want to connect to AKS to list pods and to get live logs, which we normally get through kubectl.
However, when I import the Azure module as follows:
from azure.mgmt.containerservice import ContainerServiceClient
or
from azure.mgmt.kubernetesconfiguration import SourceControlConfigurationClient
it throws the exception:
ModuleNotFoundError: No module named 'azure.mgmt'
I have properly installed this module in a virtual env, and it is listed in the pip3 list output.
Is there a new way of working with AKS or the container service?
Edit -
Output of pip3 list is -
Package Version
---------------------------------- ---------
azure-common 1.1.28
azure-core 1.26.3
azure-identity 1.12.0
azure-mgmt-core 1.3.2
azure-mgmt-kubernetesconfiguration 2.0.0

From the list I don't see the package; you need to do
pip install azure-mgmt

You need to use the specific packages starting with v5.0.0; in this case you need to install
pip install azure-mgmt-containerservice
Here is the doc.

I tried this in my environment and got the results below.
I installed the latest version of the azure-mgmt-containerservice package by referring to this document, with Python version 3.10.4:
Command:
pip install azure-mgmt-containerservice==21.1.0
After installing the package in my environment, I tried the code below to get the list of pods, and it executed successfully.
Code:
from azure.identity import DefaultAzureCredential
from azure.mgmt.containerservice import ContainerServiceClient
import os
from kubernetes import client, config
credential = DefaultAzureCredential()
subscription_id = "<your subscription id>"
resource_group_name = 'your resource group name'
cluster_name = "your cluster name"
container_service_client = ContainerServiceClient(credential, subscription_id)
# getting kubeconfig in a decoded format from CredentialResult
kubeconfig = container_service_client.managed_clusters.list_cluster_user_credentials(resource_group_name, cluster_name).kubeconfigs[0].value.decode(encoding='UTF-8')
# writing generated kubeconfig in a file
f=open("kubeconfig","w")
f.write(kubeconfig)
f.close()
# loading the config file
config.load_kube_config('kubeconfig')
# deleting the kubeconfig file
os.remove('kubeconfig')
v1 = client.CoreV1Api()
ret = v1.list_pod_for_all_namespaces(watch=False)
for i in ret.items:
    print("%s\t%s\t%s" % (i.status.pod_ip, i.metadata.namespace, i.metadata.name))
Output:
ip address namespace name
10.244.x.x default azure-vote-back-7cd69cc96f-xdv79
10.244.x.x default azure-vote-front-7c95676c68-52582
10.224.x.x kube-system azure-ip-masq-agent-s6vlj
10.224.x.x kube-system cloud-node-manager-mccsv
10.244.x.x kube-system coredns-59b6bf8b4f-9nr5w
Reference:
azure-samples-python-management/samples/containerservice at main · Azure-Samples/azure-samples-python-management · GitHub
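Since the question also asks about live logs (what kubectl logs gives you), the same kubernetes client can read or stream container logs once the kubeconfig is loaded as in the snippet above. This is a minimal sketch with placeholder pod and namespace names:
from kubernetes import client

v1 = client.CoreV1Api()

# One-shot read of the current log (roughly `kubectl logs <pod>`)
log_text = v1.read_namespaced_pod_log(name="<pod-name>", namespace="<namespace>")
print(log_text)

# Streaming follow (roughly `kubectl logs -f <pod>`)
resp = v1.read_namespaced_pod_log(
    name="<pod-name>",
    namespace="<namespace>",
    follow=True,
    _preload_content=False,
)
for chunk in resp.stream():
    print(chunk.decode("utf-8"), end="")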

The problem is solved for me for the time being i.e. I am no more seeing the error.
What I did? --> Used VS Code rather than Pycharm IDE where I was getting error.
Workaround or solution? --> This is workaround. i.e. I could manage to make it working for me and proceed with my implementation.
So problem seems to be with Pycharm IDE and not sure what's the solution for it.
Any suggestions to solve this Pycharm problem is most welcome. (I will mark that answer as accepted, in that case.)
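One thing worth checking when an import works in VS Code but not in PyCharm is which interpreter PyCharm's run configuration is actually using; it often points at a different interpreter than the virtual env where the packages were installed. A quick, hedged way to verify from the failing run configuration:
# Print which interpreter and search paths this run configuration really uses.
import sys

print(sys.executable)  # should point into the virtual env where pip3 installed the packages
print(sys.prefix)
try:
    import azure.mgmt.containerservice
    print("azure-mgmt-containerservice is importable here")
except ModuleNotFoundError as exc:
    print("not importable from this interpreter:", exc)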


Using the Environment Class with Pipeline Runs

I am using an estimator step for a pipeline with the Environment class, in order to have a custom Docker image, as I need some apt-get packages to be able to install a specific pip package. It appears from the logs that, unlike the non-pipeline version of the estimator, it completely ignores the Docker portion of the environment. Very simply, this seems broken:
I'm running on SDK v1.0.65, and my Dockerfile is completely ignored. I'm using
FROM mcr.microsoft.com/azureml/base:latest\nRUN apt-get update && apt-get -y install freetds-dev freetds-bin vim gcc
in the base_dockerfile property of my code.
Here's a snippet of my code:
from azureml.core import Environment
from azureml.core.environment import CondaDependencies
conda_dep = CondaDependencies()
conda_dep.add_pip_package('pymssql==2.1.1')
myenv = Environment(name="mssqlenv")
myenv.python.conda_dependencies=conda_dep
myenv.docker.enabled = True
myenv.docker.base_dockerfile = 'FROM mcr.microsoft.com/azureml/base:latest\nRUN apt-get update && apt-get -y install freetds-dev freetds-bin vim gcc'
myenv.docker.base_image = None
This works well when I use an Estimator by itself, but if I insert this estimator in a Pipeline, it fails. Here's my code to launch it from a Pipeline run:
from azureml.pipeline.steps import EstimatorStep

sql_est_step = EstimatorStep(name="sql_step",
                             estimator=est,
                             estimator_entry_script_arguments=[],
                             runconfig_pipeline_params=None,
                             compute_target=cpu_cluster)
from azureml.pipeline.core import Pipeline
from azureml.core import Experiment
pipeline = Pipeline(workspace=ws, steps=[sql_est_step])
pipeline_run = exp.submit(pipeline)
When launching this, the logs for the container building service reveal:
FROM continuumio/miniconda3:4.4.10... etc.
which indicates it's ignoring my FROM mcr.... statement in the Environment I've associated with this Estimator, and my pip install fails.
Am I missing something? Is there a workaround?
I can confirm that this is a bug on the AML Pipeline side. Specifically, the runconfig property environment.docker.base_dockerfile is not being passed through correctly in pipeline jobs. We are working on a fix. In the meantime, you can use the workaround from this thread of building the docker image first and specifying it with environment.docker.base_image (which is passed through correctly).
I found a workaround for now, which is to build your own Docker image. You can do this by using these options of the DockerSection of the Environment:
myenv.docker.base_image_registry.address = '<your_acr>.azurecr.io'
myenv.docker.base_image_registry.username = '<your_acr>'
myenv.docker.base_image_registry.password = '<your_acr_password>'
myenv.docker.base_image = '<your_acr>.azurecr.io/testimg:latest'
and obviously use whichever Docker image you built and pushed to the container registry linked to the Azure Machine Learning workspace.
To create the image, you would run something like this at the command line of a machine that can build a Linux-based container (like a Notebook VM):
docker build . -t <your_image_name>
# Tag it for upload
docker tag <your_image_name>:latest <your_acr>.azurecr.io/<your_image_name>:latest
# Login to Azure
az login
# login to the container registry so that the push will work
az acr login --name <your_acr>
# push the image
docker push <your_acr>.azurecr.io/<your_image_name>:latest
Once the image is pushed, you should be able to get that working.
I also initially used EstimatorStep for custom images, but I recently figured out how to successfully pass an Environment first to a RunConfiguration, then to a PythonScriptStep (example below).
Another workaround, similar to yours, would be to publish your custom Docker image to Docker Hub; the docker_base_image param then becomes the URI, in our case mmlspark:0.16.
def get_environment(env_name, yml_path, user_managed_dependencies, enable_docker, docker_base_image):
    env = Environment(env_name)
    cd = CondaDependencies(yml_path)
    env.python.conda_dependencies = cd
    env.python.user_managed_dependencies = user_managed_dependencies
    env.docker.enabled = enable_docker
    env.docker.base_image = docker_base_image
    return env

spark_env = f.get_environment(env_name='spark_env',
                              yml_path=os.path.join(os.getcwd(), 'compute/aml_config/spark_compute_dependencies.yml'),
                              user_managed_dependencies=False, enable_docker=True,
                              docker_base_image='microsoft/mmlspark:0.16')

# use pyspark framework
spark_run_config = RunConfiguration(framework="pyspark")
spark_run_config.environment = spark_env

roll_step = PythonScriptStep(
    name='rolling window',
    script_name='roll.py',
    arguments=['--input_dir', joined_data,
               '--output_dir', rolled_data,
               '--script_dir', ".",
               '--min_date', '2015-06-30',
               '--pct_rank', 'True'],
    compute_target=compute_target_spark,
    inputs=[joined_data],
    outputs=[rolled_data],
    runconfig=spark_run_config,
    source_directory=os.path.join(os.getcwd(), 'compute', 'roll'),
    allow_reuse=pipeline_reuse
)
A couple of other points (that may be wrong):
PythonScriptStep is effectively a wrapper for ScriptRunConfig, which takes run_config as an argument (see the sketch after this list).
Estimator is a wrapper for ScriptRunConfig where RunConfig settings are made available as parameters.
IMHO, EstimatorStep shouldn't exist, because it is better to define Environments and Steps separately instead of at the same time in one call.
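To illustrate that wrapper relationship, here is a minimal, hedged sketch of submitting the same run configuration directly through ScriptRunConfig (SDK v1), assuming the ws workspace object and the spark_run_config defined above; the experiment name is a placeholder, and this is an illustration of the pattern rather than the author's code:
from azureml.core import Experiment, ScriptRunConfig

# ScriptRunConfig accepts the RunConfiguration (and therefore the Environment) directly;
# PythonScriptStep and Estimator are wrappers around this same mechanism.
src = ScriptRunConfig(source_directory=os.path.join(os.getcwd(), 'compute', 'roll'),
                      script='roll.py',
                      run_config=spark_run_config)

run = Experiment(workspace=ws, name='roll-standalone').submit(src)
run.wait_for_completion(show_output=True)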

Unable to run tasks on Azure Batch: nodes go into an unusable state after starting up

I am trying to parallelize a Python app using Azure Batch. The workflow that I have followed in the Python client-side script is:
1) Upload local files to an Azure Blob container using the blobxfer utility (input-container).
2) Start the Batch service to process the files in input-container after logging in with the service principal account using azure-cli.
3) Upload the files to output-container through the Python app distributed across the nodes by Azure Batch.
I am experiencing a problem very similar to the one I read here, but unfortunately no solution was given in that post:
Nodes go into Unusable State
I will now give the relevant information so that one can reproduce this error.
The image that was used for Azure Batch is custom.
1) Ubuntu Server 18.04 LTS was chosen as the OS for the VM and the following ports were opened: ssh, http, https. The rest of the settings were kept default in the Azure portal.
2) The following script was run once the server was available.
sudo apt-get install build-essential checkinstall -y
sudo apt-get install libreadline-gplv2-dev libncursesw5-dev libssl-dev \
    libsqlite3-dev tk-dev libgdbm-dev libc6-dev libbz2-dev -y
cd /usr/src
sudo wget https://www.python.org/ftp/python/3.6.6/Python-3.6.6.tgz
sudo tar xzf Python-3.6.6.tgz
cd Python-3.6.6
sudo ./configure --enable-optimizations
sudo make altinstall
sudo pip3.6 install --upgrade pip
sudo pip3.6 install pymupdf==1.13.20
sudo pip3.6 install tqdm==4.19.9
sudo pip3.6 install sentry-sdk==0.4.1
sudo pip3.6 install blobxfer==1.5.0
sudo pip3.6 install azure-cli==2.0.47
3) An image of this server was created using the process outlined in this link:
Creating VM Image in Azure Linux
Also, during deprovisioning the user was not deleted: sudo waagent -deprovision
4) The resource ID of the image was noted from the Azure portal. This will be supplied as one of the parameters in the Python client-side script.
The packages installed on the client-side server where the Python script for Batch would run:
sudo pip3.6 install tqdm==4.19.9
sudo pip3.6 install sentry-sdk==0.4.1
sudo pip3.6 install blobxfer==1.5.0
sudo pip3.6 install azure-cli==2.0.47
sudo pip3.6 install pandas==0.22.0
The resources used for Azure Batch were created in the following way:
1) A service principal account with contributor privileges was created using the command:
$ az ad sp create-for-rbac --name <SERVICE-PRINCIPAL-ACCOUNT>
2) The resource group, Batch account and storage account associated with the Batch account were created in the following way:
$ az group create --name <RESOURCE-GROUP-NAME> --location eastus2
$ az storage account create --resource-group <RESOURCE-GROUP-NAME> --name <STORAGE-ACCOUNT-NAME> --location eastus2 --sku Standard_LRS
$ az batch account create --name <BATCH-ACCOUNT-NAME> --storage-account <STORAGE-ACCOUNT-NAME> --resource-group <RESOURCE-GROUP-NAME> --location eastus2
The client-side Python script which initiates the upload and processing:
(Update 3)
import subprocess
import os
import time
import datetime
import tqdm
import pandas
import sys
import fitz
import parmap
import numpy as np
import sentry_sdk
import multiprocessing as mp

def batch_upload_local_to_azure_blob(azure_username,azure_password,azure_tenant,azure_storage_account,azure_storage_account_key,log_dir_path):
    try:
        subprocess.check_output(["az","login","--service-principal","--username",azure_username,"--password",azure_password,"--tenant",azure_tenant])
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("Invalid Azure Login Credentials")
        sys.exit("Invalid Azure Login Credentials")
    dir_flag=False
    while dir_flag==False:
        try:
            no_of_dir=input("Enter the number of directories to upload:")
            no_of_dir=int(no_of_dir)
            if no_of_dir<0:
                print("\nRetry:Enter an integer value")
            else:
                dir_flag=True
        except ValueError:
            print("\nRetry:Enter an integer value")
    dir_path_list=[]
    for dir in range(no_of_dir):
        path_exists=False
        while path_exists==False:
            dir_path=input("\nEnter the local absolute path of the directory no.{}:".format(dir+1))
            print("\n")
            dir_path=dir_path.replace('"',"")
            path_exists=os.path.isdir(dir_path)
            if path_exists==True:
                dir_path_list.append(dir_path)
            else:
                print("\nRetry:Enter a valid directory path")
    timestamp = time.time()
    timestamp_humanreadable= datetime.datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d-%H-%M-%S')
    input_azure_container="pdf-processing-input"+"-"+timestamp_humanreadable
    try:
        subprocess.check_output(["az","storage","container","create","--name",input_azure_container,"--account-name",azure_storage_account,"--auth-mode","login","--fail-on-exist"])
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("Invalid Azure Storage Credentials.")
        sys.exit("Invalid Azure Storage Credentials.")
    log_file_path=os.path.join(log_dir_path,"upload-logs"+"-"+timestamp_humanreadable+".txt")
    dir_upload_success=[]
    dir_upload_failure=[]
    for dir in tqdm.tqdm(dir_path_list,desc="Uploading Directories"):
        try:
            subprocess.check_output(["blobxfer","upload","--remote-path",input_azure_container,"--storage-account",azure_storage_account,\
                                     "--enable-azure-storage-logger","--log-file",\
                                     log_file_path,"--storage-account-key",azure_storage_account_key,"--local-path",dir])
            dir_upload_success.append(dir)
        except subprocess.CalledProcessError:
            sentry_sdk.capture_message("Failed to upload directory: {}".format(dir))
            dir_upload_failure.append(dir)
    return(input_azure_container)

def query_azure_storage(azure_storage_container,azure_storage_account,azure_storage_account_key,blob_file_path):
    try:
        blob_list=subprocess.check_output(["az","storage","blob","list","--container-name",azure_storage_container,\
                                           "--account-key",azure_storage_account_key,"--account-name",azure_storage_account,"--auth-mode","login","--output","tsv"])
        blob_list=blob_list.decode("utf-8")
        with open(blob_file_path,"w") as f:
            f.write(blob_list)
        blob_df=pandas.read_csv(blob_file_path,sep="\t",header=None)
        blob_df=blob_df.iloc[:,3]
        blob_df=blob_df.to_frame(name="container_files")
        blob_df=blob_df.assign(container=azure_storage_container)
        return(blob_df)
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("Invalid Azure Storage Credentials")
        sys.exit("Invalid Azure Storage Credentials.")

def analyze_files_for_tasks(data_split,azure_storage_container,azure_storage_account,azure_storage_account_key,download_folder):
    try:
        blob_df=data_split
        some_calculation_factor=2
        analyzed_azure_blob_df=pandas.DataFrame()
        analyzed_azure_blob_df=analyzed_azure_blob_df.assign(container="empty",container_files="empty",pages="empty",max_time="empty")
        for index,row in blob_df.iterrows():
            file_to_analyze=os.path.join(download_folder,row["container_files"])
            subprocess.check_output(["az","storage","blob","download","--container-name",azure_storage_container,"--file",file_to_analyze,"--name",row["container_files"],\
                                     "--account-name",azure_storage_account,"--auth-mode","key"]) #Why does login auth not work for this while we are multiprocessing
            doc=fitz.open(file_to_analyze)
            page_count=doc.pageCount
            analyzed_azure_blob_df=analyzed_azure_blob_df.append([{"container":azure_storage_container,"container_files":row["container_files"],"pages":page_count,"max_time":some_calculation_factor*page_count}])
            doc.close()
            os.remove(file_to_analyze)
        return(analyzed_azure_blob_df)
    except Exception as e:
        sentry_sdk.capture_exception(e)

def estimate_task_completion_time(azure_storage_container,azure_storage_account,azure_storage_account_key,azure_blob_df,azure_blob_downloads_file_path):
    try:
        cores=mp.cpu_count() #Number of CPU cores on your system
        partitions = cores-2
        timestamp = time.time()
        timestamp_humanreadable= datetime.datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d-%H-%M-%S')
        file_download_location=os.path.join(azure_blob_downloads_file_path,"Blob_Download"+"-"+timestamp_humanreadable)
        os.mkdir(file_download_location)
        data_split = np.array_split(azure_blob_df,indices_or_sections=partitions,axis=0)
        analyzed_azure_blob_df=pandas.concat(parmap.map(analyze_files_for_tasks,data_split,azure_storage_container,azure_storage_account,azure_storage_account_key,file_download_location,\
                                                        pm_pbar=True,pm_processes=partitions))
        analyzed_azure_blob_df=analyzed_azure_blob_df.reset_index(drop=True)
        return(analyzed_azure_blob_df)
    except Exception as e:
        sentry_sdk.capture_exception(e)
        sys.exit("Unable to Estimate Job Completion Status")
def azure_batch_create_pool(azure_storage_container,azure_resource_group,azure_batch_account,azure_batch_account_endpoint,azure_batch_account_key,vm_image_name,no_nodes,vm_compute_size,analyzed_azure_blob_df):
    timestamp = time.time()
    timestamp_humanreadable= datetime.datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d-%H-%M-%S')
    pool_id="pdf-processing"+"-"+timestamp_humanreadable
    try:
        subprocess.check_output(["az","batch","account","login","--name", azure_batch_account,"--resource-group",azure_resource_group])
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("Unable to log into the Batch account")
        sys.exit("Unable to log into the Batch account")
    #Pool autoscaling formula would go in here
    try:
        subprocess.check_output(["az","batch","pool","create","--account-endpoint",azure_batch_account_endpoint, \
                                 "--account-key",azure_batch_account_key,"--account-name",azure_batch_account,"--id",pool_id,\
                                 "--node-agent-sku-id","batch.node.ubuntu 18.04",\
                                 "--image",vm_image_name,"--target-low-priority-nodes",str(no_nodes),"--vm-size",vm_compute_size])
        return(pool_id)
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("Unable to create a Pool corresponding to Container:{}".format(azure_storage_container))
        sys.exit("Unable to create a Pool corresponding to Container:{}".format(azure_storage_container))

def azure_batch_create_job(azure_batch_account,azure_batch_account_key,azure_batch_account_endpoint,pool_info):
    timestamp = time.time()
    timestamp_humanreadable= datetime.datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d-%H-%M-%S')
    job_id="pdf-processing-job"+"-"+timestamp_humanreadable
    try:
        subprocess.check_output(["az","batch","job","create","--account-endpoint",azure_batch_account_endpoint,"--account-key",\
                                 azure_batch_account_key,"--account-name",azure_batch_account,"--id",job_id,"--pool-id",pool_info])
        return(job_id)
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("Unable to create a Job on the Pool :{}".format(pool_info))
        sys.exit("Unable to create a Job on the Pool :{}".format(pool_info))
def azure_batch_create_task(azure_batch_account,azure_batch_account_key,azure_batch_account_endpoint,pool_info,job_info,azure_storage_account,azure_storage_account_key,azure_storage_container,analyzed_azure_blob_df):
    print("\n")
    for i in tqdm.tqdm(range(180),desc="Waiting for the Pool to Warm-up"):
        time.sleep(1)
    successful_task_list=[]
    unsuccessful_task_list=[]
    input_azure_container=azure_storage_container
    output_azure_container= "pdf-processing-output"+"-"+input_azure_container.split("-input-")[-1]
    try:
        subprocess.check_output(["az","storage","container","create","--name",output_azure_container,"--account-name",azure_storage_account,"--auth-mode","login","--fail-on-exist"])
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("Unable to create an output container")
        sys.exit("Unable to create an output container")
    print("\n")
    pbar = tqdm.tqdm(total=analyzed_azure_blob_df.shape[0],desc="Creating and distributing Tasks")
    for index,row in analyzed_azure_blob_df.iterrows():
        try:
            task_info="mytask-"+str(index)
            subprocess.check_output(["az","batch","task","create","--task-id",task_info,"--job-id",job_info,"--command-line",\
                                     "python3 /home/avadhut/pdf_processing.py {} {} {}".format(input_azure_container,output_azure_container,row["container_files"])])
            pbar.update(1)
        except subprocess.CalledProcessError:
            sentry_sdk.capture_message("unable to create the Task: mytask-{}".format(index))
            pbar.update(1)
    pbar.close()
def wait_for_tasks_to_complete(azure_batch_account,azure_batch_account_key,azure_batch_account_endpoint,job_info,task_file_path,analyzed_azure_blob_df):
    try:
        print(analyzed_azure_blob_df)
        nrows_tasks_df=analyzed_azure_blob_df.shape[0]
        print("\n")
        pbar=tqdm.tqdm(total=nrows_tasks_df,desc="Waiting for task to complete")
        for index,row in analyzed_azure_blob_df.iterrows():
            task_list=subprocess.check_output(["az","batch","task","list","--job-id",job_info,"--account-endpoint",azure_batch_account_endpoint,"--account-key",azure_batch_account_key,"--account-name",azure_batch_account,\
                                               "--output","tsv"])
            task_list=task_list.decode("utf-8")
            with open(task_file_path,"w") as f:
                f.write(task_list)
            task_df=pandas.read_csv(task_file_path,sep="\t",header=None)
            task_df=task_df.iloc[:,21]
            active_task_list=[]
            for x in task_df:
                if x =="active":
                    active_task_list.append(x)
            if len(active_task_list)>0:
                time.sleep(row["max_time"]) #This time can be changed in accordance with the time taken to complete each task
                pbar.update(1)
                continue
            else:
                pbar.close()
                return("success")
        pbar.close()
        return("failure")
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("Error in retrieving task status")

def azure_delete_job(azure_batch_account,azure_batch_account_key,azure_batch_account_endpoint,job_info):
    try:
        subprocess.check_output(["az","batch","job","delete","--job-id",job_info,"--account-endpoint",azure_batch_account_endpoint,"--account-key",azure_batch_account_key,"--account-name",azure_batch_account,"--yes"])
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("Unable to delete Job-{}".format(job_info))

def azure_delete_pool(azure_batch_account,azure_batch_account_key,azure_batch_account_endpoint,pool_info):
    try:
        subprocess.check_output(["az","batch","pool","delete","--pool-id",pool_info,"--account-endpoint",azure_batch_account_endpoint,"--account-key",azure_batch_account_key,"--account-name",azure_batch_account,"--yes"])
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("Unable to delete Pool--{}".format(pool_info))
if __name__=="__main__":
    print("\n")
    print("-"*40+"Azure Batch processing POC"+"-"*40)
    print("\n")
    #Credentials and initializations
    sentry_sdk.init(<SENTRY-CREDENTIALS>) # Sign up for a Sentry trial account
    azure_username=<AZURE-USERNAME>
    azure_password=<AZURE-PASSWORD>
    azure_tenant=<AZURE-TENANT>
    azure_resource_group=<RESOURCE-GROUP-NAME>
    azure_storage_account=<STORAGE-ACCOUNT-NAME>
    azure_storage_account_key=<STORAGE-KEY>
    azure_batch_account_endpoint=<BATCH-ENDPOINT>
    azure_batch_account_key=<BATCH-ACCOUNT-KEY>
    azure_batch_account=<BATCH-ACCOUNT-NAME>
    vm_image_name=<VM-IMAGE>
    vm_compute_size="Standard_A4_v2"
    no_nodes=2
    log_dir_path="/home/user/azure_batch_upload_logs/"
    azure_blob_downloads_file_path="/home/user/blob_downloads/"
    blob_file_path="/home/user/azure_batch_upload.tsv"
    task_file_path="/home/user/azure_task_list.tsv"
    input_azure_container=batch_upload_local_to_azure_blob(azure_username,azure_password,azure_tenant,azure_storage_account,azure_storage_account_key,log_dir_path)
    azure_blob_df=query_azure_storage(input_azure_container,azure_storage_account,azure_storage_account_key,blob_file_path)
    analyzed_azure_blob_df=estimate_task_completion_time(input_azure_container,azure_storage_account,azure_storage_account_key,azure_blob_df,azure_blob_downloads_file_path)
    pool_info=azure_batch_create_pool(input_azure_container,azure_resource_group,azure_batch_account,azure_batch_account_endpoint,azure_batch_account_key,vm_image_name,no_nodes,vm_compute_size,analyzed_azure_blob_df)
    job_info=azure_batch_create_job(azure_batch_account,azure_batch_account_key,azure_batch_account_endpoint,pool_info)
    azure_batch_create_task(azure_batch_account,azure_batch_account_key,azure_batch_account_endpoint,pool_info,job_info,azure_storage_account,azure_storage_account_key,input_azure_container,analyzed_azure_blob_df)
    task_status=wait_for_tasks_to_complete(azure_batch_account,azure_batch_account_key,azure_batch_account_endpoint,job_info,task_file_path,analyzed_azure_blob_df)
    if task_status=="success":
        azure_delete_job(azure_batch_account,azure_batch_account_key,azure_batch_account_endpoint,job_info)
        azure_delete_pool(azure_batch_account,azure_batch_account_key,azure_batch_account_endpoint,pool_info)
        print("\n\n")
        sys.exit("Job Complete")
    else:
        azure_delete_job(azure_batch_account,azure_batch_account_key,azure_batch_account_endpoint,job_info)
        azure_delete_pool(azure_batch_account,azure_batch_account_key,azure_batch_account_endpoint,pool_info)
        print("\n\n")
        sys.exit("Job Unsuccessful")
Command used to create the zip file:
zip pdf_process_1.zip pdf_processing.py
The Python app that was packaged in the zip file and uploaded to Batch through the client-side script:
(Update 3)
import os
import fitz
import subprocess
import argparse
import time
from tqdm import tqdm
import sentry_sdk
import sys
import datetime

def azure_active_directory_login(azure_username,azure_password,azure_tenant):
    try:
        azure_login_output=subprocess.check_output(["az","login","--service-principal","--username",azure_username,"--password",azure_password,"--tenant",azure_tenant])
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("Invalid Azure Login Credentials")
        sys.exit("Invalid Azure Login Credentials")

def download_from_azure_blob(azure_storage_account,azure_storage_account_key,input_azure_container,file_to_process,pdf_docs_path):
    file_to_download=os.path.join(input_azure_container,file_to_process)
    try:
        subprocess.check_output(["az","storage","blob","download","--container-name",input_azure_container,"--file",os.path.join(pdf_docs_path,file_to_process),"--name",file_to_process,"--account-key",azure_storage_account_key,\
                                 "--account-name",azure_storage_account,"--auth-mode","login"])
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("unable to download the pdf file")
        sys.exit("unable to download the pdf file")

def pdf_to_png(input_folder_path,output_folder_path):
    pdf_files=[x for x in os.listdir(input_folder_path) if x.endswith((".pdf",".PDF"))]
    pdf_files.sort()
    for pdf in tqdm(pdf_files,desc="pdf--->png"):
        doc=fitz.open(os.path.join(input_folder_path,pdf))
        page_count=doc.pageCount
        for f in range(page_count):
            page=doc.loadPage(f)
            pix = page.getPixmap()
            if pdf.endswith(".pdf"):
                png_filename=pdf.split(".pdf")[0]+"___"+"page---"+str(f)+".png"
                pix.writePNG(os.path.join(output_folder_path,png_filename))
            elif pdf.endswith(".PDF"):
                png_filename=pdf.split(".PDF")[0]+"___"+"page---"+str(f)+".png"
                pix.writePNG(os.path.join(output_folder_path,png_filename))

def upload_to_azure_blob(azure_storage_account,azure_storage_account_key,output_azure_container,png_docs_path):
    try:
        subprocess.check_output(["az","storage","blob","upload-batch","--destination",output_azure_container,"--source",png_docs_path,"--account-key",azure_storage_account_key,\
                                 "--account-name",azure_storage_account,"--auth-mode","login"])
    except subprocess.CalledProcessError:
        sentry_sdk.capture_message("Unable to upload file to the container")

if __name__=="__main__":
    #Credentials
    sentry_sdk.init(<SENTRY-CREDENTIALS>)
    azure_username=<AZURE-USERNAME>
    azure_password=<AZURE-PASSWORD>
    azure_tenant=<AZURE-TENANT>
    azure_storage_account=<AZURE-STORAGE-NAME>
    azure_storage_account_key=<AZURE-STORAGE-KEY>
    try:
        parser = argparse.ArgumentParser()
        parser.add_argument("input_azure_container",type=str,help="Location to download files from")
        parser.add_argument("output_azure_container",type=str,help="Location to upload files to")
        parser.add_argument("file_to_process",type=str,help="file link in azure blob storage")
        args = parser.parse_args()
        timestamp = time.time()
        timestamp_humanreadable= datetime.datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d-%H-%M-%S')
        task_working_dir=os.getcwd()
        file_to_process=args.file_to_process
        input_azure_container=args.input_azure_container
        output_azure_container=args.output_azure_container
        pdf_docs_path=os.path.join(task_working_dir,"pdf_files"+"-"+timestamp_humanreadable)
        png_docs_path=os.path.join(task_working_dir,"png_files"+"-"+timestamp_humanreadable)
        os.mkdir(pdf_docs_path)
        os.mkdir(png_docs_path)
    except Exception as e:
        sentry_sdk.capture_exception(e)
    azure_active_directory_login(azure_username,azure_password,azure_tenant)
    download_from_azure_blob(azure_storage_account,azure_storage_account_key,input_azure_container,file_to_process,pdf_docs_path)
    pdf_to_png(pdf_docs_path,png_docs_path)
    upload_to_azure_blob(azure_storage_account,azure_storage_account_key,output_azure_container,png_docs_path)
Update 1:
I have solved the error of the server nodes going into an unusable state. The way I solved this issue is:
1) I did not use the commands I mentioned above to set up a Python 3.6 env on Ubuntu, as Ubuntu 18.04 LTS comes with its own Python 3 environment. Initially I had googled "Install Python 3 on Ubuntu" and had gotten this Python 3.6 installation on Ubuntu link. I avoided this step completely during the server set-up.
All I did was install these packages this time:
sudo apt-get install -y python3-pip
sudo -H pip3 install tqdm==4.19.9
sudo -H pip3 install sentry-sdk==0.4.1
sudo -H pip3 install blobxfer==1.5.0
sudo -H pip3 install pandas==0.22.0
The Azure CLI was installed on the machine using the commands in this link:
Install Azure CLI with apt
2) Created a snapshot of the OS disk, then created the image out of this snapshot, and finally referenced this image in the client-side script.
I am now faced with another issue, where the stderr.txt files on the node tell me that:
python3: can't open file '$AZ_BATCH_APP_PACKAGE_pdfprocessingapp/pdf_processing.py': [Errno 2] No such file or directory
Logging in to the server with the random user, I see that the directory _azbatch is created but there are no contents inside it.
I know for certain that it is in the command line of the azure_batch_create_task() function that things are going haywire, but I am not able to put my finger on it. I have done everything that this doc recommends: Install app packages to Azure Batch Compute Nodes. Please review my client-side Python script and let me know what I am doing wrong!
Edit 3:
The problem looks very similar to the one described in this post:
Unable to pass app path to Tasks
Update 2:
I was able to overcome the file/directory-not-found error using a dirty hack which I am not particularly fond of. I placed the Python app in the home directory of the user which was used to create the VM, and all the directories required for processing were created in the working directory of the task.
I would still like to know how to run the workflow by deploying the app to the node the application package way.
Update 3:
I have updated the client-side code and Python app to reflect the latest changes made. The significant parts are the same.
I will comment on the points that @fparks has raised.
The original Python app that I intend to use in Azure Batch contains many modules, some config files and a quite lengthy requirements.txt file for Python packages; Azure also recommends using a custom image in such cases.
Also, downloading the Python modules per task is a bit irrational in my case, as one task corresponds to one multipage PDF and my expected workload is 25k multipage PDFs.
I used the CLI because the docs for the Python SDK were sparse and hard to follow. The nodes going into an unusable state has been solved. I do agree with you on the blobxfer error.
Answers and a few observations:
It is unclear to me why you need a custom image. You can use a platform image, i.e., Canonical, UbuntuServer, 18.04-LTS, and then just install what you need as part of the start task. Python3.6 can simply be installed via apt in 18.04. You may be prematurely optimizing your workflow by opting for a custom image when in fact using a platform image + start task may be faster and stable.
Your script is in Python, yet you are calling out to the Azure CLI. You may want to consider directly using the Azure Batch Python SDK instead (samples).
When nodes go unusable, you should first examine the node for errors. You should see if the ComputeNodeError field is populated. Additionally, you can try to fetch stdout.txt and stderr.txt files from the startup directory to diagnose what's going on. You can do both of these actions in the Azure Portal or via Batch Explorer. If that doesn't work, you can fetch the compute node service logs and file a support request. However, typically unusable means that your custom image was provisioned incorrectly, you have a virtual network with an NSG misconfigured, or you have an application package that is incorrect.
Your application package consists of a single python file; instead use a resource file. Simply upload the script to Azure Storage blob and reference it in your task as a Resource File using a SAS URL. See the --resource-files argument in az batch task create if using the CLI. Your command to invoke would then simply be python3 pdf_processing.py (assuming you keep the resource file downloading to the task working directory).
If you insist on using an application package, consider using a task application package instead. This will decouple your node startup issues potentially originating from bad application packages to debugging task executions instead.
The blobxfer error is pretty clear. Your locale is not set properly. The easy way to fix this is to set the environment variables for the task. See the --environment-settings argument if using the CLI and set two environment variables LC_ALL=C.UTF-8 and LANG=C.UTF-8 as part of your task.
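To make the resource file and environment settings suggestions concrete, here is a minimal sketch in the same subprocess-over-CLI style the question already uses. The storage URL, SAS token and container names are placeholders, and the exact value formats accepted by --resource-files and --environment-settings are an assumption to verify against az batch task create --help for your CLI version:
import subprocess

# Hypothetical SAS URL to the script uploaded to blob storage (placeholder values).
script_sas_url = "https://<storage-account>.blob.core.windows.net/scripts/pdf_processing.py?<sas-token>"

subprocess.check_output([
    "az", "batch", "task", "create",
    "--job-id", "<job-id>",
    "--task-id", "mytask-0",
    # download pdf_processing.py into the task working directory before the command runs
    "--resource-files", "pdf_processing.py={}".format(script_sas_url),
    # set the task locale so blobxfer does not fail
    "--environment-settings", "LC_ALL=C.UTF-8", "LANG=C.UTF-8",
    "--command-line", "python3 pdf_processing.py <input-container> <output-container> <blob-name>",
])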

Why am I getting: Unable to import module 'handler': No module named 'paramiko'?

I needed to move files with an AWS Lambda from an SFTP server to my AWS account, and then I found this article:
https://aws.amazon.com/blogs/compute/scheduling-ssh-jobs-using-aws-lambda/
It talks about paramiko as an SSH client candidate to move files over SSH.
I then wrote this class wrapper in Python to be used from my serverless handler file:
import paramiko
import sys

class FTPClient(object):
    def __init__(self, hostname, username, password):
        """
        creates ftp connection
        Args:
            hostname (string): endpoint of the ftp server
            username (string): username for logging in on the ftp server
            password (string): password for logging in on the ftp server
        """
        try:
            self._host = hostname
            self._port = 22
            # lets you save results of the download into a log file.
            # paramiko.util.log_to_file("path/to/log/file.txt")
            self._sftpTransport = paramiko.Transport((self._host, self._port))
            self._sftpTransport.connect(username=username, password=password)
            self._sftp = paramiko.SFTPClient.from_transport(self._sftpTransport)
        except:
            print("Unexpected error", sys.exc_info())
            raise

    def get(self, sftpPath):
        """
        downloads a file over the ftp connection
        Args:
            sftpPath = "path/to/file/on/sftp/to/be/downloaded"
        """
        localPath = "/tmp/temp-download.txt"
        self._sftp.get(sftpPath, localPath)
        self._sftp.close()
        tmpfile = open(localPath, 'r')
        return tmpfile.read()

    def close(self):
        self._sftpTransport.close()
On my local machine it works as expected (test.py):
import ftp_client
sftp = ftp_client.FTPClient(
"host",
"myuser",
"password")
file = sftp.get('/testFile.txt')
print(file)
But when I deploy it with serverless and run the handler.py function (same as the test.py above) I get back the error:
Unable to import module 'handler': No module named 'paramiko'
It looks like the deployed function is unable to import paramiko (from the article above it seems like it should be available for the Lambda Python 3 runtime on AWS), isn't it?
If not, what's the best practice for this case? Should I include the library in my local project and package/deploy it to AWS?
A comprehensive guide/tutorial exists at:
https://serverless.com/blog/serverless-python-packaging/
It uses the serverless-python-requirements package as a Serverless node plugin.
A virtual env and a Docker daemon will be required to pack up your serverless project before deploying it to AWS Lambda.
In case you use
custom:
  pythonRequirements:
    zip: true
in your serverless.yml, you have to use this code snippet at the start of your handler:
try:
    import unzip_requirements
except ImportError:
    pass
All the details can be found in the Serverless Python Requirements documentation.
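For reference, a minimal serverless.yml sketch wiring the plugin in might look like the following; the service and handler names are placeholders, and it assumes the plugin has been added with npm install --save-dev serverless-python-requirements:
service: sftp-mover            # placeholder service name

provider:
  name: aws
  runtime: python3.6

plugins:
  - serverless-python-requirements

custom:
  pythonRequirements:
    zip: true                  # requires the unzip_requirements snippet shown above

functions:
  mover:
    handler: handler.main      # placeholder handler module/function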
You have to create a virtualenv, install your dependencies and then zip all files under site-packages/ together with your handler:
sudo pip install virtualenv
virtualenv -p python3 myvirtualenv
source myvirtualenv/bin/activate
pip install paramiko
cp handler.py myvirtualenv/lib/python3.6/site-packages/
cd myvirtualenv/lib/python3.6/site-packages/
zip -r package.zip .
then upload package.zip to Lambda.
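If you want to do that last upload step from the command line rather than the console, something like the following should work (the function name is a placeholder):
aws lambda update-function-code --function-name my-sftp-function --zip-file fileb://package.zip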
You have to provide all dependencies that are not installed in AWS's Python runtime.
Take a look at Step 7 in the tutorial. It looks like he is adding the dependencies from the virtual environment to the zip file, so I'd expect your ZIP file to contain the following:
your worker_function.py at the top level
a folder paramiko with the files installed in the virtual env
Please let me know if this helps.
I tried various blogs and guides like:
web scraping with lambda
AWS Layers for Pandas
spending hours trying things out, facing size issues and being unable to import modules, etc.
I nearly reached the end (that is, invoking my handler function locally), but even though my function was fully deployed correctly and could be invoked locally with no problems, it was impossible to invoke it on AWS.
The most comprehensive and by far the best guide or example that is actually working is the one mentioned above by @koalaok. Thanks buddy!
actual link

How to import terraform policy attachment?

Our main goal is to move some resources to a different Terraform state file. I am trying to import a policy attachment of a resource; however, it seems like importing a policy attachment is not supported, and I am getting an error.
What is the alternative if it is not supported?
I am trying to import this policy:
+ aws_iam_role_policy_attachment.gitlab_as_attach
id: <computed>
policy_arn: "arn:aws:iam::xxxxxxxxxxxx:policy/gitlab_as_policy"
role: "gitlab_prod"
error:
terraform import aws_iam_role_policy_attachment.gitlab_as_attach arn:aws:iam::xxxxxxxxx:policy/gitlab_as_policy
aws_iam_role_policy_attachment.gitlab_as_attach: Importing from ID "arn:aws:iam::xxxxxxxx:policy/gitlab_as_policy"...
Error importing: 1 error(s) occurred:
* aws_iam_role_policy_attachment.gitlab_as_attach (import id: arn:aws:iam::xxxxxxxxxx:policy/gitlab_as_policy): import aws_iam_role_policy_attachment.gitlab_as_attach (id: arn:aws:iam::xxxxxxxxxx:policy/gitlab_as_policy): resource aws_iam_role_policy_attachment doesn't support import
terraform version:
Terraform v0.11.0
+ provider.aws v1.5.0
This issue is fixed in 1.37.0 of the provider.aws plugin. Do upgrade the plugins and modules related to Terraform.
To upgrade the plugins, run the command below:
terraform init -upgrade
To upgrade the modules, run the command below:
terraform get -update
For further information, look at the defects and enhancements related to Terraform:
https://github.com/terraform-providers/terraform-provider-aws/blob/master/CHANGELOG.md#1370-september-19-2018
I ran an import for aws_iam_role_policy_attachment today and it was successful.
terraform import -provider=aws.{example} aws_iam_role_policy_attachment.role-attach-1 {test-role}/arn:aws:iam::aws:policy/ReadOnlyAccess
aws_iam_role_policy_attachment.role-attach-1: Importing from ID "{test-role}/arn:aws:iam::aws:policy/ReadOnlyAccess"...
aws_iam_role_policy_attachment.role-attach-1: Import complete!
Imported aws_iam_role_policy_attachment (ID: {test-role}-arn:aws:iam::aws:policy/ReadOnlyAccess)
aws_iam_role_policy_attachment.role-attach-1: Refreshing state... (ID: {test-role}-arn:aws:iam::aws:policy/ReadOnlyAccess)
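Note that the import ID is the role name and the policy ARN joined with a slash. Applied to the resource in the question, the command would look something like this (account ID redacted as in the question):
terraform import aws_iam_role_policy_attachment.gitlab_as_attach gitlab_prod/arn:aws:iam::xxxxxxxxxxxx:policy/gitlab_as_policy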
I hope this helps.
EDIT: a new PR was written and merged, and a new version of the AWS Terraform provider (1.37.0) was released adding this feature. This answer is now not really valid anymore; see Momooo's answer for how to do this.
Unfortunately this has been an open issue in the AWS Terraform provider for a while, and the PR that would fix it was abandoned. You could try to detach the policy, refresh terraform, perform the import, then re-attach after the import.
Based on @Momooo's response, I was able to import a user policy attachment like this:
terraform import aws_iam_user_policy_attachment.TERRAFORM_RESOURCE_NAME USER_NAME/POLICY_ARN

Puppet not recognising my module

I am trying to create a custom provider for package, but for some reason I keep getting
err: Could not run Puppet configuration client: Parameter provider
failed: Invalid package provider 'piprs' at
/usr/local/src/ops/services/puppet/modules/test/manifests/init.pp:5
I have added pluginsync=true in puppet.conf on both the client and the server. I have created the following .rb file at module/test/lib/puppet/provider/package/piprs.rb. I am basically trying to create a custom provider for the package resource type.
#require 'puppet/provider/package'
Puppet::Type.type(:package).provide(:piprs,
  :parent => ::Puppet::Provider::Package) do

  commands : pip => "/usr/local/bin/pip"

  desc "Python packages via `pip`."

  def create
    pip "freeze"
  end

  def destroy
  end

  def exists?
  end
end
In puppet.conf, there is the following source attribute:
pluginsource = puppet://puppet/plugins
I am not sure what it is. If you need any more details, please do post a comment.
First things first - you do realize there is already a Python pip provider in core?
https://github.com/puppetlabs/puppet/blob/master/lib/puppet/provider/package/pip.rb
If that isn't what you want, then let's move on ...
For starters - try your module without a Puppet master - this is going to be better for development anyway. You need to make sure Ruby can find the library path:
export RUBYLIB=<path_to_module>/lib
Then, try writing a small test in a .pp file:
package { "mypackage": provider => "piprs" }
And run it locally:
puppet apply mytest.pp
This will rule out a code bug in your provider versus a plugin sync issue.
I notice there is a space between the colon and the command - that isn't your problem, is it?
commands : pip => "/usr/local/bin/pip"
If you can get this working without a puppetmaster, your problem is sync related.
There are a couple of things that can go wrong - make sure the file is sync'd properly on the client:
ls /var/lib/puppet/lib/puppet/provider/package
You should see the piprs.rb file there. If it is, you may need to make sure your libdir is set correctly:
puppet --configprint libdir
This should point to /var/lib/puppet/lib in most cases.
