How to export metrics from a containerized component in kubeflow pipelines 0.2.5 - python-3.x

I have a pipeline made up of three containerized components. In the last component I write the metrics I want to a file named /mlpipeline-metrics.json, just as it's explained here.
This is the Python code I used.
import json

metrics = {
    'metrics': [
        {
            'name': 'accuracy',
            'numberValue': accuracy,
            'format': 'PERCENTAGE',
        },
        {
            'name': 'average-f1-score',
            'numberValue': average_f1_score,
            'format': 'PERCENTAGE'
        },
    ]
}

with open('/mlpipeline-metrics.json', 'w') as f:
    json.dump(metrics, f)
I also tried writing the file with the following code, just like in the example linked above.
with file_io.FileIO('/mlpipeline-metrics.json', 'w') as f:
    json.dump(metrics, f)
The pipeline runs just fine without any errors, but the metrics do not show up in the front-end UI.
I'm thinking it has something to do with the following code block.
def metric_op(accuracy, f1_scores):
    return dsl.ContainerOp(
        name='visualize_metrics',
        image='gcr.io/mgcp-1190085-asml-lpd-dev/kfp/jonas/container_tests/image_metric_comp',
        arguments=[
            '--accuracy', accuracy,
            '--f1_scores', f1_scores,
        ]
    )
This is the code I use to create a ContainerOp from the containerized component. Notice I have not specified any file_outputs.
In other ContainerOps I have to specify file_outputs to be able to pass variables to the next steps in the pipeline. Should I do something similar here to map /mlpipeline-metrics.json onto something so that Kubeflow Pipelines detects it?
I'm using a managed AI Platform Pipelines deployment running Kubeflow Pipelines 0.2.5 with Python 3.6.8.
Any help is appreciated.

So after some trial and error I finally came to a solution. And I'm happy to say that my intuition was right. It did have something to do with the file_outputs I didn't specify.
To be able to export your metrics you will have to set file_outputs as follows.
def metric_op(accuracy, f1_scores):
    return dsl.ContainerOp(
        name='visualize_metrics',
        image='gcr.io/mgcp-1190085-asml-lpd-dev/kfp/jonas/container_tests/image_metric_comp',
        arguments=[
            '--accuracy', accuracy,
            '--f1_scores', f1_scores,
        ],
        file_outputs={
            'mlpipeline-metrics': '/mlpipeline-metrics.json'
        }
    )
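For context, here is a minimal sketch of how such an op might be wired into a pipeline and compiled with the KFP SDK. The pipeline name and the upstream train_op with its outputs are placeholders, not taken from the original post.

from kfp import dsl, compiler

@dsl.pipeline(name='metrics-demo', description='Sketch: wiring the metrics op into a pipeline')
def metrics_pipeline():
    train = train_op()  # 'train_op' is a hypothetical upstream step producing accuracy and f1 scores
    metric_op(train.outputs['accuracy'], train.outputs['f1_scores'])

if __name__ == '__main__':
    # compile to an archive that can be uploaded to the Kubeflow Pipelines UI
    compiler.Compiler().compile(metrics_pipeline, 'metrics_pipeline.tar.gz')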

Here is another way of producing metrics when you use the Python function-based method:
# Define your component's code as a standalone Python function ======================
from typing import NamedTuple  # needed for the return type annotation

def add(a: float, b: float) -> NamedTuple(
        'AddOutput',
        [
            ('sum', float),
            ('mlpipeline_metrics', 'Metrics')
        ]):
    '''Calculates the sum of two arguments'''
    sum = a + b

    metrics = {
        'metrics': [  # the top-level key must be 'metrics' for KFP to pick it up
            {
                'name': 'sum',
                'numberValue': float(sum),
            }
        ]
    }

    print("Add Result: ", sum)  # this will show up in the 'main-logs' of each task

    from collections import namedtuple
    addOutput = namedtuple(
        'AddOutput',
        ['sum', 'mlpipeline_metrics'])
    return addOutput(sum, metrics)  # the metrics will be uploaded to the cloud
Note: I am just using a basic function here, not your function.
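As a rough sketch (assuming the KFP SDK's func_to_container_op helper, which the original answer does not mention), such a function can be turned into a pipeline component like this; the pipeline name and defaults are illustrative:

from kfp import dsl
from kfp.components import func_to_container_op

add_op = func_to_container_op(add)  # wrap the Python function as a pipeline component

@dsl.pipeline(name='add-demo')
def add_pipeline(a: float = 1.0, b: float = 2.0):
    add_op(a, b)  # the 'mlpipeline_metrics' output is picked up by the UI automatically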

Related

Run databricks job from notebook

I want to know whether it is possible to run a Databricks job from a notebook using code, and how to do it.
I have a job with multiple tasks and many contributors, and a job created to execute it all. Now we want to run the job from a notebook to test new features without creating a new task in the job, and also to run the job multiple times in a loop, for example:
for i in [1, 2, 3]:
    run job with parameter i
Regards
What you need to do is the following:
Install the databricksapi package: %pip install databricksapi==1.8.1
Create your job and return an output. You can do that by exiting the notebook like this:
import json
dbutils.notebook.exit(json.dumps({"result": f"{_result}"}))
If you want to pass a dataframe, you have to pass it as a JSON dump too; there is official documentation about that from Databricks, check it out.
Get the job id, you will need it later. You can get it from the job details in Databricks.
In the executor notebook you can use the following code.
import json
import time
from databricksapi import Jobs  # assuming the databricksapi==1.8.1 package layout

def run_ks_job_and_return_output(params):
    context = json.loads(dbutils.notebook.entry_point.getDbutils().notebook().getContext().toJson())
    # extract the workspace URL and API token from the notebook context
    url = context['extraContext']['api_url']
    token = context['extraContext']['api_token']

    jobs_instance = Jobs.Jobs(url, token)  # initialize a jobs_instance
    runs_job_id = jobs_instance.runJob(****************, 'notebook', params)  # **** is the job id

    # poll the completed runs until our run shows up
    run_is_not_completed = True
    while run_is_not_completed:
        current_run = [run for run in jobs_instance.runsList('completed')['runs']
                       if run['run_id'] == runs_job_id['run_id']
                       and run['number_in_job'] == runs_job_id['number_in_job']]
        if len(current_run) == 0:
            time.sleep(30)
        else:
            run_is_not_completed = False
            current_run = current_run[0]

    print(f"Result state: {current_run['state']['result_state']}, "
          f"You can check the resulting output in the following link: {current_run['run_page_url']}")

    note_output = jobs_instance.runsGetOutput(runs_job_id['run_id'])['notebook_output']
    return note_output

run_ks_job_and_return_output({'parm1': 'george',
                              'variable': "values1"})
If you want to run the job many times in parallel you can do the following. (First be sure that you have increased the maximum concurrent runs in the job settings.)
from multiprocessing.pool import ThreadPool

pool = ThreadPool(1000)
results = pool.map(lambda j: run_ks_job_and_return_output({'table': 'george',
                                                           'variable': "values1",
                                                           'j': j}),
                   [str(x) for x in range(2, len(snapshots_list))])  # snapshots_list is defined elsewhere in the author's notebook
There is also the possibility to save the whole HTML output, but maybe you are not interested in that. In any case, I will answer that in another post on Stack Overflow.
Hope it helps.
You can use the following steps:
Note-01:
dbutils.widgets.text("foo", "fooDefault", "fooEmptyLabel")
dbutils.widgets.text("foo2", "foo2Default", "foo2EmptyLabel")
result = dbutils.widgets.get("foo")+"-"+dbutils.widgets.get("foo2")
def display():
print("Function Display: "+result)
dbutils.notebook.exit(result)
Note-02:
thislist = ["apple", "banana", "cherry"]
for x in thislist:
    dbutils.notebook.run("Note-01 path", 60, {"foo": x, "foo2": 'Azure'})

Azure Stream Analytics: ML Service function call in cloud job results in no output events

I've got a problem with an Azure Stream Analytics (ASA) job that should call an Azure ML Service function to score the provided input data.
The query was developed and tested in Visual Studio (VS) 2019 with the "Azure Data Lake and Stream Analytics Tools" extension.
As input the job uses an Azure IoT Hub, and as output the VS local output for testing purposes (and later also Blob storage).
Within this environment everything works fine: the call to the ML Service function is successful and it returns the desired response.
Using the same query, user-defined functions and aggregates as in VS in the cloud job, no output events are generated (with neither Blob storage nor Power BI as output).
In the ML web service it can be seen that ASA successfully calls the function, but somehow it does not return any response data.
Deleting the ML function call from the query results in a successful run of the job with output events.
For the deployment of the ML web service I tried the following (working in VS, no output in the cloud):
ACI (1 CPU, 1 GB RAM)
AKS dev/test (Standard_B2s VM)
AKS production (Standard_D3_v2 VM)
The inference script function schema:
input: array
output: record
Inference script input schema looks like:
@input_schema('data', NumpyParameterType(input_sample, enforce_shape=False))
@output_schema(NumpyParameterType(output_sample))  # other parameter type for record caused an error in ASA
def run(data):
    response = {'score1': 0,
                'score2': 0,
                'score3': 0,
                'score4': 0,
                'score5': 0,
                'highest_score': None}
And the return value:
return [response]
The ASA job subquery with ML function call:
with raw_scores as (
    select
        time, udf.HMMscore(udf.numpyfySeq(Sequence)) as score
    from Sequence
)
and the UDF "numpyfySeq" looks like:
// creates an N x 18 array
function numpyfySeq(Sequence) {
    'use strict';
    var transpose = m => m[0].map((x, i) => m.map(x => x[i]));
    var array = [];
    for (var feature in Sequence) {
        if (feature != "time") {
            array.push(Sequence[feature]);
        }
    }
    return transpose(array);
}
"Sequence" is a subquery that aggregates the data into sequences (arrays) with an user-defined aggregate.
In VS the data comes from the IoT-Hub (cloud input selected).
The "function signature" is recognized correctly in the portal as seen in the image: Function signature
I hope the provided information is sufficient and you can help me.
Edit:
The authentication for the Azure ML webservice is key-based.
In ASA, when you select an "Azure ML Service" function, it automatically detects and uses the keys of the deployed ML model within the subscription and ML workspace.
Deployment code used (in this example for ACI, but looks nearly the same for AKS deployment):
from azureml.core import Workspace
from azureml.core.model import InferenceConfig, Model
from azureml.core.environment import Environment
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()

env = Environment(name='scoring_env')
deps = CondaDependencies(conda_dependencies_file_path='./deps')
env.python.conda_dependencies = deps

inference_config = InferenceConfig(source_directory='./prediction/',
                                   entry_script='score.py',
                                   environment=env)
deployment_config = AciWebservice.deploy_configuration(auth_enabled=True, cpu_cores=1,
                                                       memory_gb=1)

model = Model(ws, 'HMM')
service = Model.deploy(ws, 'hmm-scoring', [model],
                       inference_config,
                       deployment_config,
                       overwrite=True)
service.wait_for_deployment(show_output=True)
with the following conda dependencies:
name: project_environment
dependencies:
  # The python interpreter version.
  # Currently Azure ML only supports 3.5.2 and later.
  - python=3.7.5
  - pip:
    - sklearn
    - azureml-core
    - azureml-defaults
    - inference-schema[numpy-support]
    - hmmlearn
    - numpy
    - pip
channels:
  - anaconda
  - conda-forge
The code used in score.py is just a regular scoring operation with the loaded models, plus some formatting, like so:
score1 = model1.score(data)
score2 = model2.score(data)
score3 = model3.score(data)
# Same scoring with model4 and model5
# scaling of the scores to a defined interval and determination of model that delivered highest score
response['score1'] = score1
response['score2'] = score2
# and so on
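For reference, a minimal sketch of how the described score.py pieces might fit together. The model name, the input_sample/output_sample shapes, and the scaling step are placeholders and assumptions, not taken from the original post:

import joblib
import numpy as np
from azureml.core.model import Model
from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType

input_sample = np.zeros((10, 18))            # placeholder: N x 18 sequence, shape not enforced
output_sample = np.array([{'score1': 0.0}])  # placeholder output record

def init():
    global model1  # load the registered model(s) once per container start
    model1 = joblib.load(Model.get_model_path('HMM'))

@input_schema('data', NumpyParameterType(input_sample, enforce_shape=False))
@output_schema(NumpyParameterType(output_sample))
def run(data):
    response = {'score1': 0, 'score2': 0, 'score3': 0,
                'score4': 0, 'score5': 0, 'highest_score': None}
    response['score1'] = float(model1.score(data))  # hmmlearn models expose .score()
    # ... score with the remaining models, scale, and pick the highest score ...
    return [response]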

HOW-TO push/pull to/from Airflow X_COM with spark_task and pythonOperator?

I have a DAG that creates a spark task and executes a certain script located in a particular directory. There are two tasks like this. Both of these tasks need to receive the same ID generated in the DAG file before the tasks are executed. If I simply store and pass a value solely via the Python script, the IDs are different, which is normal. So I am trying to push the value to XCom with a PythonOperator task.
I need to pull the values from XCom and update a 'params' dictionary with that information in order to be able to pass it to my spark task.
Could you please help me? I am hitting my head against the wall and just can't figure it out.
I tried the following:
Create a function just to retrieve the data from XCom and then return it. I assigned this function to the params variable, but it doesn't work; I cannot return from a Python function inside the DAG which uses the xcom_pull function.
I also tried assigning an empty list, appending to it from the Python function, and then providing the final list directly to my spark task. That doesn't work either. Please help!
Thanks a lot in advance for any help related to this. I will need this value to be the same for this and multiple other spark tasks that may come into the same DAG file.
DAG FILE
import..  # (imports truncated in the original post)
from common.base_tasks import spark_task

default_args = {
    'owner': 'airflow',
    'start_date': days_ago(1),
    'email_on_failure': True,
    'email_on_retry': False,
}

dag = DAG(
    dag_id='dag',
    default_args=default_args,
    schedule_interval=timedelta(days=1)
)

log_level = "info"

id_info = {
    "id": str(uuid.uuid1()),
    "start_time": str(datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S,%f'))
}

# this stores the value to XCom successfully
def store_id(**kwargs):
    kwargs['ti'].xcom_push(key='id_info', value=id_info)

store_trace_task = PythonOperator(task_id='store_id', provide_context=True,
                                  python_callable=store_id, dag=dag)

extra_config = {'log_level': log_level}
config = '''{"config":"data"}'''
params = {'config': config, 'extra_config': json.dumps(extra_config)}

# ---------- this doesn't work ----------
pars = []
pars.append(params)

def task1_pull_params(**kwargs):
    tracing = kwargs['ti'].xcom_pull(task_ids='store_trace_task')
    pars.append(tracing)
    # params = {
    #     'parsed_config': parsed_config,
    #     'extra_config': json.dumps(extra_config),
    #     'trace_data': tracing
    # }
    # return params  # return pushes to XCom, xcom_push does the same

task1_pull_params = PythonOperator(task_id='task1_pull_params', provide_context=True,
                                   python_callable=task1_pull_params, dag=dag)

store_trace_task >> task1_pull_params

# returning a value from the function and assigning the result to the params variable below also doesn't work
# params = task1_pull_params

# this prints only what's outside of the function, i.e. params
print("===== pars =====> ", pars)

pipeline_task1 = spark_task(
    name='task1',
    script='app.py',
    params=params,
    dag=dag
)

task1_pull_params >> pipeline_task1
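Not from the original post, but a sketch of the pattern that is usually needed here: top-level DAG code runs when the file is parsed, while XComs only exist at run time, so the pull has to happen inside a task callable or via a Jinja template (the templated route assumes the field spark_task puts the params into is templated, which is not known from the post). Reusing config and extra_config from the DAG file above:

# Sketch only: pull the XCom at run time, inside the task that needs it.
def build_params(**kwargs):
    tracing = kwargs['ti'].xcom_pull(task_ids='store_id', key='id_info')
    return {'config': config,
            'extra_config': json.dumps(extra_config),
            'trace_data': tracing}  # the return value is itself pushed to XCom

# Alternative: if the downstream operator's field is templated, reference the
# value lazily with Jinja instead of resolving it at parse time.
templated_params = {
    'config': config,
    'extra_config': json.dumps(extra_config),
    'trace_data': "{{ ti.xcom_pull(task_ids='store_id', key='id_info') }}",
}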

Hazelcast Jet 0.6.1 - Dag Definition

Hazelcast Jet prints the DAG definition on the console once it is started. This is the Pipeline definition converted to a DAG.
Here is a Pipeline definition.
private Pipeline buildPipeline() {
    Pipeline p = Pipeline.create();
    p.drawFrom(Sources.<String, Record>remoteMapJournal("record", getClientConfig(), START_FROM_OLDEST))
     .addTimestamps((v) -> getTimeStamp(v), 3000)
     .peek()
     .groupingKey((v) -> Tuple2.tuple2(getUserID(v), getTranType(v)))
     .window(WindowDefinition.sliding(SLIDING_WINDOW_LENGTH_MILLIS, SLIDE_STEP_MILLIS))
     .aggregate(counting())
     .map((v) -> getMapKey(v))
     .drainTo(Sinks.remoteMap("Test", getClientConfig()));
    return p;
}
and here is the DAG definition printed on the console.
.vertex("remoteMapJournalSource(record)").localParallelism(1)
.vertex("sliding-window-step1").localParallelism(4)
.vertex("sliding-window-step2").localParallelism(4)
.vertex("map").localParallelism(4)
.vertex("remoteMapSink(Test)").localParallelism(1)
.edge(between("remoteMapJournalSource(record)", "sliding-window-step1").partitioned(?))
.edge(between("sliding-window-step1", "sliding-window-step2").partitioned(?).distributed())
.edge(between("sliding-window-step2", "map"))
.edge(between("map", "remoteMapSink(Test)"))
Is there any way to get the DAG definition with all the details like sliding window details, aggregation APIs etc ?
No, it's technically impossible. If you write a lambda (for example for a key extractor), there's no way to display the code that defined the lambda. The only way for you to get more information is to embed that information into the vertex name.
In Jet 0.7, this printout will be changed to the graphviz format so that you can copy-paste it to a tool and see the DAG as an image.

Python unknown number of commandline arguments in boto3

I am trying to add tags based on command-line arguments passed to a Python script, something like below:
./snapshot-create.py --id abcd --key1 Env --value1 Test
The script is like below:
client = boto3.client('ec2')

response = client.create_tags(
    Resources=[
        ID,
    ],
    Tags=[
        {
            'Key': 'key1',
            'Value': 'value1'
        },
    ]
)
I want to use --key1 and --value1 as tags as above, but the problem is that there could be more than one tag that needs to be added, like:
./snapshot-create.py --id abcd --key1 Env --value1 Test -key2 Loca --value2 US -key1 Size --value1 small ...
How would I use those key-value pairs if the number of arguments is not fixed?
I don't mind using a function or any other approach than what I came up with.
One option would be loading a JSON string as a dictionary and iterating over it when creating the tags.
For example, consider this invocation:
$ my_script.py --tags '{"tag1": "value1", "tag2": "value2"}' --id i-1234567890 i-0987654321
and this code snippet:
import json
import boto3
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-t', '--tags', type=str)
parser.add_argument('-i', '--id', nargs='+')
args = parser.parse_args()

client = boto3.client('ec2')

def create_tags(key, value, resources, c):
    c.create_tags(
        Resources=resources,
        Tags=[
            {
                'Key': key,
                'Value': value
            },
        ]
    )

my_tags = json.loads(args.tags)  # {'tag1': 'value1', 'tag2': 'value2'}
resources = args.id              # ['i-1234567890', 'i-0987654321']

for k, v in my_tags.items():
    create_tags(k, v, resources, client)
This should cause instances i-1234567890 & i-0987654321 to be tagged with both tags tag1 and tag2 described in --tags above.
If you require a more dynamic interface for resources as well, consider adding it to the json as such:
{ 'instance_id': [{'tag_key': 'tag_value'} ... ] ... }
You can then take a single argument --tags which will contain a mapping of resources to tags, instead of the above example where resources are statically mapped to the tags.
Pretty sure there are better, more Pythonic solutions than this, though; this is just one viable approach. One such alternative is sketched below.
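For instance, a rough sketch (not from the original answer) using a repeatable --tag KEY VALUE option with argparse instead of a JSON blob; the option names here are illustrative:

import argparse
import boto3

parser = argparse.ArgumentParser()
# each --tag takes exactly two values and can be repeated, e.g. --tag Env Test --tag Size small
parser.add_argument('--tag', action='append', nargs=2, metavar=('KEY', 'VALUE'), default=[])
parser.add_argument('--id', nargs='+', required=True)
args = parser.parse_args()

client = boto3.client('ec2')
client.create_tags(
    Resources=args.id,
    Tags=[{'Key': k, 'Value': v} for k, v in args.tag]
)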
