MlflowException: Could not find a registered artifact repository for: mlflow-artifacts - mlflow

Trying to use MLflow, PyCaret and DagsHub. I followed this tutorial, and the experiments appear on the MLflow server. However, something around the artifacts URI seems to be missing. When I run my PyCaret experiment, I get
from pycaret.classification import *
s = setup(data, target = 'Churn', session_id = 123, ignore_features = ['customerID'], log_experiment = True, experiment_name = 'churn1', silent=True)
# model training and selection
best = compare_models()
MlflowException: Could not find a registered artifact repository for: mlflow-artifacts:/.../.../artifacts. Currently registered schemes are: ['', 'file', 's3', 'gs', 'wasbs', 'ftp', 'sftp', 'dbfs', 'hdfs', 'viewfs', 'runs', 'models']
And so far, I did not find any information on how to resolve this.
Versions
pycaret: '2.3.4'
MLflow: '1.20.2'
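The error indicates that the installed MLflow client has no artifact repository registered for the mlflow-artifacts: scheme (it is missing from the listed schemes). A minimal sketch of what should register it, assuming the scheme ships with MLflow releases newer than 1.20.2 and using a placeholder DagsHub tracking URI:

# Upgrade the client so the mlflow-artifacts scheme is registered:
#   pip install --upgrade mlflow

import mlflow

# Placeholder DagsHub tracking URI; substitute your own user/repo
mlflow.set_tracking_uri("https://dagshub.com/<user>/<repo>.mlflow")
print(mlflow.get_tracking_uri())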


Append Existing Custom Node Group in Blender

I tried to append my blend file and import the node group into my new environment. I tried different methods of appending my own custom node group, but it's still not working. Here's my boilerplate.
class SHADER(Operator):
    bl_idname = "material.append_shader_nodes"
    bl_label = "Add Shader"
    bl_options = {'REGISTER', 'UNDO'}

    def execute(self, context):
        # Importing the blend file (working)
        import_from_library('shader')
        bpy.ops.object.material_slot_add()

        # Create a new material
        npr_material = bpy.data.materials.new(name='SHADER')
        npr_material.use_nodes = True

        # Remove the default shader
        npr_material.node_tree.nodes.remove(npr_material.node_tree.nodes.get('Principled BSDF'))
        material_output = npr_material.node_tree.nodes.get('Material Output')

        # Problem: import my custom node group from my other blend file
        SHADER = bpy.data.node_groups['NPREEVEE']

        # Link shader to material
        npr_material.node_tree.links.new(material_output.inputs[0], SHADER.outputs[0])

        # Set the active material to the new material
        bpy.context.object.active_material = npr_material
        return {'FINISHED'}
It seems like it didn't import my node group, but when I try to manually add my custom node group, it shows up in the material properties. I'm not totally familiar with this package. Is this a bug, or is there something I missed while creating my node group?
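In case it's useful, here is a minimal sketch of how a node group is typically appended from another .blend file with bpy.data.libraries.load and then wired into a material; the file path is a placeholder and 'NPREEVEE' is the group name from the snippet above:

import bpy

# Placeholder path to the library file that contains the node group
blend_path = "/path/to/shader_library.blend"

# Append (link=False) only the node group we need
with bpy.data.libraries.load(blend_path, link=False) as (data_from, data_to):
    data_to.node_groups = [name for name in data_from.node_groups if name == 'NPREEVEE']

npr_group = bpy.data.node_groups.get('NPREEVEE')

# A node group datablock cannot be linked directly to the material output;
# it has to be wrapped in a ShaderNodeGroup node inside the material's node tree.
material = bpy.context.object.active_material
group_node = material.node_tree.nodes.new('ShaderNodeGroup')
group_node.node_tree = npr_group
material_output = material.node_tree.nodes.get('Material Output')
material.node_tree.links.new(material_output.inputs[0], group_node.outputs[0])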

400 Caller's project doesn't match parent project

I have this block of code that basically translates text from one language to another using the cloud translate API. The problem is that this code always throws the error: "Caller's project doesn't match parent project". What could be the problem?
translation_separator = "translated_text: "
language_separator = "detected_language_code: "
translate_client = translate.TranslationServiceClient()

# parent = translate_client.location_path(
#     self.translate_project_id, self.translate_location
# )

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = (
    os.getcwd() + "/translator_credentials.json"
)

# Text can also be a sequence of strings, in which case this method
# will return a sequence of results for each text.
try:
    result = str(
        translate_client.translate_text(
            request={
                "contents": [text],
                "target_language_code": self.target_language_code,
                "parent": f'projects/{self.translate_project_id}/'
                          f'locations/{self.translate_location}',
                "model": self.translate_model
            }
        )
    )
    print(result)
except Exception as e:
    print("error here>>>>>", e)
Your issue seems to be related to the authentication method that you are using in your application. Please follow the guide for authentication methods with the Translate API. If you are trying to pass the credentials in code, you can explicitly point to your service account file like this:
def explicit():
    from google.cloud import storage

    # Explicitly use service account credentials by specifying the private key
    # file.
    storage_client = storage.Client.from_service_account_json(
        'service_account.json')
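The same approach works for the Translation client itself. A minimal sketch (the key file name, project and location are placeholders, and the project in parent must be the one the service account belongs to):

from google.cloud import translate

# Build the client directly from the service account key file (placeholder name)
client = translate.TranslationServiceClient.from_service_account_file(
    "translator_credentials.json")

response = client.translate_text(
    request={
        "contents": ["Hello, world"],
        "target_language_code": "es",
        # The project here must match the project that owns the service account
        "parent": "projects/my-project-id/locations/global",
    }
)
print(response.translations[0].translated_text)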
Also, there is a codelab for getting started with the Translation API in Python; it's a great step-by-step guide for running the Translate API with Python.
If the issue persists, you can open an issue in Google's Public Issue Tracker to get support from Google.

Azure Stream Analytics: ML Service function call in cloud job results in no output events

I've got a problem with an Azure Stream Analytics (ASA) job that should call an Azure ML Service function to score the provided input data.
The query was developed and tested in Visual Studio (VS) 2019 with the "Azure Data Lake and Stream Analytics Tools" extension.
As input the job uses an Azure IoT Hub, and as output the VS local output for testing purposes (and later also Blob storage).
Within this environment everything works fine: the call to the ML Service function is successful and it returns the desired response.
With the same query, user-defined functions and aggregates in the cloud job, however, no output events are generated (with neither Blob storage nor Power BI as output).
In the ML webservice it can be seen that ASA successfully calls the function, but somehow no response data comes back.
Deleting the ML function call from the query results in a successful run of the job with output events.
For the deployment of the ML Webservice I tried the following (working for VS, no output in cloud):
ACI (1 CPU, 1 GB RAM)
AKS dev/test (Standard_B2s VM)
AKS production (Standard_D3_v2 VM)
The inference script function schema:
input: array
output: record
Inference script input schema looks like:
@input_schema('data', NumpyParameterType(input_sample, enforce_shape=False))
@output_schema(NumpyParameterType(output_sample))  # other parameter type for record caused error in ASA
def run(data):
    response = {'score1': 0,
                'score2': 0,
                'score3': 0,
                'score4': 0,
                'score5': 0,
                'highest_score': None}
And the return value:
return [response]
The ASA job subquery with ML function call:
with raw_scores as (
    select
        time, udf.HMMscore(udf.numpyfySeq(Sequence)) as score
    from Sequence
)
and the UDF "numpyfySeq" looks like this:
// creates a N x 18 size array
function numpyfySeq(Sequence) {
    'use strict';
    var transpose = m => m[0].map((x, i) => m.map(x => x[i]));
    var array = [];
    for (var feature in Sequence) {
        if (feature != "time") {
            array.push(Sequence[feature]);
        }
    }
    return transpose(array);
}
"Sequence" is a subquery that aggregates the data into sequences (arrays) with an user-defined aggregate.
In VS the data comes from the IoT-Hub (cloud input selected).
The "function signature" is recognized correctly in the portal as seen in the image: Function signature
I hope the provided information is sufficient and you can help me.
Edit:
The authentication for the Azure ML webservice is key-based.
In ASA, when selecting to use an "Azure ML Service" function, it will automatically detect and use the keys from the deployed ML model within the subscription and ML workspace.
Deployment code used (in this example for ACI, but looks nearly the same for AKS deployment):
from azureml.core import Workspace
from azureml.core.model import InferenceConfig, Model
from azureml.core.environment import Environment
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()

# Environment with the conda dependencies for the scoring script
env = Environment(name='scoring_env')
deps = CondaDependencies(conda_dependencies_file_path='./deps')
env.python.conda_dependencies = deps

inference_config = InferenceConfig(source_directory='./prediction/',
                                   entry_script='score.py',
                                   environment=env)

deployment_config = AciWebservice.deploy_configuration(auth_enabled=True, cpu_cores=1,
                                                       memory_gb=1)

# Deploy the registered model as a webservice
model = Model(ws, 'HMM')
service = Model.deploy(ws, 'hmm-scoring', [model],
                       inference_config,
                       deployment_config,
                       overwrite=True)
service.wait_for_deployment(show_output=True)
with conda_dependencies:
name: project_environment
dependencies:
  # The python interpreter version.
  # Currently Azure ML only supports 3.5.2 and later.
  - python=3.7.5
  - pip:
    - sklearn
    - azureml-core
    - azureml-defaults
    - inference-schema[numpy-support]
    - hmmlearn
    - numpy
  - pip
channels:
  - anaconda
  - conda-forge
The code used in score.py is just a regular scoring operation with the loaded models, plus formatting, like so:
score1 = model1.score(data)
score2 = model2.score(data)
score3 = model3.score(data)
# Same scoring with model4 and model5
# scaling of the scores to a defined interval and determination of model that delivered highest score
response['score1'] = score1
response['score2'] = score2
# and so on
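For completeness, a sketch of how the pieces above could fit together in score.py; the model loading in init(), the sample shapes, and the scaling step are assumptions, not taken from the question:

import numpy as np
from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType

# Assumed sample shapes; the real sequences are N x 18 arrays
input_sample = np.zeros((10, 18))
output_sample = np.array([{'score1': 0, 'score2': 0, 'score3': 0,
                           'score4': 0, 'score5': 0, 'highest_score': None}])


def init():
    # Assumption: the five HMMs are loaded from the model directory here
    global model1, model2, model3, model4, model5
    ...


@input_schema('data', NumpyParameterType(input_sample, enforce_shape=False))
@output_schema(NumpyParameterType(output_sample))
def run(data):
    response = {'score1': 0, 'score2': 0, 'score3': 0,
                'score4': 0, 'score5': 0, 'highest_score': None}
    response['score1'] = model1.score(data)
    response['score2'] = model2.score(data)
    # ... score3 to score5, scaling to a defined interval,
    # and determination of the highest-scoring model go here
    return [response]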

I am using JIRA Python API to create Issue

I am using the following code to create an issue:
from jira import JIRA
import pandas as pd

user = 'XXXXXXXXXXXXXXX@gmail.com'
apikey = 'XXXXXXXXXXXXXXXXXXXXXXX'
server = 'https://XXXXXXX.atlassian.net'

options = {'server': server}
jira = JIRA(options, basic_auth=(user, apikey))

# summary = issue.fields.summary
issue_List = []
readexcel = pd.read_excel(r'test1.xlsx')
for item in readexcel.index:
    isssue_dict = dict()
    isssue_dict['project'] = dict({'key': 'MYB'})
    isssue_dict['summary'] = readexcel['Summary'][item]
    isssue_dict['description'] = readexcel['Description'][item]
    isssue_dict['issuetype'] = dict({'name': 'Bug'})
    # isssue_dict['customfield_10014'] = readexcel['Epic Link'][item]
    isssue_dict['priority'] = {'name': readexcel['Priority'][item]}
    isssue_dict['labels'] = [readexcel['Labels'][item]]
    isssue_dict['reporter'] = dict({'name': readexcel['Reporter'][item]})
    isssue_dict['assignee'] = [readexcel['Assignee'][item]]
    new_issue = jira.create_issue(fields=isssue_dict)
    print(new_issue.__str__())
I am not able to set the Affects Version (I tried something like {'versions': [{'Affects Version/s': affects_version}]}) and the Epic Link on the Jira issue.
In order to set the Epic information, please use add_issues_to_epic(epic_id, issue_keys, ignore_epics=True).
If the version already exists, use the keys versions and fixVersions to set the Affected Version and Fix Version.
See https://jira.readthedocs.io/en/master/api.html for the full documentation of these methods.
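As a sketch of how that could look inside the loop above (the column names 'Affects Version', 'Fix Version' and 'Epic Link' are placeholders, and the versions are assumed to already exist in the project):

# Set the versions while building the issue dict (version names must already exist in Jira)
isssue_dict['versions'] = [{'name': readexcel['Affects Version'][item]}]
isssue_dict['fixVersions'] = [{'name': readexcel['Fix Version'][item]}]
new_issue = jira.create_issue(fields=isssue_dict)

# Attach the freshly created issue to its epic afterwards
jira.add_issues_to_epic(readexcel['Epic Link'][item], [new_issue.key])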

How to export metrics from a containerized component in kubeflow pipelines 0.2.5

I have a pipeline made up of 3 containerized components. In the last component I write the metrics I want to a file named /mlpipeline-metrics.json, just like it's explained here.
This is the Python code I used.
metrics = {
    'metrics': [
        {
            'name': 'accuracy',
            'numberValue': accuracy,
            'format': 'PERCENTAGE',
        },
        {
            'name': 'average-f1-score',
            'numberValue': average_f1_score,
            'format': 'PERCENTAGE'
        },
    ]
}

with open('/mlpipeline-metrics.json', 'w') as f:
    json.dump(metrics, f)
I also tried writing the file with the following code, just like in the example linked above.
with file_io.FileIO('/mlpipeline-metrics.json', 'w') as f:
    json.dump(metrics, f)
The pipeline runs just fine without any errors. But it won't show the metrics in the front-end UI.
I'm thinking it has something to do with the following codeblock.
def metric_op(accuracy, f1_scores):
    return dsl.ContainerOp(
        name='visualize_metrics',
        image='gcr.io/mgcp-1190085-asml-lpd-dev/kfp/jonas/container_tests/image_metric_comp',
        arguments=[
            '--accuracy', accuracy,
            '--f1_scores', f1_scores,
        ]
    )
This is the code I use to create a ContainerOp from the containerized component. Notice that I have not specified any file_outputs.
In other ContainerOps I have to specify file_outputs to be able to pass variables to the next steps in the pipeline. Should I do something similar here to map /mlpipeline-metrics.json onto something so that Kubeflow Pipelines detects it?
I'm using a managed AI platform pipelines deployment running Kubeflow Pipelines 0.2.5 with Python 3.6.8.
Any help is appreciated.
So after some trial and error I finally came to a solution. And I'm happy to say that my intuition was right. It did have something to do with the file_outputs I didn't specify.
To be able to export your metrics you will have to set file_outputs as follows.
def metric_op(accuracy, f1_scores):
    return dsl.ContainerOp(
        name='visualize_metrics',
        image='gcr.io/mgcp-1190085-asml-lpd-dev/kfp/jonas/container_tests/image_metric_comp',
        arguments=[
            '--accuracy', accuracy,
            '--f1_scores', f1_scores,
        ],
        file_outputs={
            'mlpipeline-metrics': '/mlpipeline-metrics.json'
        }
    )
Here is another way of showing metrics when you write Python function-based components:
# Define your components code as standalone python functions: ======================
def add(a: float, b: float) -> NamedTuple(
    'AddOutput',
    [
        ('sum', float),
        ('mlpipeline_metrics', 'Metrics')
    ]
):
    '''Calculates sum of two arguments'''
    sum = a + b

    metrics = {
        'add_metrics': [
            {
                'name': 'sum',
                'numberValue': float(sum),
            }
        ]
    }

    print("Add Result: ", sum)  # this will print it online in the 'main-logs' of each task

    from collections import namedtuple
    addOutput = namedtuple(
        'AddOutput',
        ['sum', 'mlpipeline_metrics'])

    return addOutput(sum, metrics)  # the metrics will be uploaded to the cloud
Note: I am just using a basic function here; I am not using your function.
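As a usage sketch (not part of the answer above), such a function-based component could then be compiled into a pipeline roughly like this, assuming the KFP v1 SDK's func_to_container_op and the add function defined above:

import kfp
from kfp import dsl
from kfp.components import func_to_container_op

# Turn the python function into a pipeline component
add_op = func_to_container_op(add)

@dsl.pipeline(name='add-with-metrics')
def add_pipeline(a: float = 1.0, b: float = 2.0):
    # The 'mlpipeline_metrics' output of the component is what the
    # Kubeflow Pipelines UI picks up and shows in the run's metrics view
    add_task = add_op(a, b)

if __name__ == '__main__':
    kfp.compiler.Compiler().compile(add_pipeline, 'add_pipeline.yaml')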
