How can I set run_name in mlflow command line?

MLflow version: 1.4.0
Python version: 3.7.4
I'm running the UI as mlflow server... with all the required command line options.
I am logging to MLflow as an MLflow project, with the appropriate MLproject.yaml file. The project is run in a Docker container, so the CMD looks like this:
mlflow run . -P document_ids=${D2V_DOC_IDS} -P corpus_path=... --no-conda --experiment-name=${EXPERIMENT_NAME}
Running the experiment like this results in a blank run_name. I know there's a run_id, but I'd also like to see and set the run_name myself, either on the command line or in my code as mlflow.log.....
I've looked at Is it possible to set/change mlflow run name after run initial creation? but I want to programmatically set the run name instead of changing it manually on the UI.

One of the parameters to mlflow.start_run() is run_name. This would give you programmatic access to set the run name with each iteration. See the docs here.
Here's an example:
import mlflow
from datetime import datetime

## Define the name of our run
name = "this run is gonna be bananas " + datetime.now().strftime("%Y-%m-%d %H:%M:%S")

## Start a new mlflow run and set the run name
with mlflow.start_run(run_name=name):
    ## ...train model, log metrics/params/model...
    pass

## The run ends automatically when the `with` block exits,
## so an explicit mlflow.end_run() is not needed here.
If you want to set the name as part of an MLflow Project, you'll have to specify it as a parameter in the project's entry points, which are defined in the MLproject file. Then you can pass that value through to mlflow.start_run() from the command line.
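A rough sketch of what that could look like (the parameter name, default value, and train.py script are illustrative, not taken from the question):

name: my_project
entry_points:
  main:
    parameters:
      run_name: {type: string, default: "unnamed-run"}
    command: "python train.py --run-name {run_name}"

Inside train.py you would then read the --run-name argument and pass it to mlflow.start_run(run_name=...), and on the command line you could set it with -P run_name=my-run.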

For the CLI, this now seems to be available:
--run-name <runname>
https://mlflow.org/docs/latest/cli.html#cmdoption-mlflow-run-run-name
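So, with a recent enough MLflow version, the command from the question could look roughly like this (flag availability depends on your MLflow version):

mlflow run . -P document_ids=${D2V_DOC_IDS} --experiment-name=${EXPERIMENT_NAME} --run-name=my-descriptive-run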

Related

How to get `run_id` when using MLflow Project

When using MLflow Projects (via an MLproject file) I get this message when the run starts:
INFO mlflow.projects.backend.local:
=== Running command 'source /anaconda3/bin/../etc/profile.d/conda.sh &&
conda activate mlflow-4736797b8261ec1b3ab764c5060cae268b4c8ffa 1>&2 &&
python3 main.py' in run with ID 'e2f0e8c670114c5887963cd6a1ac30f9' ===
I want to access the run_id shown above (e2f0e8c670114c5887963cd6a1ac30f9) from inside the main script.
I expected a run to be active but:
mlflow.active_run()
> None
Initiating a run inside the main script does give me access to the correct run_id, although any subsequent runs will have a different run_id.
# first run inside the script - correct run_id
with mlflow.start_run():
    print(mlflow.active_run().info.run_id)
> e2f0e8c670114c5887963cd6a1ac30f9

# second run inside the script - wrong run_id
with mlflow.start_run():
    print(mlflow.active_run().info.run_id)
> 417065241f1946b98a4abfdd920239b1
This seems like strange behavior, and I was wondering if there's another way to access the run_id assigned at the beginning of the MLproject run?
When the script is launched with mlflow run, the first mlflow.start_run() resumes the run that the project created, so you can read the id from the returned run object:
with mlflow.start_run() as run:
    print(run.info.run_id)
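Another option, as a sketch: the project backend typically exports the run id to the subprocess via the MLFLOW_RUN_ID environment variable, so you can read it without starting a run at all (treat a missing variable as "not launched by mlflow run"):

import os

# MLFLOW_RUN_ID is set for the script started by `mlflow run`
run_id = os.environ.get("MLFLOW_RUN_ID")
print(run_id)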

How to read environment variables from YAML file in docker?

I have a .env file that contains these variables:
VALUE_A=10
NAME=SWARNA
I have an env.yml file:
Information:
  value: $VALUE_A
  name: $NAME
I have a Python file, envpy.py:
from envyaml import EnvYAML
# read file env.yml and parse config
env = EnvYAML('env.yml')
print(env['Information']['value'])
print(env['Information']['name'])
Here is my Dockerfile
FROM python:3
ADD envpy.py /
ADD env.yml /
RUN pip install envyaml
CMD [ "python", "./envpy.py" ]
Expected output:
10
SWARNA
But I got:
VALUE_A
NAME
I'm using the commands to build the docker and run:
docker build -t python-env .
docker run python-env
How do I print the values? Please correct me or suggest where I'm going wrong. Thank you.
.env is a docker-compose thing, which defines default values for environment variables to be interpolated into docker-compose.yml, and only there. They are not available anywhere else and certainly not inside your image.
You can make the values available as environment variables inside your image by copying .env into the image and in your Python code do
from dotenv import load_dotenv
load_dotenv()
(which requires you to install the python-dotenv package).
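Putting that together with the files from the question, a sketch of envpy.py could look like this (it assumes you also ADD .env / in the Dockerfile and add python-dotenv to the pip install line):

# envpy.py
from dotenv import load_dotenv   # pip install python-dotenv
from envyaml import EnvYAML

# load .env into os.environ first, so EnvYAML can interpolate $VALUE_A and $NAME
load_dotenv()

env = EnvYAML('env.yml')
print(env['Information']['value'])
print(env['Information']['name'])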
Also note that there are probably better ways to achieve what you want:
If the values must be set at build time, you'd rather interpolate them into the resulting file at build time and copy the file with the hardcoded values into the image.
If the values should be overridable at runtime, just define them via ENV with a default value inside the Dockerfile, as sketched below.
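A minimal sketch of that second option, building on the Dockerfile from the question (the defaults are simply the values from your .env):

FROM python:3
# Defaults baked into the image; override at run time with
#   docker run -e VALUE_A=20 -e NAME=OTHER python-env
ENV VALUE_A=10
ENV NAME=SWARNA

ADD envpy.py /
ADD env.yml /
RUN pip install envyaml
CMD [ "python", "./envpy.py" ]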

How to prepare Python code to run from the command line on different environments?

How do I run Python (behave) code with environment parameters? For example:
environment=X behave --tags @regression
What I have so far is:
from behave import given

@given(u'user is on the firts page')
def step_impl(context):
    context.first_page = FirstPage(context)
    context.first_page.goto(url_config.URL["X env"])
and the URL dict:
URL = {
    "X env": "https://...",
    "Y env": "https://..."
}
You should use environment variables for this. The pipeline script should include the following command to define which environment you want to run against:
export ENV=X_env
In your test script, get the environment variable and use it to look up the appropriate URL:
import os
from behave import given

@given(u'user is on the firts page')
def step_impl(context):
    context.first_page = FirstPage(context)
    execute_in_environment = os.environ.get("ENV")
    context.first_page.goto(url_config.URL[execute_in_environment])
Note that reading the environment variable (the execute_in_environment = os.environ.get("ENV") line) is typically done at a higher level in the test framework, along with other configuration. Going strictly by what is shared in the question, I have added it to the step implementation, which isn't best practice; a sketch of the higher-level approach follows.
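For example, a minimal sketch (assuming a standard behave layout) would do the lookup once in features/environment.py so every step can reuse it:

# features/environment.py
import os

def before_all(context):
    # Read the target environment once; the value must match a key in
    # url_config.URL, e.g. "X env" or "Y env". Fall back to "X env" if unset.
    context.env_name = os.environ.get("ENV", "X env")

The step would then call context.first_page.goto(url_config.URL[context.env_name]).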
If you want to try it out on your Windows station first, then set the environment variable in the CMD prompt using:
set ENV=X_env
So to run your tests against a specific environment you would run these commands (this is a Linux example):
export ENV=X_env
behave --tags @regression

How to get logs from inside the container executed using DockerOperator? (Airflow)

I'm facing logging issues with DockerOperator.
I'm running a Python script inside a Docker container using DockerOperator, and I need Airflow to surface the logs from the script running inside the container. Airflow marks the job as a success, but the script inside the container is failing, and I have no clue what is going on as I cannot see the logs properly. Is there a way to set up logging for DockerOperator apart from setting the tty option to True, as suggested in the docs?
It looks like you can have logs pushed to XComs, but it's off by default. First, you need to pass xcom_push=True for it to at least send the last line of output to XCom. Then additionally, you can pass xcom_all=True to send all output to XCom, not just the last line.
Perhaps not the most convenient place to put debug information, but it's pretty accessible in the UI, either in the XCom tab when you click into a task, or on the page where you can list and filter XComs (under Browse).
Source: https://github.com/apache/airflow/blob/1.10.10/airflow/operators/docker_operator.py#L112-L117 and https://github.com/apache/airflow/blob/1.10.10/airflow/operators/docker_operator.py#L248-L250
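Based on those source lines, a rough sketch for Airflow 1.10.x could look like this (the image and command are placeholders, not from the question):

from airflow.operators.docker_operator import DockerOperator

run_script = DockerOperator(
    task_id='run_script',
    image='my-image:latest',        # placeholder image
    command='python my_script.py',  # placeholder command
    xcom_push=True,   # push stdout to XCom (last line only by default)
    xcom_all=True,    # push all lines of stdout, not just the last one
)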
Instead of DockerOperator you can use client.containers.run and then do the following:
import docker
from airflow import DAG
from airflow.decorators import task

# default_args is assumed to be defined elsewhere in the DAG file
with DAG(dag_id='dag_1',
         default_args=default_args,
         schedule_interval=None,
         tags=['my_dags']) as dag:

    @task(task_id='task_1')
    def start_task(**kwargs):
        # get the docker params from the environment
        client = docker.from_env()
        # run the container
        response = client.containers.run(
            # The container you wish to call
            image='__container__:latest',
            # The command to run inside the container
            command="python test.py",
            version='auto',
            auto_remove=True,
            stdout=True,
            stderr=True,
            tty=True,
            detach=True,
            remove=True,
            ipc_mode='host',
            network_mode='bridge',
            # Passing the GPU access
            device_requests=[
                docker.types.DeviceRequest(count=-1, capabilities=[['gpu']])
            ],
            # Give the proper system volume mount point
            volumes=[
                'src:/src',
            ],
            working_dir='/src'
        )
        output = response.attach(stdout=True, stream=True, logs=True)
        for line in output:
            print(line.decode())
        return str(response)

    test = start_task()
Then in your test.py script (in the docker container) you have to do the logging using the standard Python logging module:
import logging
logger = logging.getLogger("airflow.task")
logger.info("Log something.")
Reference: here

Using the Environment Class with Pipeline Runs

I am using an estimator step in a pipeline with the Environment class, in order to have a custom Docker image, as I need some apt-get packages to be able to install a specific pip package. It appears from the logs that, unlike the non-pipeline version of the estimator, the pipeline run completely ignores the Docker portion of the Environment. Very simply, this seems broken:
I'm running on SDK v1.0.65, and my dockerfile is completely ignored. I'm using
FROM mcr.microsoft.com/azureml/base:latest\nRUN apt-get update && apt-get -y install freetds-dev freetds-bin vim gcc
in the base_dockerfile property of my code.
Here's a snippet of my code :
from azureml.core import Environment
from azureml.core.environment import CondaDependencies
conda_dep = CondaDependencies()
conda_dep.add_pip_package('pymssql==2.1.1')
myenv = Environment(name="mssqlenv")
myenv.python.conda_dependencies=conda_dep
myenv.docker.enabled = True
myenv.docker.base_dockerfile = 'FROM mcr.microsoft.com/azureml/base:latest\nRUN apt-get update && apt-get -y install freetds-dev freetds-bin vim gcc'
myenv.docker.base_image = None
This works well when I use an Estimator by itself, but if I insert this estimator in a Pipeline, it fails. Here's my code to launch it from a Pipeline run:
from azureml.pipeline.steps import EstimatorStep

sql_est_step = EstimatorStep(name="sql_step",
                             estimator=est,
                             estimator_entry_script_arguments=[],
                             runconfig_pipeline_params=None,
                             compute_target=cpu_cluster)

from azureml.pipeline.core import Pipeline
from azureml.core import Experiment

pipeline = Pipeline(workspace=ws, steps=[sql_est_step])
pipeline_run = exp.submit(pipeline)
When launching this, the logs for the container building service reveal:
FROM continuumio/miniconda3:4.4.10... etc.
Which indicates it's ignoring my FROM mcr.... statement in the Environment class I've associated with this Estimator, and my pip install fails.
Am I missing something? Is there a workaround?
I can confirm that this is a bug on the AML Pipeline side. Specifically, the runconfig property environment.docker.base_dockerfile is not being passed through correctly in pipeline jobs. We are working on a fix. In the meantime, you can use the workaround from this thread of building the docker image first and specifying it with environment.docker.base_image (which is passed through correctly).
I found a workaround for now, which is to build your own Docker image. You can do this by using these options of the DockerSection of the Environment:
myenv.docker.base_image_registry.address = '<your_acr>.azurecr.io'
myenv.docker.base_image_registry.username = '<your_acr>'
myenv.docker.base_image_registry.password = '<your_acr_password>'
myenv.docker.base_image = '<your_acr>.azurecr.io/testimg:latest'
and obviously use whichever Docker image you built and pushed to the container registry linked to the Azure Machine Learning workspace.
To create the image, you would run something like this at the command line of a machine that can build a Linux-based container (like a Notebook VM):
docker build . -t <your_image_name>
# Tag it for upload
docker tag <your_image_name>:latest <your_acr>.azurecr.io/<your_image_name>:latest
# Login to Azure
az login
# login to the container registry so that the push will work
az acr login --name <your_acr>
# push the image
docker push <your_acr>.azurecr.io/<your_image_name>:latest
Once the image is pushed, you should be able to get that working.
I also initially used EstimatorStep for custom images, but recently figured out how to successfully pass Environments first to RunConfigurations, then to PythonScriptSteps (example below).
Another workaround, similar to yours, would be to publish your custom Docker image to Docker Hub; then the docker_base_image param becomes the URI, in our case mmlspark:0.16.
def get_environment(env_name, yml_path, user_managed_dependencies, enable_docker, docker_base_image):
    env = Environment(env_name)
    cd = CondaDependencies(yml_path)
    env.python.conda_dependencies = cd
    env.python.user_managed_dependencies = user_managed_dependencies
    env.docker.enabled = enable_docker
    env.docker.base_image = docker_base_image
    return env

spark_env = f.get_environment(env_name='spark_env',
                              yml_path=os.path.join(os.getcwd(), 'compute/aml_config/spark_compute_dependencies.yml'),
                              user_managed_dependencies=False, enable_docker=True,
                              docker_base_image='microsoft/mmlspark:0.16')

# use pyspark framework
spark_run_config = RunConfiguration(framework="pyspark")
spark_run_config.environment = spark_env

roll_step = PythonScriptStep(
    name='rolling window',
    script_name='roll.py',
    arguments=['--input_dir', joined_data,
               '--output_dir', rolled_data,
               '--script_dir', ".",
               '--min_date', '2015-06-30',
               '--pct_rank', 'True'],
    compute_target=compute_target_spark,
    inputs=[joined_data],
    outputs=[rolled_data],
    runconfig=spark_run_config,
    source_directory=os.path.join(os.getcwd(), 'compute', 'roll'),
    allow_reuse=pipeline_reuse
)
A couple of other points (that may be wrong):
PythonScriptStep is effectively a wrapper for ScriptRunConfig, which takes run_config as an argument
Estimator is a wrapper for ScriptRunConfig where RunConfig settings are made available as parameters
IMHO EstimatorStep shouldn't exist, because it is better to define Environments and Steps separately instead of at the same time in one call.
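To illustrate that relationship, here is a minimal sketch (the script name and experiment name are illustrative) of attaching the same kind of Environment to a plain ScriptRunConfig outside a pipeline, which is roughly what Estimator and PythonScriptStep wrap:

from azureml.core import Experiment, ScriptRunConfig
from azureml.core.runconfig import RunConfiguration

# reuse an Environment defined earlier (e.g. myenv or spark_env)
run_config = RunConfiguration()
run_config.environment = myenv

src = ScriptRunConfig(source_directory='.', script='train.py', run_config=run_config)
run = Experiment(ws, 'standalone-run').submit(src)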
