How to fix 'incorrect artifact/model path on HDFS showing on MLflow server' - mlflow

I run an MLflow server (mlflow version 1.2.0) with the following command:
mlflow server --host myhost -p myport --backend-store-uri mysql://user@localhost/mlflow --default-artifact-root hdfs://myhost/user/myid/mlflow_test
I run the experiment from the MLflow tutorial quickstart (https://www.mlflow.org/docs/latest/quickstart.html) with the command:
mlflow run sklearn_elasticnet_wine -P alpha=0.5 --no-conda
The code that logs the model is
mlflow.sklearn.log_model(lr, "model")
in
https://github.com/mlflow/mlflow/blob/master/examples/sklearn_elasticnet_wine/train.py
I visit the server in a web browser at myhost:myport and check the run I ran.
I can successfully get the run info at myhost:myport/#/experiments/0/runs/run_id.
On this page, I found that the first layer (the model directory) path is correct, that is, run_id/artifacts/model.
(screenshot: correct path)
But once I click the MLmodel file under the model folder, the path is wrong:
I expect to see run_id/artifacts/model/MLmodel,
but actually it was run_id/artifacts/MLmodel.
(screenshot: wrong path)
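For reference, the nested path the UI should request can be composed from the artifact root and the run's relative artifact path. A minimal sketch (the HDFS root below is illustrative, not taken from a real deployment):

```python
import posixpath

def artifact_uri(artifact_root, run_id, *relative_parts):
    # Compose the full artifact URI the UI is expected to request.
    # A file inside the model directory must keep its parent folder:
    # .../artifacts/model/MLmodel, not .../artifacts/MLmodel.
    return posixpath.join(artifact_root, run_id, "artifacts", *relative_parts)

root = "hdfs://myhost/user/myid/mlflow_test/0"  # illustrative root
print(artifact_uri(root, "run_id", "model", "MLmodel"))
# hdfs://myhost/user/myid/mlflow_test/0/run_id/artifacts/model/MLmodel
```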

Related

Newbie: Can't connect to database using browser

I recently installed Neo4j on a Raspberry Pi in a Docker container (Portainer). Everything seems to be working fine. I can open a terminal in Portainer and run commands. I can see there are two DBs, and I can even run Cypher commands (I cut and pasted the Movie entries). But I'm not able to run any commands using the browser. I seem to be able to connect the browser (http://localhost:7474/browser/) and see ":play movie-graph" run. But when I try to run the query to enter the movie data, I get the following error: "ERROR: Neo.DatabaseError.General.UnknownError". Running :sysinfo doesn't return any results. Also, the cursor is $ as opposed to a DB name, and I don't see any databases in the Database menu on the left.
Again, I'm able to run queries using Cypher Shell through a Portainer terminal.
Here are the container details:
IMAGE neo4j:latest@sha256:b91a4a85afb0cec9892522436bbbcb20f1d6d026c8c24cafcbcc4e27b5c8b68d
CMD neo4j
ENTRYPOINT tini -g -- /startup/docker-entrypoint.sh
ENV
JAVA_HOME /usr/local/openjdk-11
JAVA_VERSION 11.0.15
LANG C.UTF-8
NEO4J_AUTH none
NEO4J_dbms_connector_bolt_advertised__address localhost:7687
NEO4J_dbms_connector_http_advertised__address localhost:7474
NEO4J_dbms_connector_https_advertised__address localhost:7473
NEO4J_EDITION community
NEO4J_HOME /var/lib/neo4j
NEO4J_SHA256 34c8ce7edc2ab9f63a204f74f37621cac3427f12b0aef4c6ef47eaf4c2b90d66
NEO4J_TARBALL neo4j-community-4.4.8-unix.tar.gz
PATH /var/lib/neo4j/bin:/usr/local/openjdk-11/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
I'm sure it is something silly I'm missing, but reading multiple forum comments hasn't helped. Any help would be appreciated.
Thank you.
SJ
When you load the Neo4j browser in your web browser at
http://localhost:7474/browser
you need to connect to the bolt advertised address of your Neo4j server:
bolt://localhost:7687
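If the browser still stalls, it can help to first confirm that the bolt port is actually reachable from the machine running the browser. A small stdlib sketch (host and port match the advertised address from the container env above):

```python
import socket

def port_reachable(host, port, timeout=2.0):
    # Try a plain TCP connect; True means something is listening there.
    # This only proves the port is open, not that bolt auth will succeed.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    print(port_reachable("localhost", 7687))  # the bolt advertised address
```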

Mlflow - empty artifact folder

All,
I started the mlflow server as below. I do see the backend store containing the expected metadata. However, the artifact folder is empty despite many runs.
> mlflow server --backend-store-uri mlflow_db --default-artifact-root
> ./mlflowruns --host 0.0.0.0 --port 5000
The mlflow ui has the below message for the artifacts section:
No Artifacts Recorded
Use the log artifact APIs to store file outputs from MLflow runs.
What am I doing wrong?
Thanks,
grajee
It turns out that
"--backend-store-uri mlflow_db" was pointing to D:\python\Pythonv395\Scripts\mlflow_db
and
"--default-artifact-root ./mlflowruns" was pointing to D:\DataEngineering\MlFlow\Wine Regression\mlflowruns, which is the project folder.
I was able to point both outputs to one folder with the following syntax:
file:/D:/DataEngineering/MlFlow/Wine Regression
In case you want to log artifacts to your server with the local file system as object storage, you should specify --serve-artifacts --artifacts-destination file:/path/to/your/desired/location instead of just a vanilla path.
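The file:/ form above can also be produced programmatically. A quick sketch using pathlib (note that spaces in the folder name get percent-encoded in the URI):

```python
from pathlib import PureWindowsPath

# Convert the Windows folder used above into a file URI suitable for
# --default-artifact-root; spaces are percent-encoded automatically.
uri = PureWindowsPath("D:/DataEngineering/MlFlow/Wine Regression").as_uri()
print(uri)  # file:///D:/DataEngineering/MlFlow/Wine%20Regression
```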

Why is MLFLow unable to log metrics, artifacts while using MLFlow Project in Docker environment?

I am trying to store metrics and artifacts on the host after running an MLProject in a Docker environment. I expect that when the experiment completes successfully, the artifacts and metrics folders in the mlruns/ folder should have values and be shown in the mlflow ui, but the artifacts and metrics folders in mlruns/ are empty. The mlflow ui is also not reflecting the new experiment.
/home/mlflow_demo/mlflow-demo.py -
import mlflow
from mlflow.tracking import MlflowClient
from random import random
import pickle

client = MlflowClient()
experiment_id = client.create_experiment(name='first experiment')
run = client.create_run(experiment_id=experiment_id)
for i in range(1000):
    client.log_metric(run.info.run_id, "foo", random(), step=i)
with open("test.txt", "w") as f:
    f.write("This is an artifact file")
client.log_artifact(run.info.run_id, "test.txt")
client.set_terminated(run.info.run_id)
/home/mlflow_demo/MLProject -
name: test-project
docker_env:
  image: kusur/apex-pytorch-image:latest
entry_points:
  main:
    command: "python mlflow-demo.py"
Command (executed in /home/mlflow_demo): mlflow run .
After running the above code, I get the following log -
2021/07/06 12:22:28 INFO mlflow.projects.docker: === Building docker image test-project ===
2021/07/06 12:22:28 INFO mlflow.projects.utils: === Created directory /home/mlflow_demo/mlruns/tmpwa8ydc5j for downloading remote URIs passed to arguments of type 'path' ===
2021/07/06 12:22:28 INFO mlflow.projects.backend.local: === Running command 'docker run --rm -v /home/mlflow_demo/mlruns:/mlflow/tmp/mlruns -v /home/mlflow_demo/mlruns/0/0978fdd89ba44bf7b49975ab84838e82/artifacts:/home/mlflow_demo/mlruns/0/0978fdd89ba44bf7b49975ab84838e82/artifacts -e MLFLOW_RUN_ID=0978fdd89ba44bf7b49975ab84838e82 -e MLFLOW_TRACKING_URI=file:///mlflow/tmp/mlruns -e MLFLOW_EXPERIMENT_ID=0 test-project:latest python mlflow-demo.py' in run with ID '0978fdd89ba44bf7b49975ab84838e82' ===
...
2021/07/06 12:22:33 INFO mlflow.projects: === Run (ID '0978fdd89ba44bf7b49975ab84838e82') succeeded ===
Still the folders mlruns/0/0978fdd89ba44bf7b49975ab84838e82/artifacts and mlruns/0/0978fdd89ba44bf7b49975ab84838e82/metrics are empty.
Can someone please provide some pointers? Please let me know if the question isn't well framed.
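One thing worth checking: the docker command in the log shows that `mlflow run` injects MLFLOW_RUN_ID into the container, and the fluent API (`mlflow.start_run()`) attaches to that run, whereas creating a fresh experiment and run via MlflowClient bypasses it. A hypothetical helper illustrating that lookup (not MLflow's actual code):

```python
import os

def injected_run_id(env=None):
    # `mlflow run` passes -e MLFLOW_RUN_ID=<id> to `docker run`; reusing
    # that run keeps metrics and artifacts in the directories that were
    # bind-mounted from the host, so they appear under mlruns/ afterwards.
    # A run created manually with MlflowClient ignores this variable.
    env = os.environ if env is None else env
    return env.get("MLFLOW_RUN_ID")

print(injected_run_id({"MLFLOW_RUN_ID": "0978fdd89ba44bf7b49975ab84838e82"}))
```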

MLflow - Serving model by reference to model registry

I'm having an issue serving a model by reference to the model registry. According to the help, the path should look like this:
models:/model_name/stage
When I type in terminal:
mlflow models serve -m models:/ml_test_model1/Staging --no-conda -h 0.0.0.0 -p 5003
I got the error:
mlflow.exceptions.MlflowException: Not a proper models:/ URI: models:/ml_test_model1/Staging/MLmodel. Models URIs must be of the form 'models:/<model_name>/<version or stage>'.
Model is registered and visible in db and server.
If I put absolute path, it works (experiment_id/run_id/artifacts/model_name).
mlflow version: 1.4
Python version: 3.7.3
Is it a matter of some environment settings or something different?
That style of referencing model artifacts was fixed in mlflow v1.5 (bug fix).
You'll need to run mlflow db upgrade <db uri> to refresh your schemas before restarting your mlflow server.
You may find listing registered models helpful:
<server>:<port>/api/2.0/preview/mlflow/registered-models/list
Setting the env variable solved this for me:
export MLFLOW_TRACKING_URI=http://localhost:5000
mlflow models serve -m models:/my_clf_model/Staging -p 1234 -h 0.0.0.0 --no-conda
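For illustration, the accepted form of a models:/ URI can be checked with a tiny parser (hypothetical code, not MLflow's own validation):

```python
import re

_MODELS_URI = re.compile(r"^models:/(?P<name>[^/]+)/(?P<version_or_stage>[^/]+)$")

def parse_models_uri(uri):
    # Accepts 'models:/<model_name>/<version or stage>' and nothing else;
    # a trailing path segment such as /MLmodel makes the URI invalid,
    # which matches the error message quoted above.
    m = _MODELS_URI.match(uri)
    if m is None:
        raise ValueError(f"Not a proper models:/ URI: {uri}")
    return m.group("name"), m.group("version_or_stage")

print(parse_models_uri("models:/ml_test_model1/Staging"))
# ('ml_test_model1', 'Staging')
```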

Subprocess can't find file when executed from a Python file in Docker container

I have created a simple Flask app which I am trying to deploy to Docker.
The basic user interface will load on localhost, but when I execute a command which calls a specific function, it keeps showing:
"Internal Server Error
The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application."
Looking at the Docker logs, I can see the problem is that the file cannot be found by the subprocess.Popen command:
"FileNotFoundError: [Errno 2] No such file or directory: 'test_2.bat': 'test_2.bat'
172.17.0.1 - - [31/Oct/2019 17:01:55] "POST /login HTTP/1.1" 500"
The file certainly exists in the Docker environment, within the container I can see it listed in the root directory.
I have also tried changing:
item = subprocess.Popen(["test_2.bat", i], shell=False,stdout=subprocess.PIPE)
to:
item = subprocess.Popen(["./test_2.bat", i], shell=False,stdout=subprocess.PIPE)
which generated the alternative error:
"OSError: [Errno 8] Exec format error: './test_2.bat'
172.17.0.1 - - [31/Oct/2019 16:58:54] "POST /login HTTP/1.1" 500"
I have added a shebang to the top of both .py files involved in the Flask app (although I may have done this wrong):
#!/usr/bin/env python3
and this is the Dockerfile:
FROM python:3.6
RUN adduser lighthouse
WORKDIR /home/lighthouse
COPY requirements.txt requirements.txt
# RUN python -m venv venv
RUN pip install -r requirements.txt
RUN pip install gunicorn
COPY templates templates
COPY json_logs_nl json_logs_nl
COPY app.py full_script_manual_with_list.py schema_all.json ./
COPY bq_load_indv_jsons_v3.bat test_2.bat ./
RUN chmod 644 app.py
RUN pip install flask
ENV FLASK_APP app.py
RUN chown -R lighthouse:lighthouse ./
USER lighthouse
# EXPOSE 5000
CMD ["flask", "run", "--host=0.0.0.0"]
I am using Ubuntu and WSL2 to run Docker on a Windows machine without a virtual box. I have no trouble navigating my Windows file system or building Docker images so I think this configuration is not the problem - but just in case.
If anyone has any ideas to help subprocess locate test_2.bat I would be very grateful!
Edit: the app works exactly as expected when executed locally via the command line with "flask run"
If anyone is facing a similar problem: the solution was to put the command directly into the Python script rather than calling it from a separate file. It is split into separate strings to allow the "url" variable to be updated iteratively, as this all occurs within a for loop:
url = str(i)
var_command = "lighthouse " + url + " --quiet --chrome-flags=\" --headless\" --output=json --output-path=/home/lighthouse/result.json"
item = subprocess.Popen(var_command, stdout=subprocess.PIPE, shell=True)
item.communicate()
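An equivalent call without shell=True is also possible by passing the arguments as a list (a sketch; it assumes the lighthouse CLI is on PATH inside the container and mirrors the flags from the command string above):

```python
import subprocess

def lighthouse_command(url, output_path="/home/lighthouse/result.json"):
    # Build the argument list; no shell quoting is needed in list form.
    return [
        "lighthouse", url, "--quiet",
        "--chrome-flags= --headless",
        "--output=json",
        "--output-path=" + output_path,
    ]

def run_lighthouse(url):
    # shell=False (the default) avoids an extra shell process and
    # quoting pitfalls around the URL.
    return subprocess.Popen(lighthouse_command(url), stdout=subprocess.PIPE)
```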
As a side note, if you would like to run Lighthouse within a container you need to install it just as you would to run it on the command line, in a Node container. This container can then communicate with my Python container if both deployed in the same pod via Kubernetes and share a namespace. Here is a Lighthouse container Dockerfile I've used: https://github.com/GoogleChromeLabs/lighthousebot/blob/master/builder/Dockerfile