We are sitting behind a firewall and are trying to run a Docker image (cBioPortal). Docker itself could be installed through a proxy, but now we encounter the following issue:
Starting validation...
INFO: -: Unable to read xml containing cBioPortal version.
DEBUG: -: Requesting cancertypes from portal at 'http://cbioportal-container:8081'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Error occurred during validation step:
Traceback (most recent call last):
File "/cbioportal/core/src/main/scripts/importer/validateData.py", line 4491, in request_from_portal_api
response.raise_for_status()
File "/usr/local/lib/python3.5/dist-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 504 Server Error: Gateway Timeout for url: http://cbioportal-container:8081/api-legacy/cancertypes
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/metaImport.py", line 127, in <module>
exitcode = validateData.main_validate(args)
File "/cbioportal/core/src/main/scripts/importer/validateData.py", line 4969, in main_validate
portal_instance = load_portal_info(server_url, logger)
File "/cbioportal/core/src/main/scripts/importer/validateData.py", line 4622, in load_portal_info
parsed_json = request_from_portal_api(path, api_name, logger)
File "/cbioportal/core/src/main/scripts/importer/validateData.py", line 4495, in request_from_portal_api
) from e
ConnectionError: Failed to fetch metadata from the portal at [http://cbioportal-container:8081/api-legacy/cancertypes]
Now we know that it is a firewall issue, because it works when we install it outside the firewall. But we do not know yet how to change the firewall. Our idea was to look up the files and lines that throw the errors, but we do not know how to look into those files since they are inside the Docker container.
So we cannot just do something like
vim /cbioportal/core/src/main/scripts/importer/validateData.py
...because on the host there is nothing at that path. Of course we know the file is within the Docker image, but as said, we don't know how to look into it. At the moment we do not know how to solve this riddle - any help is appreciated.
Maybe you still need this.
You can access this Python file within the container by using docker-compose exec cbioportal sh or docker-compose exec cbioportal bash
Then you can use cd, cat, vi, vim, etc. to access the given path from your post.
I'm not sure which command you're actually running, but when I did the import call like
docker-compose run --rm cbioportal metaImport.py -u http://cbioportal:8080 -s study/lgg_ucsf_2014/lgg_ucsf_2014/ -o
I had to replace http://cbioportal:8080 with the server's IP address.
Also note that the path to the studies is one level deeper than in the official documentation.
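To rule out the proxy before changing anything else, you could probe the same endpoint the validator calls from inside the container. This is just a diagnostic sketch, with the hostname and port taken from the traceback; the requests package should already be there because the validator itself uses it:

import requests

# If http_proxy/https_proxy are set inside the container, requests routes this internal
# hostname through the proxy unless no_proxy covers it, which would explain the 504.
url = "http://cbioportal-container:8081/api-legacy/cancertypes"
try:
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    print("portal reachable:", response.status_code)
except requests.exceptions.RequestException as exc:
    print("portal not reachable:", exc)

If this fails with the same 504, adding cbioportal-container to no_proxy (or fixing the firewall/proxy rules) is the place to look, rather than the validator code.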
When cBioPortal sits behind a proxy, the study import is only available in offline mode:
First you need to get inside the container
docker exec -it cbioportal-container bash
Then generate the portal info folder:
cd $PORTAL_HOME/core/src/main/scripts
./dumpPortalInfo.pl $PORTAL_HOME/my_portal_info_folder
Then import the study offline. The -o flag is important to overwrite despite warnings:
cd $PORTAL_HOME/core/src/main/scripts
./importer/metaImport.py -p $PORTAL_HOME/my_portal_info_folder -s /study/lgg_ucsf_2014 -v -o
Hope this helps.
I'm passing an env var via docker container run in GitHub Actions like so:
run: docker container run -d -e MY_KEY="some key" -p 3000:3000 somedockerimage/somedockerimage:0.0.2
I know it should be passed the right way because it works with Node.js.
In the Python file:
import os
api_key = os.environ['MY_KEY']
print(api_key)
The result I get:
File "print.py", line 4, in <module>
api_key = os.environ['MY_KEY']
File "/usr/lib/python3.8/os.py", line 675, in __getitem__
raise KeyError(key) from None
KeyError: 'MY_KEY'
I don't see anything incorrect with the way you're running your Docker container. It's difficult to say without seeing the rest of the project, but you may need to delete your .pyc files with something like find . -name \*.pyc -delete. This answer could add more context.
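To narrow down where the variable gets lost, you could also replace the hard lookup with a check that reports what actually arrived inside the container (just a debugging sketch, not a fix in itself):

import os

# os.environ.get returns None instead of raising KeyError when the variable is missing
api_key = os.environ.get("MY_KEY")
if api_key is None:
    # list what did arrive, to see whether -e was dropped somewhere along the way
    print("MY_KEY is not set; variables that did arrive:", sorted(os.environ))
else:
    print("MY_KEY is set")

If MY_KEY shows up here but not in your real script, the problem is in the script or image build rather than in the docker container run command.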
Sorry if my question is too basic, but I cannot solve it.
I am experimenting with mlflow currently and facing the following issue:
Even though I have set the tracking_uri, the MLflow artifacts are saved to the ./mlruns/... folder relative to the path from which I run mlflow run path/to/train.py (on the command line). The MLflow server then searches for the artifacts following the tracking_uri (mlflow server --default-artifact-root here/comes/the/same/tracking_uri).
The following example should make clear what I mean:
I set the following in the training script before the with mlflow.start_run() as run:
mlflow.set_tracking_uri("file:///home/#myUser/#SomeFolders/mlflow_artifact_store/mlruns/")
My expectation would be that mlflow saves all the artifacts to the place I gave in the registry uri. Instead, it saves the artifacts relative to the place from which I run mlflow run path/to/train.py, i.e. running the following
/home/#myUser/ mlflow run path/to/train.py
creates the structure:
/home/#myUser/mlruns/#experimentID/#runID/artifacts
/home/#myUser/mlruns/#experimentID/#runID/metrics
/home/#myUser/mlruns/#experimentID/#runID/params
/home/#myUser/mlruns/#experimentID/#runID/tags
and therefore it doesn't find the run artifacts in the tracking_uri, giving the error message:
Traceback (most recent call last):
File "train.py", line 59, in <module>
with mlflow.start_run() as run:
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/tracking/fluent.py", line 204, in start_run
active_run_obj = client.get_run(existing_run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/tracking/client.py", line 151, in get_run
return self._tracking_client.get_run(run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/tracking/_tracking_service/client.py", line 57, in get_run
return self.store.get_run(run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/store/tracking/file_store.py", line 524, in get_run
run_info = self._get_run_info(run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/store/tracking/file_store.py", line 544, in _get_run_info
"Run '%s' not found" % run_uuid, databricks_pb2.RESOURCE_DOES_NOT_EXIST
mlflow.exceptions.MlflowException: Run '788563758ece40f283bfbf8ba80ceca8' not found
2021/07/23 16:54:16 ERROR mlflow.cli: === Run (ID '788563758ece40f283bfbf8ba80ceca8') failed ===
Why is that so? How can I change the place where the artifacts are stored and where this directory structure is created? I have tried mlflow run --storage-dir here/comes/the/path, as well as setting the tracking_uri and the registry_uri. If I run mlflow run path/to/train.py from /home/path/to/tracking/uri it works, but I need to run the scripts remotely.
My end goal would be to change the artifact URI to an NFS drive, but I cannot even get it to work on my local computer.
Thanks for reading it, even more thanks if you suggest a solution! :)
Have a great day!
This issue was solved by the following:
I had mixed up the tracking_uri with the backend_store_uri.
The tracking_uri is where the MLflow-related data (e.g. tags, parameters, metrics, etc.) is saved, which can be a database. On the other hand, the artifact_location is where the artifacts (other data that is not MLflow-related, belonging to the preprocessing/training/evaluation/etc. scripts) are stored.
What led me to mistakes is that when running mlflow server from the command line one should pass the tracking_uri as --backend-store-uri (and also set it in the script via mlflow.set_tracking_uri()) and pass the location of the artifacts as --default-artifact-root. Somehow I didn't get that tracking_uri = backend_store_uri.
Here's my solution
Launch the server
mlflow server -h 0.0.0.0 -p 5000 --backend-store-uri postgresql://DB_USER:DB_PASSWORD@DB_ENDPOINT:5432/DB_NAME --default-artifact-root s3://S3_BUCKET_NAME
Set the tracking URI to an HTTP URI, like:
mlflow.set_tracking_uri("http://my-tracking-server:5000/")
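On the client side the scripts then only need the tracking URI; with an HTTP tracking URI the run's artifact location comes from the server's --default-artifact-root, so nothing lands in a local ./mlruns folder anymore. A minimal sketch (hypothetical parameter, metric and file names):

import mlflow

mlflow.set_tracking_uri("http://my-tracking-server:5000/")

with mlflow.start_run():
    mlflow.log_param("alpha", 0.1)
    mlflow.log_metric("rmse", 0.42)
    with open("summary.txt", "w") as f:
        f.write("example artifact")
    mlflow.log_artifact("summary.txt")  # stored under the server's --default-artifact-root

Note that the client still needs access to that artifact store itself (e.g. S3 credentials), since artifacts are uploaded directly rather than proxied through the tracking server.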
I recently ported my scripts from Python 2.x to 3.x. During production runs through automation (Rundeck) we are seeing errors caused by the logger not handling blocking I/O. Any ideas on how to resolve this would be great.
Ubuntu 18.04.1 LTS
Python 3.6.7
--- Logging error ---
Traceback (most recent call last):
File "/usr/lib/python3.6/logging/__init__.py", line 998, in emit
self.flush()
File "/usr/lib/python3.6/logging/__init__.py", line 978, in flush
self.stream.flush()
BlockingIOError: [Errno 11] write could not complete without blocking
I was getting the same error on CI builds. It looks like it was a capacity issue with the output stream. After reducing the log output, the errors went away.
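If reducing the output is not an option, another approach (not from the original answer, just a sketch using the standard library) is to decouple log emission from the blocking stream with a queue, so application threads only enqueue records and a single listener thread does the actual writing:

import logging
import logging.handlers
import queue
import sys

log_queue = queue.Queue(-1)  # unbounded buffer between the application and the writer thread
stream_handler = logging.StreamHandler(sys.stdout)
listener = logging.handlers.QueueListener(log_queue, stream_handler)
listener.start()  # background thread that actually writes to the stream

root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(logging.handlers.QueueHandler(log_queue))  # app code only enqueues records

root.info("log records are now written by the listener thread")
listener.stop()  # flush and join the writer thread on shutdown

This only moves the writes off the calling threads; if the consumer of stdout is genuinely stalled, the queue keeps growing, so trimming the overall log volume is still worthwhile.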
I recently faced this error while building my Docker image using docker-compose in CI, and I found one workaround that might help someone.
The error:
BlockingIOError: [Errno 11] write could not complete without blocking
If you do not want to lose any logs, you can send all the logs to a file and save it as an artifact (tested on Bamboo and Jenkins):
docker-compose build --no-cache my_image > myfile.txt
If you do not want the logs:
docker-compose build --no-cache my_image > /dev/null
I am trying to launch Presto by entering the following in the terminal:
sudo bin/launcher start
It shows me this:
Started as 16501 (This integer varies on every attempt)
Then, I tried to launch it by entering the following in terminal:
sudo bin/launcher run --verbose
The output I get is:
config_path = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/etc/config.properties
data_dir = /media/polly/161813A518138343/PrestoDB/presto-server-0.203
etc_dir = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/etc
install_path = /media/polly/161813A518138343/PrestoDB/presto-server-0.203
jvm_config = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/etc/jvm.config
launcher_config = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/bin/launcher.properties
launcher_log = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/var/log/launcher.log
log_levels = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/etc/log.properties
log_levels_set = False
node_config = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/etc/node.properties
pid_file = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/var/run/launcher.pid
properties = {}
server_log = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/var/log/server.log
verbose = True
['java', '-cp', '/media/polly/161813A518138343/PrestoDB/presto-server-0.203/lib/*', '-server', '-Xmx16G', '-XX:+UseG1GC', '-XX:G1HeapRegionSize=32M', '-XX:+UseGCOverheadLimit', '-XX:+ExplicitGCInvokesConcurrent', '-XX:+HeapDumpOnOutOfMemoryError', '-XX:+ExitOnOutOfMemoryError', '-Dconfig=/media/polly/161813A518138343/PrestoDB/presto-server-0.203/etc/config.properties', 'com.facebook.presto.server.PrestoServer']
Traceback (most recent call last):
File "bin/launcher.py", line 445, in main
handle_command(command, o)
File "bin/launcher.py", line 329, in handle_command
run(process, options)
File "bin/launcher.py", line 251, in run
os.execvpe(args[0], args, env)
File "/usr/lib/python2.7/os.py", line 355, in execvpe
_execvpe(file, args, env)
File "/usr/lib/python2.7/os.py", line 382, in _execvpe
func(fullname, *argrest)
OSError: [Errno 2] No such file or directory
I am unable to understand the error message. Any help would be appreciated.
Here is the config.properties file:
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=3306
query.max-memory=2GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://localhost:3306
EDIT: After entering sudo bin/launcher start into the terminal and then sudo bin/launcher status, it says "Not running". Also, there is no web page at localhost:3306; if it had started successfully, I should get a web page there.
Since I got it fixed myself, I will answer my own question for anyone who encounters this issue in the future and comes across this question.
Where exactly the problem was: the JRE. (Thanks to kokosing for pointing out that there might be some problem with Java.)
What I did before: I downloaded jre-8u171-linux-x64.tar.gz from https://java.com/en/download/help/linux_x64_install.xml and placed it in a partition ("media") different from the one where Ubuntu is installed. I configured .bashrc myself and added the following lines:
JAVA_HOME=/media/polly/161813A518138343/Java/jdk-10.0.1
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
export JAVA_HOME
export JRE_HOME
export PATH
For changes to take place, I executed exec bash in terminal.
To check if it was running I tried java -version and it displayed the version of java running.
I tried to launch Presto, it wouldn't run.
What I did after: I removed the part that I had added to .bashrc.
I used the command sudo apt-get install default-jre. After successful installation I entered java -version and it showed me the version of Java installed and running. I tried to launch Presto and it ran successfully. I am able to see the page at localhost:3306.
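For reference, the OSError: [Errno 2] No such file or directory raised by os.execvpe in the traceback simply means the java executable could not be found on the PATH the launcher sees. A quick way to check that (a hypothetical one-liner from Python 3, not part of Presto):

import shutil

# None here means 'java' is not on PATH, which is exactly what makes os.execvpe fail with Errno 2
print(shutil.which("java"))

Also note that running the launcher with sudo can reset PATH, so a Java configured only in your own .bashrc may not be visible to it, which matches why installing default-jre into the system path fixed it.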
The commands sudo bin/launcher start and sudo bin/launcher run conflict with each other. The first starts Presto in the background, while the second starts Presto in the foreground. You cannot start two Presto processes on the same machine because they would try to allocate the same port (see http-server.http.port=3306 in your config.properties).
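If you are not sure whether an earlier start left a process holding that port, a quick check against the port from your config.properties (just a sketch, not part of the launcher) is:

import socket

# connect_ex returns 0 when something is already listening on the port
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    in_use = s.connect_ex(("localhost", 3306)) == 0
print("port 3306 already in use:", in_use)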
What did you want to achieve with sudo bin/launcher run? If you want to run a query, then please use presto-cli-*-executable.jar.
I can run this command on my instance using the web console:
gsutil rsync -d -r /my-path gs://my-bucket
But when I try it in my remote SSH terminal I get this error:
root#instance-2: gsutil rsync -d -r /my-path gs://my-bucket
Building synchronization state...
INFO 0923 12:48:48.572446 multistore_file.py] Error decoding credential, skipping
Traceback (most recent call last):
File "/usr/lib/google-cloud-sdk/platform/gsutil/third_party/oauth2client/oauth2client/multistore_file.py", line 381, in _refresh_data_cache
(key, credential) = self._decode_credential_from_json(cred_entry)
File "/usr/lib/google-cloud-sdk/platform/gsutil/third_party/oauth2client/oauth2client/multistore_file.py", line 400, in _decode_credential_from_json
credential = Credentials.new_from_json(json.dumps(cred_entry['credential']))
File "/usr/lib/google-cloud-sdk/platform/gsutil/third_party/oauth2client/oauth2client/client.py", line 292, in new_from_json
return from_json(s)
File "/usr/lib/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/credentials_lib.py", line 356, in from_json
data['token_expiry'], oauth2client.client.EXPIRY_FORMAT)
TypeError: must be string, not None
Caught non-retryable exception while listing gs://my-bucket/: Could not reach metadata service: Not Found
At source listing 10000...
At source listing 20000...
At source listing 30000...
At source listing 40000...
CommandException: Caught non-retryable exception - aborting rsync
I solved this by switching the user to the default GCE one that is created when the project is created. It seems root on the VM does not have the credentials to run gsutil commands.
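For the "Could not reach metadata service" part, a quick check against the standard GCE metadata endpoint (no extra libraries, just a diagnostic sketch) can help tell a network problem apart from a per-user credential problem:

import urllib.request

# The Metadata-Flavor header is required, otherwise the metadata server rejects the request
req = urllib.request.Request(
    "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email",
    headers={"Metadata-Flavor": "Google"},
)
print(urllib.request.urlopen(req, timeout=5).read().decode())

If this prints a service-account email, the VM can reach the metadata server, and the failure is more likely the broken cached credential that the "Error decoding credential, skipping" line points at for that particular user.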