Traceback while trying to launch Presto

I am trying to launch Presto by entering the following in the terminal:
sudo bin/launcher start
It shows me this:
Started as 16501 (This integer varies on every attempt)
Then, I tried to launch it by entering the following in terminal:
sudo bin/launcher run --verbose
The output I get is:
config_path = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/etc/config.properties
data_dir = /media/polly/161813A518138343/PrestoDB/presto-server-0.203
etc_dir = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/etc
install_path = /media/polly/161813A518138343/PrestoDB/presto-server-0.203
jvm_config = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/etc/jvm.config
launcher_config = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/bin/launcher.properties
launcher_log = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/var/log/launcher.log
log_levels = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/etc/log.properties
log_levels_set = False
node_config = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/etc/node.properties
pid_file = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/var/run/launcher.pid
properties = {}
server_log = /media/polly/161813A518138343/PrestoDB/presto-server-0.203/var/log/server.log
verbose = True
['java', '-cp', '/media/polly/161813A518138343/PrestoDB/presto-server-0.203/lib/*', '-server', '-Xmx16G', '-XX:+UseG1GC', '-XX:G1HeapRegionSize=32M', '-XX:+UseGCOverheadLimit', '-XX:+ExplicitGCInvokesConcurrent', '-XX:+HeapDumpOnOutOfMemoryError', '-XX:+ExitOnOutOfMemoryError', '-Dconfig=/media/polly/161813A518138343/PrestoDB/presto-server-0.203/etc/config.properties', 'com.facebook.presto.server.PrestoServer']
Traceback (most recent call last):
File "bin/launcher.py", line 445, in main
handle_command(command, o)
File "bin/launcher.py", line 329, in handle_command
run(process, options)
File "bin/launcher.py", line 251, in run
os.execvpe(args[0], args, env)
File "/usr/lib/python2.7/os.py", line 355, in execvpe
_execvpe(file, args, env)
File "/usr/lib/python2.7/os.py", line 382, in _execvpe
func(fullname, *argrest)
OSError: [Errno 2] No such file or directory
I am unable to understand the error message. Any help would be appreciated.
Here is the config.properties file:
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=3306
query.max-memory=2GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://localhost:3306
EDIT: After entering sudo bin/launcher start into the terminal and then sudo bin/launcher status, it says "Not running". Also, there is no web page at localhost:3306; if it had started successfully, I should get a web page.

Since I got it fixed myself, I will answer my own question for anyone who encounters this issue in the future and comes across this question.
Where exactly the problem was: the JRE. (Thanks to kokosing for pointing out that there might be some problem with Java.)
What I did before: I downloaded jre-8u171-linux-x64.tar.gz from https://java.com/en/download/help/linux_x64_install.xml and placed it in a partition or "media" different from the one where Ubuntu is installed. I configured the .bashrc myself and added the following lines:
JAVA_HOME=/media/polly/161813A518138343/Java/jdk-10.0.1
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
export JAVA_HOME
export JRE_HOME
export PATH
For the changes to take effect, I executed exec bash in the terminal.
To check whether it was working, I ran java -version and it displayed the version of Java running.
I tried to launch Presto; it wouldn't run.
What I did after: I removed the part that I had added to .bashrc.
I used the command sudo apt-get install default-jre. After successful installation I entered java -version and it showed me the version of Java installed and running. I tried to launch Presto and it ran successfully. I am able to see the page at localhost:3306.
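If anyone wants to verify the same failure mode before reinstalling anything: the traceback ends in os.execvpe(args[0], args, env) with args[0] == 'java', so the OSError with Errno 2 just means the launcher could not find a java executable on the PATH it runs with (note that sudo does not read your .bashrc). A minimal check, using only the standard library (a sketch, not part of Presto's launcher):
import os
from distutils.spawn import find_executable  # available on the Python 2.7 that runs launcher.py

java_path = find_executable('java')
if java_path is None:
    print('java not found on PATH: %s' % os.environ.get('PATH', ''))
else:
    print('the launcher would exec: %s' % java_path)
Run it the same way the launcher is run (e.g. under sudo) so that the same PATH applies.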

The commands sudo bin/launcher start and sudo bin/launcher run conflict with each other. The first starts Presto in the background while the second starts Presto in the foreground. You cannot start two Presto processes on the same machine because they try to allocate the same port (see http-server.http.port=3306 in your config.properties).
What did you want to achieve with sudo bin/launcher run? If you want to run a query then please use presto-cli-*-executable.jar.
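For example, a typical CLI invocation against this server could look like the following (a sketch: the jar version matches the server version from the question, the port comes from config.properties, and the query is just a placeholder):
chmod +x presto-cli-0.203-executable.jar
./presto-cli-0.203-executable.jar --server localhost:3306 --execute "SELECT 1"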

Related

MLflow saves models to relative place instead of tracking_uri

Sorry if my question is too basic, but I cannot solve it.
I am currently experimenting with MLflow and facing the following issue:
Even though I have set the tracking_uri, the MLflow artifacts are saved to the ./mlruns/... folder relative to the path from where I run mlflow run path/to/train.py (on the command line). The MLflow server then searches for the artifacts following the tracking_uri (mlflow server --default-artifact-root here/comes/the/same/tracking_uri).
Through the following example it will be clear what I mean:
I set the following in the training script before the with mlflow.start_run() as run:
mlflow.set_tracking_uri("file:///home/#myUser/#SomeFolders/mlflow_artifact_store/mlruns/")
My expectation would be that MLflow saves all the artifacts to the place I gave in the tracking URI. Instead, it saves the artifacts relative to the place from where I run mlflow run path/to/train.py, i.e. running the following
/home/#myUser/ mlflow run path/to/train.py
creates the structure:
/home/#myUser/mlruns/#experimentID/#runID/artifacts
/home/#myUser/mlruns/#experimentID/#runID/metrics
/home/#myUser/mlruns/#experimentID/#runID/params
/home/#myUser/mlruns/#experimentID/#runID/tags
and therefore it doesn't find the run artifacts in the tracking_uri, giving the error message:
Traceback (most recent call last):
File "train.py", line 59, in <module>
with mlflow.start_run() as run:
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/tracking/fluent.py", line 204, in start_run
active_run_obj = client.get_run(existing_run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/tracking/client.py", line 151, in get_run
return self._tracking_client.get_run(run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/tracking/_tracking_service/client.py", line 57, in get_run
return self.store.get_run(run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/store/tracking/file_store.py", line 524, in get_run
run_info = self._get_run_info(run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/store/tracking/file_store.py", line 544, in _get_run_info
"Run '%s' not found" % run_uuid, databricks_pb2.RESOURCE_DOES_NOT_EXIST
mlflow.exceptions.MlflowException: Run '788563758ece40f283bfbf8ba80ceca8' not found
2021/07/23 16:54:16 ERROR mlflow.cli: === Run (ID '788563758ece40f283bfbf8ba80ceca8') failed ===
Why is that so? How can I change the place where the artifacts are stored and where this directory structure is created? I have tried mlflow run --storage-dir here/comes/the/path, as well as setting the tracking_uri and the registry_uri. If I run mlflow run path/to/train.py from /home/path/to/tracking/uri it works, but I need to run the scripts remotely.
My end goal is to change the artifact URI to an NFS drive, but I cannot even get it to work on my local computer.
Thanks for reading it, even more thanks if you suggest a solution! :)
Have a great day!
This issue was solved by the following:
I had mixed up the tracking_uri with the backend_store_uri.
The tracking_uri is where the MLflow-related data (e.g. tags, parameters, metrics) is saved, which can be a database. The artifact_location, on the other hand, is where the artifacts are saved (other data, not MLflow-related, produced by the preprocessing/training/evaluation scripts).
What led me astray is that when running mlflow server from the command line, the tracking_uri is what goes into --backend-store-uri (and is also what you set in the script via mlflow.set_tracking_uri()), while the location of the artifacts goes into --default-artifact-root. Somehow I didn't get that tracking_uri = backend_store_uri.
Here's my solution
Launch the server
mlflow server -h 0.0.0.0 -p 5000 --backend-store-uri postgresql://DB_USER:DB_PASSWORD@DB_ENDPOINT:5432/DB_NAME --default-artifact-root s3://S3_BUCKET_NAME
Set the tracking URI to an HTTP URI like
mlflow.set_tracking_uri("http://my-tracking-server:5000/")
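Putting it together on the client side, a minimal training-script sketch under these assumptions (the experiment name and the model.pkl artifact are placeholders, not from the original setup):
import mlflow

mlflow.set_tracking_uri("http://my-tracking-server:5000/")  # the server started above
mlflow.set_experiment("my-experiment")                      # placeholder experiment name

with mlflow.start_run():
    mlflow.log_param("lr", 0.01)         # stored via the backend store (the Postgres database)
    mlflow.log_metric("loss", 0.42)      # stored via the backend store as well
    mlflow.log_artifact("model.pkl")     # uploaded to the --default-artifact-root (the S3 bucket)
With an HTTP tracking URI the client no longer writes to a local ./mlruns folder; the artifact location is taken from the server's --default-artifact-root (the client still needs credentials to reach it, e.g. S3 credentials).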

Strange Behavior with clamd scan function

I have a simple Python 3 script running on Ubuntu Server 20.04 that uses the clamd library (talking to the clamav-daemon process) to scan a file. The ping() and version() functions all work correctly. However, when I actually do a test write and scan, I get the following error:
{'/filedrop/test.doc': ('ERROR', "Can't open file or directory")}
This is the code that I used for the test write and scan; it is all standard sample code from the clamd website:
open('/filedrop/test.doc','wb').write(clamd.EICAR)
print(cd.scan('/filedrop/test.doc'))
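(For context, the full sample boils down to something like the following sketch; the Unix-socket path is an assumption based on Ubuntu's default clamav-daemon configuration and is not from the original post.)
import clamd

cd = clamd.ClamdUnixSocket("/var/run/clamav/clamd.ctl")  # or clamd.ClamdNetworkSocket() for a TCP socket
print(cd.ping())      # 'PONG' if the daemon is reachable
print(cd.version())

open('/filedrop/test.doc', 'wb').write(clamd.EICAR)  # clamd.EICAR is the test signature as bytes
print(cd.scan('/filedrop/test.doc'))                 # clamd itself must be able to open this path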
After the code runs, I get the following string in the test file, which indicates that the Python 3 script was able to write to the file successfully, yet I keep getting the error that the file can't be opened when I use the clamd scan function.
This is the string that was written to the file:
X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*
I am also able to run clamscan from the command line on the folder and it successfully scans the files as well.
I'm running as the root user while the service runs as clamav:clamav.
I did give read/write permission on the folder and the files to "other users", which is also indicated by the fact that the Python script could write the file.
I believe the cause of the problem here is that AppArmor is blocking clamd for that particular directory. I would look at the AppArmor profile for clamd; it should be called something like /etc/apparmor.d/clamav or similar. You can adjust that profile (an example adjustment is sketched below) or alternatively disable it (according to Ubuntu):
sudo ln -s /etc/apparmor.d/profile.name /etc/apparmor.d/disable/
sudo apparmor_parser -R /etc/apparmor.d/profile.name
More complete instructions available here:
https://help.ubuntu.com/community/AppArmor
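If you go the adjust-the-profile route, a hypothetical addition inside the clamd profile, granting read access to the drop directory from the question, could look like this (the exact profile file name varies by system):
  /filedrop/ r,
  /filedrop/** r,
After editing, reload the profile with sudo apparmor_parser -r /etc/apparmor.d/profile.name.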
You can also disable AppArmor for the purposes of testing (I don't like to advise anyone to remove security features permanently), with:
sudo systemctl stop apparmor
sudo systemctl disable apparmor

Docker firewall issue with cBioportal

We are sitting behind a firewall and trying to run a Docker image (cBioPortal). Docker itself could be installed with a proxy, but now we encounter the following issue:
Starting validation...
INFO: -: Unable to read xml containing cBioPortal version.
DEBUG: -: Requesting cancertypes from portal at 'http://cbioportal-container:8081'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Error occurred during validation step:
Traceback (most recent call last):
File "/cbioportal/core/src/main/scripts/importer/validateData.py", line 4491, in request_from_portal_api
response.raise_for_status()
File "/usr/local/lib/python3.5/dist-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 504 Server Error: Gateway Timeout for url: http://cbioportal-container:8081/api-legacy/cancertypes
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/metaImport.py", line 127, in <module>
exitcode = validateData.main_validate(args)
File "/cbioportal/core/src/main/scripts/importer/validateData.py", line 4969, in main_validate
portal_instance = load_portal_info(server_url, logger)
File "/cbioportal/core/src/main/scripts/importer/validateData.py", line 4622, in load_portal_info
parsed_json = request_from_portal_api(path, api_name, logger)
File "/cbioportal/core/src/main/scripts/importer/validateData.py", line 4495, in request_from_portal_api
) from e
ConnectionError: Failed to fetch metadata from the portal at [http://cbioportal-container:8081/api-legacy/cancertypes]
Now we know that it is a firewall issue, because it works when we install it outside the firewall. But we do not know yet how to change the firewall. Our idea was to look up the files and lines which throw the errors, but we do not know how to look into those files since they are inside the Docker container.
So we can not just do something like
vim /cbioportal/core/src/main/scripts/importer/validateData.py
...because there is nothing there. Of course we know this file is inside the Docker image, but as I said, we don't know how to look into it. At the moment we do not know how to solve this riddle - any help is appreciated.
Maybe you still might need this.
You can access this Python file within the container by using docker-compose exec cbioportal sh or docker-compose exec cbioportal bash.
Then you can use cd, cat, vi, vim or similar to access the path given in your post.
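If you only want to read the file rather than edit it in place, another option (a sketch; the container name is taken from the error output above and may differ in your compose setup) is to copy it out to the host first:
docker cp cbioportal-container:/cbioportal/core/src/main/scripts/importer/validateData.py .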
I'm not sure which command you're actually running but when I did the import call like
docker-compose run --rm cbioportal metaImport.py -u http://cbioportal:8080 -s study/lgg_ucsf_2014/lgg_ucsf_2014/ -o
I had to replace the http://cbioportal:8080 with the server's IP address.
Also notice that the studies path is one level deeper than in the official documentation.
In cBioPortal behind a proxy, the study import is only available in offline mode:
First you need to get inside the container:
docker exec -it cbioportal-container bash
Then generate the portal info folder:
cd $PORTAL_HOME/core/src/main/scripts
./dumpPortalInfo.pl $PORTAL_HOME/my_portal_info_folder
Then import the study offline. The -o flag is important to overwrite despite warnings:
cd $PORTAL_HOME/core/src/main/scripts
./importer/metaImport.py -p $PORTAL_HOME/my_portal_info_folder -s /study/lgg_ucsf_2014 -v -o
Hope this helps.

sqlite3 module cannot open database when started from ubuntu upstart

Working on Ubuntu Server 14.04.
I have an upstart .conf file in /etc/init for starting my Node server. I am using forever. Here is what my script looks like:
start on filesystem or runlevel [2345]
expect fork
setuid myUserId
env HOME=/home/myUserId/
env NODE_BIN_DIR=/usr/bin
env NODE_PATH=/usr/lib/nodejs:/usr/lib/node_modules:/usr/share/javascript
script
PATH=$NODE_BIN_DIR:$NODE_PATH:$PATH
echo $PATH
exec forever start -o /home/myUserId/nodeServ/lServer/logs/out.log /home/myUserId/nodeServ/lServer/server.js 1337
end script
But I keep getting this error
Error: SQLITE_CANTOPEN: unable to open database file
error: Forever detected script exited with code: 8
If I run the script from the command line exactly as it is in the conf file, it works just fine. No problems. So I think it is a permissions issue. I have set read/write/execute permissions on the database directory and the database, and still I am unable to read from the file.
I have tried so many different things and I cannot figure out why this is happening.
UPDATE: This problem appears not to be isolated to upstart. I tried starting forever in a shell script as well and I get the same errors.
I resolved my issue via a workaround: not using forever and starting node directly from the upstart file (allowing respawn). No issues. This appears to be either a sqlite3 issue or a forever issue.
Run the command sudo chown www-data. . in the db file's directory (the trailing dot after www-data makes www-data the group as well as the owner).
Another solution is to check whether the file exists or not, e.g.:
var fs = require("fs");
var exists = fs.existsSync(dbfilepath);
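For example (a sketch; the absolute path below is hypothetical), using an absolute path also avoids any ambiguity about which working directory the upstart job runs with:
var fs = require("fs");
var dbfilepath = "/home/myUserId/nodeServ/lServer/data/mydb.sqlite"; // hypothetical absolute path to the db file
console.log(fs.existsSync(dbfilepath) ? "db file found" : "db file missing");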

Why is my startup script not running

Per various tutorials I've done the following:
created a file called ftpserver.py in /home/root/
created a file in /etc/init.d/ called ftpserver that looks like this:
#!/bin/sh
python /home/root/ftpserver.py
Upon creation, I ran the following (to make it executable, apparently):
root@beaglebone1:/etc/init.d# chmod +x ftpserver
But it doesn't appear to be running on startup. However if I run the following command:
root@beaglebone1:/etc/init.d# /etc/init.d/ftpserver
Then the script runs, executing ftpserver.py.
Interestingly, if I try to run ftpserver from within its directory in the following manner (not sure if this is relevant):
root@beaglebone1:/etc/init.d# ftpserver
It returns:
-sh: ftpserver: command not found
So I'm not certain why my script isn't running on startup.
For reference, ftpserver.py looks like this:
from pyftpdlib import ftpserver
authorizer = ftpserver.DummyAuthorizer()
authorizer.add_user("root", "12345", "/home/root", perm="elradfmw")
handler = ftpserver.FTPHandler
handler.authorizer = authorizer
address = ("", 21)
ftpd = ftpserver.FTPServer(address, handler)
ftpd.serve_forever()
Try running it with ./ftpserver
Also, check whether your script is configured to run in the current runlevel - probably /etc/rc.conf and the DAEMONS entry there, or something like that.
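On a Debian-style image (which the /etc/init.d layout suggests), registering the script for the default runlevels is usually the equivalent step; a hedged example, assuming update-rc.d is available and the script carries the usual LSB header:
update-rc.d ftpserver defaults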
