Jenkins transferring 0 files via Publish over SSH - Linux

I have read these 4 posts already:
Jenkins transferring 0 files using publish over SSH plugin
transferring 0 files using publish over SSH plugin in Jenkins
Jenkins, SSH plugin, 0 files transferred
Jenkins 0 files published after build
Our issue seems to have the most in common with the first post in that list.
We are transitioning our software builds (and the packages we need) from Windows to Linux. Setting up the Linux build worked; however, the resulting archive is not transferred to our package server. Relevant console output:
SSH: Connecting from host [intern2]
SSH: Connecting with configuration [intern2] ...
SSH: EXEC: STDOUT/STDERR from command [conda index /srv/pkgsrv/conda-repo/linux-64/] ...
updating index in: /srv/pkgsrv/conda-repo/linux-64
SSH: EXEC: completed after 1,001 ms
SSH: Disconnecting configuration [intern2] ...
SSH: Transferred 0 file(s)
Finished: SUCCESS
The build config is:
Source files: conda-bld/linux-64/*.tar.bz2
Remove prefix: conda-bld/linux-64
Remote directory: conda-repo/linux-64/
Execute command: conda index /srv/pkgsrv/conda-repo/linux-64/
The remote directory already exists and Jenkins has the rights to write there. The same server configuration (apart from the subdirectories) is used for the Windows builds, and their files are transferred correctly.
The Jenkins configuration says:
HOME /var/lib/jenkins
JENKINS_HOME /var/lib/jenkins
PWD /var/lib/jenkins
The directory we are building into is $HOME/conda-bld/linux-64. In there I can see the built .tar.bz2 files from a few successful builds that have already accumulated:
jenkins@intern2:~/conda-bld/linux-64$ ls
fonts-1-1.tar.bz2 qjsonrpc-dev-1.0-12.tar.bz2 qjsonrpc-dev-1.0-6.tar.bz2 qjsonrpc-dev-1.0-9.tar.bz2
<otherproject>-0.1-19_g6fe33e2.tar.bz2 qjsonrpc-dev-1.0-13.tar.bz2 qjsonrpc-dev-1.0-7.tar.bz2 repodata.json
qjsonrpc-dev-1.0-10.tar.bz2 qjsonrpc-dev-1.0-14.tar.bz2 qjsonrpc-dev-1.0-8.tar.bz2 repodata.json.bz2
Why isn't Jenkins giving some kind of error if it doesn't copy anything? Is something wrong with how I specified the folders? I can't figure out what. Where can I look for errors?
Edit: I looked at the Jenkins log and found
Dec 22, 2016 8:39:41 AM org.kohsuke.stapler.RequestImpl$TypePair convertJSON
WARNING: 'stapler-class' is deprecated: hudson.plugins.git.extensions.impl.RelativeTargetDirectory
Dec 22, 2016 8:39:41 AM org.kohsuke.stapler.RequestImpl$TypePair convertJSON
WARNING: 'stapler-class' is deprecated: hudson.tasks.Shell
Dec 22, 2016 8:39:41 AM org.kohsuke.stapler.RequestImpl$TypePair convertJSON
WARNING: 'stapler-class' is deprecated: jenkins.plugins.publish_over_ssh.BapSshPublisherPlugin
Dec 22, 2016 8:40:15 AM hudson.model.Run execute
INFO: qjsonrpc-linux #15 main build action completed: SUCCESS
I'll try updating the Publish over SSH plugin and see if that helps.

The issue was that the files were built into a different folder than the Jenkins working directory because I forgot to set CONDA_BLD_PATH. However, when we set CONDA_BLD_PATH, we ran into strange errors while building the package:
Making absolute symlink root/lib64/libqjsonrpc.so.1.0 -> libqjsonrpc.so.1.0.99 relative
Traceback (most recent call last):
File "/usr/local/lib/miniconda/bin/conda-build", line 6, in <module>
sys.exit(conda_build.cli.main_build.main())
File "/usr/local/lib/miniconda/lib/python2.7/site-packages/conda_build/cli/main_build.py", line 242, in main
execute(sys.argv[1:])
File "/usr/local/lib/miniconda/lib/python2.7/site-packages/conda_build/cli/main_build.py", line 234, in execute
already_built=None, config=config)
File "/usr/local/lib/miniconda/lib/python2.7/site-packages/conda_build/api.py", line 77, in build
need_source_download=need_source_download, config=config)
File "/usr/local/lib/miniconda/lib/python2.7/site-packages/conda_build/build.py", line 1099, in build_tree
config=recipe_config)
File "/usr/local/lib/miniconda/lib/python2.7/site-packages/conda_build/build.py", line 799, in build
create_info_files(m, pkg_files, config=config, prefix=config.build_prefix)
File "/usr/local/lib/miniconda/lib/python2.7/site-packages/conda_build/build.py", line 399, in create_info_files
write_about_json(m, config)
File "/usr/local/lib/miniconda/lib/python2.7/site-packages/conda_build/build.py", line 305, in write_about_json
conda_info = subprocess.check_output([bin_path, 'info', '--json', '-s'])
File "/usr/local/lib/miniconda/lib/python2.7/subprocess.py", line 574, in check_output
raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['/usr/local/lib/miniconda/bin/conda', 'info', '--json', '-s']' returned non-zero exit status 1
Build step 'Execute shell' marked build as failure
We ultimately went with building into the wrong directory and then moving the files before they are published in the post-build step. The build step runs
mkdir -p conda-bld/linux-64
conda build src
followed by
mv /var/lib/jenkins/conda-bld/linux-64/qjsonrpc*.tar.bz2 conda-bld/linux-64
I am not sure what is wrong with the build when the conda path is set, but this works for now.
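For reference, a minimal sketch of what setting CONDA_BLD_PATH in the Execute shell step could look like; the exact value and the recipe path src are assumptions, $WORKSPACE is set by Jenkins, and as described above this route led to the conda-build traceback for us:
# sketch: build straight into the Jenkins workspace instead of $HOME/conda-bld
export CONDA_BLD_PATH="$WORKSPACE/conda-bld"
mkdir -p "$CONDA_BLD_PATH"
conda build src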

Related

MLflow saves models to relative place instead of tracking_uri

Sorry if my question is too basic, but I cannot solve this.
I am currently experimenting with MLflow and am facing the following issue:
Even though I have set the tracking_uri, the MLflow artifacts are saved to the ./mlruns/... folder relative to the directory from which I run mlflow run path/to/train.py (on the command line). The MLflow server, however, looks for the artifacts under the tracking_uri (mlflow server --default-artifact-root here/comes/the/same/tracking_uri).
The following example should make clear what I mean:
I set the following in the training script before the with mlflow.start_run() as run:
mlflow.set_tracking_uri("file:///home/#myUser/#SomeFolders/mlflow_artifact_store/mlruns/")
My expectation was that MLflow would save all the artifacts to the location I set in the tracking URI. Instead, it saves the artifacts relative to the directory from which I run mlflow run path/to/train.py, i.e. running the following
/home/#myUser/ mlflow run path/to/train.py
creates the structure:
/home/#myUser/mlruns/#experimentID/#runID/artifacts
/home/#myUser/mlruns/#experimentID/#runID/metrics
/home/#myUser/mlruns/#experimentID/#runID/params
/home/#myUser/mlruns/#experimentID/#runID/tags
and therefore it doesn't find the run artifacts in the tracking_uri, giving the error message:
Traceback (most recent call last):
File "train.py", line 59, in <module>
with mlflow.start_run() as run:
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/tracking/fluent.py", line 204, in start_run
active_run_obj = client.get_run(existing_run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/tracking/client.py", line 151, in get_run
return self._tracking_client.get_run(run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/tracking/_tracking_service/client.py", line 57, in get_run
return self.store.get_run(run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/store/tracking/file_store.py", line 524, in get_run
run_info = self._get_run_info(run_id)
File "/home/#myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/store/tracking/file_store.py", line 544, in _get_run_info
"Run '%s' not found" % run_uuid, databricks_pb2.RESOURCE_DOES_NOT_EXIST
mlflow.exceptions.MlflowException: Run '788563758ece40f283bfbf8ba80ceca8' not found
2021/07/23 16:54:16 ERROR mlflow.cli: === Run (ID '788563758ece40f283bfbf8ba80ceca8') failed ===
Why is that so? How can I change the place where the artifacts are stored and this directory structure is created? I have tried mlflow run --storage-dir here/comes/the/path, as well as setting the tracking_uri and the registry_uri. If I run mlflow run path/to/train.py from /home/path/to/tracking/uri it works, but I need to run the scripts remotely.
My end goal is to point the artifact URI at an NFS drive, but even on my local computer I cannot make this work.
Thanks for reading it, even more thanks if you suggest a solution! :)
Have a great day!
This issue was solved as follows:
I had mixed up the tracking_uri with the backend_store_uri.
The tracking_uri is where the MLflow-related data (e.g. tags, parameters, metrics) is saved, which can be a database. The artifact_location, on the other hand, is where the artifacts are stored, i.e. the other, non-MLflow data produced by the preprocessing/training/evaluation scripts.
What led me astray is that when running mlflow server from the command line, --backend-store-uri should be set to the tracking_uri (the same one set in the script via mlflow.set_tracking_uri()) and --default-artifact-root to the location of the artifacts. Somehow I didn't get that tracking_uri = backend_store_uri.
Here's my solution
Launch the server
mlflow server -h 0.0.0.0 -p 5000 --backend-store-uri postgresql://DB_USER:DB_PASSWORD@DB_ENDPOINT:5432/DB_NAME --default-artifact-root s3://S3_BUCKET_NAME
Set the tracking URI to an HTTP URI like
mlflow.set_tracking_uri("http://my-tracking-server:5000/")

Docker firewall issue with cBioportal

We are sitting behind a firewall and are trying to run a Docker image (cBioPortal). Docker itself could be installed through a proxy, but now we encounter the following issue:
Starting validation...
INFO: -: Unable to read xml containing cBioPortal version.
DEBUG: -: Requesting cancertypes from portal at 'http://cbioportal-container:8081'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Error occurred during validation step:
Traceback (most recent call last):
File "/cbioportal/core/src/main/scripts/importer/validateData.py", line 4491, in request_from_portal_api
response.raise_for_status()
File "/usr/local/lib/python3.5/dist-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 504 Server Error: Gateway Timeout for url: http://cbioportal-container:8081/api-legacy/cancertypes
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/metaImport.py", line 127, in <module>
exitcode = validateData.main_validate(args)
File "/cbioportal/core/src/main/scripts/importer/validateData.py", line 4969, in main_validate
portal_instance = load_portal_info(server_url, logger)
File "/cbioportal/core/src/main/scripts/importer/validateData.py", line 4622, in load_portal_info
parsed_json = request_from_portal_api(path, api_name, logger)
File "/cbioportal/core/src/main/scripts/importer/validateData.py", line 4495, in request_from_portal_api
) from e
ConnectionError: Failed to fetch metadata from the portal at [http://cbioportal-container:8081/api-legacy/cancertypes]
We know that it is a firewall issue, because it works when we install it outside the firewall, but we do not know how to change the firewall yet. Our idea was to look up the files and lines that throw the errors, but we do not know how to look into those files since they are inside the Docker container.
So we cannot just do something like
vim /cbioportal/core/src/main/scripts/importer/validateData.py
...because there is nothing there. Of course we know this file is within the Docker image, but as I said, we don't know how to look into it. At the moment we do not know how to solve this riddle - any help is appreciated.
Maybe you still need this.
You can access this Python file within the container by using docker-compose exec cbioportal sh or docker-compose exec cbioportal bash.
Then you can use cd, cat, vi, vim, or similar tools to access the path given in your post.
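If you only want to read the file rather than edit it in place, a hedged alternative is to copy it out with docker cp; the container name cbioportal-container is taken from the log output above and may differ in your setup:
# copy the validator script out of the running container to inspect it locally
docker cp cbioportal-container:/cbioportal/core/src/main/scripts/importer/validateData.py .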
I'm not sure which command you're actually running, but when I did the import call like
docker-compose run --rm cbioportal metaImport.py -u http://cbioportal:8080 -s study/lgg_ucsf_2014/lgg_ucsf_2014/ -o
I had to replace http://cbioportal:8080 with the server's IP address.
Also note that the study path is one level deeper than in the official documentation.
When cBioPortal sits behind a proxy, the study import is only available in offline mode:
First, get inside the container
docker exec -it cbioportal-container bash
Then generate the portal info folder
cd $PORTAL_HOME/core/src/main/scripts
./dumpPortalInfo.pl $PORTAL_HOME/my_portal_info_folder
Then import the study offline. The -o flag is important to overwrite despite warnings.
cd $PORTAL_HOME/core/src/main/scripts
./importer/metaImport.py -p $PORTAL_HOME/my_portal_info_folder -s /study/lgg_ucsf_2014 -v -o
Hope this helps.

WARNING: Can't exec "git": No such file or directory at /home/git/bin/lib/Gitolite/Common.pm line 152

git clone git@127.0.0.1:gitolite-admin.git
Cloning into 'gitolite-admin'...
WARNING: Can't exec "git": No such file or directory at /home/git/bin/lib/Gitolite/Common.pm line 152, <DATA> line 1.
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
This comes from sitaramc/gitolite/src/lib/Gitolite/Common.pm, and can be seen for instance in this issue.
You need to check the PATH and make sure the Git installation folder is in it: /path/to/Git/bin.
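A minimal sketch of what to check on the server, assuming Git lives under /path/to/Git (adjust for your installation); note that for SSH-invoked commands the PATH typically has to be set in a file that non-interactive shells read, e.g. the git user's ~/.bashrc:
# verify that the git user can find the git binary
which git || echo "git is not on PATH"
# add the Git installation folder to PATH
export PATH="$PATH:/path/to/Git/bin"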

Why does a command on GCE work in the console terminal but not as a cron job

I can run this command on my instance using the web console:
gsutil rsync -d -r /my-path gs://my-bucket
But when I try it in my remote SSH terminal I get this error:
root@instance-2: gsutil rsync -d -r /my-path gs://my-bucket
Building synchronization state...
INFO 0923 12:48:48.572446 multistore_file.py] Error decoding credential, skipping
Traceback (most recent call last):
File "/usr/lib/google-cloud-sdk/platform/gsutil/third_party/oauth2client/oauth2client/multistore_file.py", line 381, in _refresh_data_cache
(key, credential) = self._decode_credential_from_json(cred_entry)
File "/usr/lib/google-cloud-sdk/platform/gsutil/third_party/oauth2client/oauth2client/multistore_file.py", line 400, in _decode_credential_from_json
credential = Credentials.new_from_json(json.dumps(cred_entry['credential']))
File "/usr/lib/google-cloud-sdk/platform/gsutil/third_party/oauth2client/oauth2client/client.py", line 292, in new_from_json
return from_json(s)
File "/usr/lib/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/credentials_lib.py", line 356, in from_json
data['token_expiry'], oauth2client.client.EXPIRY_FORMAT)
TypeError: must be string, not None
Caught non-retryable exception while listing gs://my-bucket/: Could not reach metadata service: Not Found
At source listing 10000...
At source listing 20000...
At source listing 30000...
At source listing 40000...
CommandException: Caught non-retryable exception - aborting rsync
I solved this by switching to the default GCE user that is created along with the project. It seems that root on the VM does not have working credentials to run gsutil commands.
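One way to carry this over to the cron job is to put the entry in the default user's crontab (crontab -e as that user) rather than root's. A hedged sketch; the schedule, the gsutil path (check it with which gsutil), and the log file are placeholders:
# run the sync every hour and keep the output for debugging
0 * * * * /usr/bin/gsutil rsync -d -r /my-path gs://my-bucket >> /tmp/gsutil-rsync.log 2>&1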

Switching the update channel on Firefox Flame fails

I tried to follow the steps to change the update channel described here: Switch to nightly update channel. But the phone won't reboot after executing change_channel.sh, because the script fails with:
$ ./change_channel.sh -v aurora
adbd is already running as root
remount succeeded
cannot stat '/tmp/channel-prefs/updates.js': No such file or directory
Currently I have B2G 21.0.0.0-prerelease installed from here.
If you open the script and read line 57, there is
cat >$TMP_DIR/updates.js <<UPDATES
If it fails to create the file in that directory, the script won't be able to push it when doing adb push:
$ADB push $TMP_DIR/updates.js $B2G_PREF_DIR/updates.js
So check your permissions, or change the temp directory so that the script can create the updates.js file:
TMP_DIR=/tmp/channel-prefs
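A minimal workaround sketch, assuming the failure is simply that /tmp/channel-prefs does not exist or is not writable: create the directory yourself and retry.
# create the temp directory the script expects, then rerun it
mkdir -p /tmp/channel-prefs
./change_channel.sh -v aurora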
