I'm using Gitlab self server community version ci/cd function, It has been running well, But suddenly one day,All the projects in the gitlab/cicd has failed, it mentions below errors:
Uploading artifacts for successful job
Uploading artifacts...
promotion-api/my-boot-module-system/target/*.jar: found 1 matching files and directories
WARNING: Uploading artifacts as "archive" to coordinator... failed id=1515 responseStatus=500 Internal Server Error status=500 token=xrDFnLeB
WARNING: Retrying... context=artifacts-uploader error=invalid argument
WARNING: Uploading artifacts as "archive" to coordinator... failed id=1515 responseStatus=500 Internal Server Error status=500 token=xrDFnLeB
WARNING: Retrying... context=artifacts-uploader error=invalid argument
WARNING: Uploading artifacts as "archive" to coordinator... failed id=1515 responseStatus=500 Internal Server Error status=500 token=xrDFnLeB
FATAL: invalid argument
ERROR: Job failed: exit code 1
Below is the code in .gitlab-ci.yml
deploy-java:
stage: deploy
dependencies:
- build-java
image:
name: docker/compose:latest
before_script:
- docker info
- docker-compose -v
script:
- cd promotion-api
- docker-compose build
- docker images
- docker ps -a
- docker-compose up -d
tags:
- promotion
2021-3-29:
I switched the runner from linux version to docker, seems all fine till now
You ran into a bug in GitLab itself. Instead of a rather obscure HTTP status 500 it should say what the issue is explicitly. It is likely not a problem in your repository or CI settings.
Follow these issues to find out more:
Artifact is stopped in the transfer, possibly because it takes longer than some timeout to upload: https://gitlab.com/gitlab-org/gitlab-runner/-/issues/26869
Artifact is larger than 1 GB: https://gitlab.com/gitlab-org/gitlab/-/issues/267111
For me the problem was that gitlab had run out of disk-space.
After having deleted unnecessary artifacts and files everything was working correctly again.
You can check your system status (if you host your own gitlab) here: www.gitlab-url.com/admin/system_info
(Go to: Admin Area > Monitoring > System Info)
Hope it helps!
Using Terraform v0.12.5 on windows 10 behind a corporate proxy
When I run
terraform init
Then I got
Initializing the backend...
Initializing provider plugins...
- Checking for available provider plugins...
Error verifying checksum for provider "helm"
The checksum for provider distribution from the Terraform Registry
did not match the source. This may mean that the distributed files
were changed after this version was released to the Registry.
...
Error: unable to verify checksum
For my teamates, it works fine (same terraform version but in a linux environment and using a VPN. No proxy hell for them).
I set the proxy as env variabes (HTTP_PROXY and HTTPS_PROXY).
If I don't set them, I've got
Error: error validating provider credentials: error calling sts:GetCallerIdentity: RequestError: send request failed
caused by: Post https://sts.amazonaws.com/: dial tcp --.--.--.--:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
instead.
I think this is a network problem but the error above is not verbose at all.
terraform init -reconfigure should work. It will re-download all the connectors and reset bad backend names, too.
We are trying to scan our docker images using Anchore Engine Jenkins plugin.
Currently we create our application docker images, push it in our own private local registry and then deploy it in our test environments.
Now, we want to setup docker image scanning in our CI/CD process to check for any vulnerabilities.
We have installed Anchore Engine using the recommended Docker-Compose yaml method given in the Documentation link:
https://anchore.freshdesk.com/support/solutions/articles/36000020729-install-on-docker-swarm
Post installation, we installed the
Anchore Container Image Scanner Plugin in Jenkins.
We configured the plugin as mentioned in the document link:
https://wiki.jenkins.io/display/JENKINS/Anchore+Container+Image+Scanner+Plugin
However, the scanning fails. Error Message as follows:
2018-10-11T07:01:44.647 INFO AnchoreWorker Analysis request accepted, received image digest sha256:7d6fb7e5e7a74a4309cc436f6d11c29a96cbf27a4a8cb45a50cb0a326dc32fe8
2018-10-11T07:01:44.647 INFO AnchoreWorker Waiting for analysis of 10.180.25.2:5000/hello-world:latest, polling status periodically
2018-10-11T07:01:44.647 DEBUG AnchoreWorker anchore-engine get policy evaluation URL: http://10.180.25.2:8228/v1/images/sha256:7d6fb7e5e7a74a4309cc436f6d11c29a96cbf27a4a8cb45a50cb0a326dc32fe8/check?tag=10.180.25.2:5000/hello-world:latest&detail=true
2018-10-11T07:01:44.648 DEBUG AnchoreWorker Attempting anchore-engine get policy evaluation (1/300)
2018-10-11T07:01:44.675 DEBUG AnchoreWorker anchore-engine get policy evaluation failed. URL: http://10.180.25.2:8228/v1/images/sha256:7d6fb7e5e7a74a4309cc436f6d11c29a96cbf27a4a8cb45a50cb0a326dc32fe8/check?tag=10.180.25.2:5000/hello-world:latest&detail=true, status: HTTP/1.1 404 NOT FOUND, error: {
"detail": {},
"httpcode": 404,
"message": "image is not analyzed - analysis_status: not_analyzed"
}
NOTE:
In Image TAG 10.180.25.2:5000/hello-world:latest, 10.180.25.2:5000 is our local private registry and hello-world:latest is latest hello-world image available in docker hub which we pulled and pushed in our registry to try out image scanning using Anchore-Engine.
Unfortunately we are not able to find much resource online to try and resolve the above mentioned issue.
Anyone who might have worked on Anchore-Engine, please may I request to have a look and help us resolve this issue.
Also, any suggestions or alternatives to anchore-engine or detailed steps in case we might have missed anything would be really appreciated.
End of the output is as follows:
2018-10-15T00:48:43.880 WARN AnchoreWorker anchore-engine get policy evaluation failed. HTTP method: GET, URL: http://10.180.25.2:8228/v1/images/sha256:7d6fb7e5e7a74a4309cc436f6d11c29a96cbf27a4a8cb45a50cb0a326dc32fe8/check?tag=10.180.25.2:5000/hello-world:latest&detail=true, status: 404, error: {
"detail": {},
"httpcode": 404,
"message": "image is not analyzed - analysis_status: not_analyzed"
}
2018-10-15T00:48:43.880 WARN AnchoreWorker Exhausted all attempts polling anchore-engine. Analysis is incomplete for sha256:7d6fb7e5e7a74a4309cc436f6d11c29a96cbf27a4a8cb45a50cb0a326dc32fe8
2018-10-15T00:48:43.880 ERROR AnchorePlugin Failing Anchore Container Image Scanner Plugin step due to errors in plugin execution
hudson.AbortException: Timed out waiting for anchore-engine analysis to complete (increasing engineRetries might help). Check above logs for errors from anchore-engine
at com.anchore.jenkins.plugins.anchore.BuildWorker.runGatesEngine(BuildWorker.java:480)
at com.anchore.jenkins.plugins.anchore.BuildWorker.runGates(BuildWorker.java:343)
at com.anchore.jenkins.plugins.anchore.AnchoreBuilder.perform(AnchoreBuilder.java:338)
at hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:81)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
at hudson.model.Build$BuildExecution.build(Build.java:206)
at hudson.model.Build$BuildExecution.doRun(Build.java:163)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
at hudson.model.Run.execute(Run.java:1724)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:421)
I also checked status and found below:
docker run anchore/engine-cli:latest anchore-cli --u admin --p admin123 --url http://172.18.0.1:8228/v1 system status
Service analyzer (dockerhostid-anchore-engine, http://anchore-engine:8084): up
Service catalog (dockerhostid-anchore-engine, http://anchore-engine:8082): up
Service policy_engine (dockerhostid-anchore-engine, http://anchore-engine:8087): down (unavailable)
Service simplequeue (dockerhostid-anchore-engine, http://anchore-engine:8083): up
Service apiext (dockerhostid-anchore-engine, http://anchore-engine:8228): up
Service kubernetes_webhook (dockerhostid-anchore-engine, http://anchore-engine:8338): up
Engine DB Version: 0.0.7
Engine Code Version: 0.2.4
It seems service policy engine is down
Service policy_engine (dockerhostid-anchore-engine, http://anchore-engine:8087): down (unavailable)
I also checked the docker logs . I found below error:
[service:policy_engine] 2018-10-15 09:37:46+0000 [-] [bootstrap] [DEBUG] service (policy_engine) starting in: 4
[service:policy_engine] 2018-10-15 09:37:46+0000 [-] [bootstrap] [INFO] Registration complete.
[service:policy_engine] 2018-10-15 09:37:46+0000 [-] [bootstrap] [INFO] Checking feeds client credentials
[service:policy_engine] 2018-10-15 09:37:46+0000 [-] [bootstrap] [DEBUG] Initializing a feeds client
[service:policy_engine] 2018-10-15 09:37:47+0000 [-] [bootstrap] [DEBUG] init values: [None, None, None, (), None, None]
[service:policy_engine] 2018-10-15 09:37:47+0000 [-] [bootstrap] [DEBUG] using values: ['https://ancho.re/v1/service/feeds', 'https://ancho.re/oauth/token', 'https://ancho.re/v1/account/users', 'anon#ancho.re', 3, 60]
[service:policy_engine] 2018-10-15 09:37:47+0000 [-] [urllib3.connectionpool] [DEBUG] Starting new HTTPS connection (1): ancho.re
[service:policy_engine] 2018-10-15 09:37:50+0000 [-] [bootstrap] [ERROR] Preflight checks failed with error: HTTPSConnectionPool(host='ancho.re', port=443): Max retries exceeded with url: /v1/account/users/anon#ancho.re (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7ffa905f0b90>: Failed to establish a new connection: [Errno 113] No route to host',)). Aborting service startup
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/anchore_manager/cli/service.py", line 158, in startup_service
raise Exception("process exited: " + str(rc))
Exception: process exited: 1
[anchore-policy-engine] [anchore_manager.cli.service/startup_service()] [INFO] service process exited at (Mon Oct 15 09:37:50 2018): process exited: 1
[anchore-policy-engine] [anchore_manager.cli.service/startup_service()] [INFO] exiting service thread
Thanks and Regards,
Rohan Shetty
When images are added to anchore-engine, they are queued for analysis which moves them through a simple state machine that starts with ‘not_analyzed’, goes to ‘analyzing’ and finally ends in either ‘analyzed’ or ‘analysis_failed’. Only when an image has reached ‘analyzed’ will a policy evaluation be possible.
The anchore Jenkins plugin will add an image, then poll the engine for image status/evaluation for the configured number of tries (default 300). Once the image goes to ‘analyzed’ (where policy evaluation is possible), the plugin will then receive a policy evaluation result from the engine.
The plugin will fail the build (by default) if the max retries has been performed and the image has not reached ‘analyzed’, if the image does reach ‘analyzed’ but the policy evaluation is producing a ‘fail’ result (meaning the image didn’t pass your configured policy checks). Note that all build failure behavior can be controlled in the plugin (I.e. there are options to allow the plugin to succeed even if the analysis or image eval fails).
You’ll need to look at the end of the output from your build run (instead of just the beginning from your post), and combined with the information above, it should be clear which scenario is causing the plugin to fail the build.
We have resolved the issue.
Root Cause:
We were not able to establish a successful https connection to URL : https://ancho.re from within the anchore-engine docker container.
As a result the service:policy_engine was not able to start.
https://ancho.re is required to download policy feeds and sync-up periodically. Without these policy anchore-engine won't be able to analyse the docker images.
Solution:
1) We passed a HTTPS_PROXY URL as an environment variable in the docker-compose.yaml of anchore-engine.
We used this proxy URL to bypass restrictions in our environment and establish a connection with https://ancho.re url.
2) Restarted the docker containers.
Finally we got all services up and running including Anchore policy-engine.
FYI:
It takes a while to download all the required Feeds depending on your internet speed.
Lastly, Thanks to the Anchore community for quick responses and support over slack.
Hope this helps.
Warm Regards,
Rohan Shetty
I am getting the following error when deploying nodejs application to openshift origin from my git repository...
Pushing image 172.30.210.239:5000/nodev6/node3v6:latest ...
Pushed 0/9 layers, 1% complete
Pushed 1/9 layers, 12% complete
Pushed 2/9 layers, 24% complete
Pushed 3/9 layers, 34% complete
Pushed 4/9 layers, 47% complete
Pushed 5/9 layers, 58% complete
Pushed 6/9 layers, 71% complete
Pushed 7/9 layers, 88% complete
Registry server Address:
Registry server User Name: serviceaccount
Registry server Email: serviceaccount#example.org
Registry server Password: <>
error: build error: Failed to push image: received unexpected HTTP status: 500 Internal Server Error
Don't know whats wrong with my docker container or openshift ...
what steps should i follow for the resolution... what installations needed or what commands needs to run...
I am attempting to run a job on a Spark cluster setup in Mesos. I can run a job if I copy the jar to the server and then use a file: URL, but I cannot get spark to download a jar using https:. Every time I do get the error below in the stderr file.
I0226 00:11:05.618361 22652 logging.cpp:172] INFO level logging started!
I0226 00:11:05.618552 22652 fetcher.cpp:409] Fetcher Info: ...
I0226 00:11:05.619721 22652 fetcher.cpp:364] Fetching URI 'https://jenkins.company.com/nexus/...
I0226 00:11:05.619738 22652 fetcher.cpp:238] Fetching directly into the sandbox directory
I0226 00:11:05.619751 22652 fetcher.cpp:176] Fetching URI 'https://jenkins.company.com/nexus/...
I0226 00:11:05.619762 22652 fetcher.cpp:126] Downloading resource from 'https://jenkins.company.com/nexus/...
Failed to fetch 'https://jenkins.company.com/nexus/... ': Error downloading resource: SSL connect error
Failed to synchronize with slave (it's probably exited)
I am able to use wget to download the jar from the specified URL. I have also verified that the JDK on the server has the correct certificate for the nexus server where I am attempting to download the jar from.
I am new to Spark and Mesos and any help resolving this issue would be greatly appreciated.
Did you specify you private Nexus repository with the --repository flag on application start?
I personally never use encryption together with Spark, but from the docs it seems to be possible/necessary to configure it within Spark. I guess just for the SDKs is not enough.
See
http://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management
http://spark.apache.org/docs/latest/configuration.html#encryption