Argo CD timeout when creating an application - gitlab

Hello I setup a new ArgoCD instance, which I successfully connected to my private gitlab repository:
But it seems that it's timeouting when I try to create an application:
time="2021-12-26T10:28:21Z" level=error msg="finished unary call with code Internal" error="rpc error: code = Internal desc = Failed to fetch 77f31050a59a475378ae0dc712d5f84421760aac: `git fetch origin --tags --force` failed timeout after 1m30s" grpc.code=Internal grpc.method=GetAppDetails grpc.request.deadline="2021-12-26T10:01:29Z" grpc.service=repository.RepoServerService grpc.start_time="2021-12-26T10:00:29Z" grpc.time_ms=1.6724298e+06 span.kind=server system=grpc
Any idea on this issue? I know there's a bug with private gitlab repositories on a specific port, but I tried with both URLs (with and without the "ssh://") without success

Related

How to reconcile data automatically between gitaly nodes

I have deployed Gitlab with a separate praefect server and 3 gitaly nodes.
This setup works fine , I am facing issues when i replace an gitaly node with a new server. Data is not getting replicated from other gitaly nodes.
I tried using below command but i got error , as I am using tls connection between praefect and gitaly nodes.
sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml reconcile -virtual default -reference gitaly-2 -target gitaly-1 -f
Getting below error
unable to reconcile: failed to dial "localhost:2305" connection: context deadline exceeded
Looking for some pointers on how can I do automatic reconcilation of data between gitaly nodes

AWS elastic beanstalk deploy always fails (uploading a zipfile)

I upload a new version of my app as a zipfile and click deploy. Then the status changes to severe.
This is the error trace:
WARN
Environment health has transitioned from Info to Degraded. Command failed on all instances. Incorrect application version found on all instances. Expected version "Sample" (deployment 2). Application update failed 10 seconds ago and took 4 minutes.
ERROR
During an aborted deployment, some instances may have deployed the new application version. To ensure all instances are running the same version, re-deploy the appropriate application version.
ERROR
Failed to deploy application.
ERROR
Unsuccessful command execution on instance id(s) 'i------'. Aborting the operation.
ERROR
[Instance: i-002326d7ceeba0ea9] Command failed on instance. Return code:
1 Output: nginx: [emerg] no host in upstream ":80" in /etc/nginx/conf.d/elasticbeanstalk-nginx-docker-upstream.conf:
2 nginx: configuration file /etc/nginx/nginx.conf test failed Failed to start nginx, abort deployment.
Hook /opt/elasticbeanstalk/hooks/appdeploy/enact/01flip.sh failed.
For more detail, check /var/log/eb-activity.log using console or EB CLI.
ERROR
Failed to start nginx, abort deployment
/var/log/eb-activity.log
here are errors in this log:
[0mInstalling dependencies from Pipfile.lock (5e00f3)…
Failed to load paths: /bin/sh: 1: /root/.local/share/virtualenvs/app-lp47FrbD/bin/python: not found
...
[2020-05-29T01:51:24.746Z] INFO [11395] - [Application update v1.3.3-1#3/AppDeployStage1/AppDeployEnactHook/00run.sh] : Completed activity. Result:
jq: error (at <stdin>:1): Cannot iterate over null (null)
a2f568b1c255eb9e0fdc6ceebdd29b9ec64b9ab4481a3e1c5bcb11828b0ac526
[2020-05-29T01:51:24.747Z] INFO [11395] - [Application update v1.3.3-1#3/AppDeployStage1/AppDeployEnactHook/01flip.sh] : Starting activity...
[2020-05-29T01:51:26.099Z] INFO [11395] - [Application update v1.3.3-1#3/AppDeployStage1/AppDeployEnactHook/01flip.sh] : Activity execution failed, because: nginx: [emerg] no host in upstream ":80" in /etc/nginx/conf.d/elasticbeanstalk-nginx-docker-upstream.conf:2
nginx: configuration file /etc/nginx/nginx.conf test failed
Failed to start nginx, abort deployment (ElasticBeanstalk::ExternalInvocationError)
caused by: nginx: [emerg] no host in upstream ":80" in /etc/nginx/conf.d/elasticbeanstalk-nginx-docker-upstream.conf:2
nginx: configuration file /etc/nginx/nginx.conf test failed
Failed to start nginx, abort deployment (Executor::NonZeroExitStatus)
...
[2020-05-29T01:51:26.099Z] INFO [11395] - [Application update v1.3.3-1#3/AppDeployStage1/AppDeployEnactHook/01flip.sh] : Activity failed.
[2020-05-29T01:51:26.099Z] INFO [11395] - [Application update v1.3.3-1#3/AppDeployStage1/AppDeployEnactHook] : Activity failed.
[2020-05-29T01:51:26.099Z] INFO [11395] - [Application update v1.3.3-1#3/AppDeployStage1] : Activity failed.
[2020-05-29T01:51:26.100Z] INFO [11395] - [Application update v1.3.3-1#3] : Completed activity. Result:
Application update - Command CMD-AppDeploy failed
The inability to deploy has been consistent for this environment, after several attempts, even reverting to an older version.
Afterwards, I resolved this by isolating the code and error messages using a local docker image with the zipfile. Running the code on my machine outside of docker did NOT reveal any problems, because the pip / pipenv part was missing some depdendency.
Steps for local docker testing:
WITHIN a docker container:
docker system prune
Go to the folder with Dockerfile
docker image build -t <app_name>:<version_number> .
TO run locally:
(docker rm <app_name> first, if you've already got a stopped container with the same name from prior testing)
docker container run --publish 80:80 --name <app_name> myapp:1.0
NOTE:
this won't let you test AWS functions that require environment variables, such as ~.aws credentials because they're not inside the image.
(but you could add them with your Dockerfile)
Once the docker container is running, you'll see (I saw) error messages that were not there when testing locally, because they were caused by a missing package dependency and a pipenv error.

Error reading from 192.168.1.164:44214: rpc error: code = Canceled desc = context canceled

i got this error when i am trying to connect peers running in different machines .I found this error in docker logs of orderer.There is an error in docker logs of peer2 running in different machine
Failed obtaining connection: Could not connect to any of the endpoints: [orderer.example.com:7050]
You can find the orderer.yaml file at fabric-samples/config folder.
Going through the fields and their respective comments in orderer.yaml and core.yaml can help you to understand the method of configuring the network(orderer/peer).
And here you can get the info related to TLS.

Anchore Engine - Jenkins CI plugin

We are trying to scan our docker images using Anchore Engine Jenkins plugin.
Currently we create our application docker images, push it in our own private local registry and then deploy it in our test environments.
Now, we want to setup docker image scanning in our CI/CD process to check for any vulnerabilities.
We have installed Anchore Engine using the recommended Docker-Compose yaml method given in the Documentation link:
https://anchore.freshdesk.com/support/solutions/articles/36000020729-install-on-docker-swarm
Post installation, we installed the
Anchore Container Image Scanner Plugin in Jenkins.
We configured the plugin as mentioned in the document link:
https://wiki.jenkins.io/display/JENKINS/Anchore+Container+Image+Scanner+Plugin
However, the scanning fails. Error Message as follows:
2018-10-11T07:01:44.647 INFO AnchoreWorker Analysis request accepted, received image digest sha256:7d6fb7e5e7a74a4309cc436f6d11c29a96cbf27a4a8cb45a50cb0a326dc32fe8
2018-10-11T07:01:44.647 INFO AnchoreWorker Waiting for analysis of 10.180.25.2:5000/hello-world:latest, polling status periodically
2018-10-11T07:01:44.647 DEBUG AnchoreWorker anchore-engine get policy evaluation URL: http://10.180.25.2:8228/v1/images/sha256:7d6fb7e5e7a74a4309cc436f6d11c29a96cbf27a4a8cb45a50cb0a326dc32fe8/check?tag=10.180.25.2:5000/hello-world:latest&detail=true
2018-10-11T07:01:44.648 DEBUG AnchoreWorker Attempting anchore-engine get policy evaluation (1/300)
2018-10-11T07:01:44.675 DEBUG AnchoreWorker anchore-engine get policy evaluation failed. URL: http://10.180.25.2:8228/v1/images/sha256:7d6fb7e5e7a74a4309cc436f6d11c29a96cbf27a4a8cb45a50cb0a326dc32fe8/check?tag=10.180.25.2:5000/hello-world:latest&detail=true, status: HTTP/1.1 404 NOT FOUND, error: {
"detail": {},
"httpcode": 404,
"message": "image is not analyzed - analysis_status: not_analyzed"
}
NOTE:
In Image TAG 10.180.25.2:5000/hello-world:latest, 10.180.25.2:5000 is our local private registry and hello-world:latest is latest hello-world image available in docker hub which we pulled and pushed in our registry to try out image scanning using Anchore-Engine.
Unfortunately we are not able to find much resource online to try and resolve the above mentioned issue.
Anyone who might have worked on Anchore-Engine, please may I request to have a look and help us resolve this issue.
Also, any suggestions or alternatives to anchore-engine or detailed steps in case we might have missed anything would be really appreciated.
End of the output is as follows:
2018-10-15T00:48:43.880 WARN AnchoreWorker anchore-engine get policy evaluation failed. HTTP method: GET, URL: http://10.180.25.2:8228/v1/images/sha256:7d6fb7e5e7a74a4309cc436f6d11c29a96cbf27a4a8cb45a50cb0a326dc32fe8/check?tag=10.180.25.2:5000/hello-world:latest&detail=true, status: 404, error: {
"detail": {},
"httpcode": 404,
"message": "image is not analyzed - analysis_status: not_analyzed"
}
2018-10-15T00:48:43.880 WARN AnchoreWorker Exhausted all attempts polling anchore-engine. Analysis is incomplete for sha256:7d6fb7e5e7a74a4309cc436f6d11c29a96cbf27a4a8cb45a50cb0a326dc32fe8
2018-10-15T00:48:43.880 ERROR AnchorePlugin Failing Anchore Container Image Scanner Plugin step due to errors in plugin execution
hudson.AbortException: Timed out waiting for anchore-engine analysis to complete (increasing engineRetries might help). Check above logs for errors from anchore-engine
at com.anchore.jenkins.plugins.anchore.BuildWorker.runGatesEngine(BuildWorker.java:480)
at com.anchore.jenkins.plugins.anchore.BuildWorker.runGates(BuildWorker.java:343)
at com.anchore.jenkins.plugins.anchore.AnchoreBuilder.perform(AnchoreBuilder.java:338)
at hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:81)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
at hudson.model.Build$BuildExecution.build(Build.java:206)
at hudson.model.Build$BuildExecution.doRun(Build.java:163)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
at hudson.model.Run.execute(Run.java:1724)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:421)
I also checked status and found below:
docker run anchore/engine-cli:latest anchore-cli --u admin --p admin123 --url http://172.18.0.1:8228/v1 system status
Service analyzer (dockerhostid-anchore-engine, http://anchore-engine:8084): up
Service catalog (dockerhostid-anchore-engine, http://anchore-engine:8082): up
Service policy_engine (dockerhostid-anchore-engine, http://anchore-engine:8087): down (unavailable)
Service simplequeue (dockerhostid-anchore-engine, http://anchore-engine:8083): up
Service apiext (dockerhostid-anchore-engine, http://anchore-engine:8228): up
Service kubernetes_webhook (dockerhostid-anchore-engine, http://anchore-engine:8338): up
Engine DB Version: 0.0.7
Engine Code Version: 0.2.4
It seems service policy engine is down
Service policy_engine (dockerhostid-anchore-engine, http://anchore-engine:8087): down (unavailable)
I also checked the docker logs . I found below error:
[service:policy_engine] 2018-10-15 09:37:46+0000 [-] [bootstrap] [DEBUG] service (policy_engine) starting in: 4
[service:policy_engine] 2018-10-15 09:37:46+0000 [-] [bootstrap] [INFO] Registration complete.
[service:policy_engine] 2018-10-15 09:37:46+0000 [-] [bootstrap] [INFO] Checking feeds client credentials
[service:policy_engine] 2018-10-15 09:37:46+0000 [-] [bootstrap] [DEBUG] Initializing a feeds client
[service:policy_engine] 2018-10-15 09:37:47+0000 [-] [bootstrap] [DEBUG] init values: [None, None, None, (), None, None]
[service:policy_engine] 2018-10-15 09:37:47+0000 [-] [bootstrap] [DEBUG] using values: ['https://ancho.re/v1/service/feeds', 'https://ancho.re/oauth/token', 'https://ancho.re/v1/account/users', 'anon#ancho.re', 3, 60]
[service:policy_engine] 2018-10-15 09:37:47+0000 [-] [urllib3.connectionpool] [DEBUG] Starting new HTTPS connection (1): ancho.re
[service:policy_engine] 2018-10-15 09:37:50+0000 [-] [bootstrap] [ERROR] Preflight checks failed with error: HTTPSConnectionPool(host='ancho.re', port=443): Max retries exceeded with url: /v1/account/users/anon#ancho.re (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7ffa905f0b90>: Failed to establish a new connection: [Errno 113] No route to host',)). Aborting service startup
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/anchore_manager/cli/service.py", line 158, in startup_service
raise Exception("process exited: " + str(rc))
Exception: process exited: 1
[anchore-policy-engine] [anchore_manager.cli.service/startup_service()] [INFO] service process exited at (Mon Oct 15 09:37:50 2018): process exited: 1
[anchore-policy-engine] [anchore_manager.cli.service/startup_service()] [INFO] exiting service thread
Thanks and Regards,
Rohan Shetty
When images are added to anchore-engine, they are queued for analysis which moves them through a simple state machine that starts with ‘not_analyzed’, goes to ‘analyzing’ and finally ends in either ‘analyzed’ or ‘analysis_failed’. Only when an image has reached ‘analyzed’ will a policy evaluation be possible.
The anchore Jenkins plugin will add an image, then poll the engine for image status/evaluation for the configured number of tries (default 300). Once the image goes to ‘analyzed’ (where policy evaluation is possible), the plugin will then receive a policy evaluation result from the engine.
The plugin will fail the build (by default) if the max retries has been performed and the image has not reached ‘analyzed’, if the image does reach ‘analyzed’ but the policy evaluation is producing a ‘fail’ result (meaning the image didn’t pass your configured policy checks). Note that all build failure behavior can be controlled in the plugin (I.e. there are options to allow the plugin to succeed even if the analysis or image eval fails).
You’ll need to look at the end of the output from your build run (instead of just the beginning from your post), and combined with the information above, it should be clear which scenario is causing the plugin to fail the build.
We have resolved the issue.
Root Cause:
We were not able to establish a successful https connection to URL : https://ancho.re from within the anchore-engine docker container.
As a result the service:policy_engine was not able to start.
https://ancho.re is required to download policy feeds and sync-up periodically. Without these policy anchore-engine won't be able to analyse the docker images.
Solution:
1) We passed a HTTPS_PROXY URL as an environment variable in the docker-compose.yaml of anchore-engine.
We used this proxy URL to bypass restrictions in our environment and establish a connection with https://ancho.re url.
2) Restarted the docker containers.
Finally we got all services up and running including Anchore policy-engine.
FYI:
It takes a while to download all the required Feeds depending on your internet speed.
Lastly, Thanks to the Anchore community for quick responses and support over slack.
Hope this helps.
Warm Regards,
Rohan Shetty

Apache Spark compile failed while installing Netty

We untar spark-0.9.0-incubating.tgz and trying to build it for use with Yarn.
SPARK_HADOOP_VERSION=2.0.0-cdh4.6.0 SPARK_YARN=true sbt/sbt assembly
...
[info] Resolving io.netty#netty-all;4.0.13.Final ...
[error] Server access Error: Connection timed out url=https://oss.sonatype.org/content/repositories/snapshots/io/netty/netty-all/4.0.13.Final/netty-all-4.0.13.Final.pom
[error] Server access Error: Connection timed out url=https://oss.sonatype.org/service/local/staging/deploy/maven2/io/netty/netty-all/4.0.13.Final/netty-all-4.0.13.Final.pom
...
If I just cut-paste the url into a browser, I get:
404 - ItemNotFoundException
Retrieval of /io/netty/netty-all/4.0.13.Final/netty-all-4.0.13.Final.pom from M2Repository(id=snapshots) is forbidden by repository policy SNAPSHOT.
org.sonatype.nexus.proxy.ItemNotFoundException: Retrieval of /io/netty/netty-all/4.0.13.Final/netty-all-4.0.13.Final.pom from M2Repository(id=snapshots) is forbidden by repository policy SNAPSHOT.
at org.sonatype.nexus.proxy.maven.AbstractMavenRepository.doRetrieveItem(AbstractMavenRepository.java:380)
at org.sonatype.nexus.proxy.maven.maven2.M2Repository.doRetrieveItem(M2Repository.java:396)
at org.sonatype.nexus.proxy.repository.AbstractRepository.retrieveItem(AbstractRepository.java:765)
at org.sonatype.nexus.proxy.repository.AbstractRepository.retrieveItem(AbstractRepository.java:608)
at org.sonatype.nexus.proxy.router.DefaultRepositoryRouter.retrieveItem(DefaultRepositoryRouter.java:155)
at org.sonatype.nexus.web.content.NexusContentServlet.doGet(NexusContentServlet.java:359)
at org.sonatype.nexus.web.content.NexusContentServlet.service(NexusContentServlet.java:331)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
I have seen this reported in a number of places but no solution. Is this error because we are behind a corporate firewall, or is this due to something else? Please advise.
I had proxy set as environment variables, but it appears they are not being picked up. Adding them in sbt directly worked for me.
Edit $SPARK_HOME/sbt/sbt
For example,
EXTRA_ARGS="-Dhttp.proxySet=true -Dhttp.proxyHost=myproxy.mycompany.com -Dhttp.proxyPort=80 -Dhttps.proxySet=true -Dhttps.proxyHost=myproxy.mycompany.com -Dhttps.proxyPort=80 -Dftp.proxySet=true -Dftp.proxyHost=myproxy.mycompany.com -Dftp.proxyPort=80 -Dhttp.nonProxyHosts=mydomain -Dhttps.nonProxyHosts=mydomain -Dftp.nonProxyHosts=mydomain"

Resources