I just went through some very strange debugging after running:
sudo systemctl stop nginx
sudo /opt/bitnami/letsencrypt/lego --tls --email="..." --domains="..." --path="/opt/bitnami/letsencrypt" renew --days 90
sudo systemctl start nginx
I was getting a 502 error, and many errors of the form:
[error] 25208#25208: *1 connect() failed (111: Connection refused) while connecting to upstream, client
I had multiple domains running on this server, but I only renewed the SSL cert for one of them. The other domains were still up, but the one I renewed began erroring out with 502. My endless Google searches kept pointing either to an IPv6 issue (fixed by changing localhost to 127.0.0.1 in the nginx config) or to a port mismatch between nginx and node. It turned out that forever had somehow just stopped without leaving any indication; for example, I couldn't find anything from today in ~/.forever. I just want to know if I'm missing anything obvious. This was not the first time I updated SSL certs, and I did exactly the same thing last time without this happening.
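Since the "111: Connection refused" error means nothing was accepting connections on the upstream port, a quick check of the upstream itself can save the trip through nginx and IPv6 theories. Below is a minimal sketch of such a check; the function name is made up for illustration, and the address and port you pass in are whatever your nginx proxy_pass points at (not given in the post).

```python
import socket


def upstream_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection performs a full TCP handshake, which is what
        # nginx attempts when it proxies a request to the upstream.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" (errno 111 on Linux) as well as timeouts.
        return False
```

Calling upstream_is_listening("127.0.0.1", 3000) (3000 being an assumed node port) right after the renewal would have returned False here, pointing at the stopped node process rather than at nginx.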
I recently installed Nextcloud on a LAMP stack and want to run Traefik in front of it. For that, I changed the apache2 ports.conf to:
Listen 127.0.0.1:180
I also configured a .toml for Traefik that points to this address.
When I try to open the website, it gives me "Bad Gateway".
Trying to solve the error, I searched the Traefik logs and found this:
msg="'502 Bad Gateway' caused by: dial tcp 127.0.0.1:180: connect: connection refused"
Thinking it must be a problem with trusted_proxies, I configured Apache to open its port to the public and changed the Traefik .toml to match, to see whether it would work.
It did. That means Nextcloud definitely accepts my proxy and the proxying itself works.
The problem is that it doesn't work when I configure it on localhost.
The access.log and nextcloud.log show nothing.
Any help?
Many thanks
The solution is simple, but hidden.
Traefik runs in a Docker container, so normally it can't communicate with services outside the Docker network; inside the container, 127.0.0.1 refers to the container itself, not the host.
The fix is:
ip addr show docker0
Bind Apache2 to the IPv4 address shown there (in my case, Listen 172.17.0.1:180) and update the Traefik config to match.
Apache2 will then listen on the docker0 network, which containers have access to.
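If you want to script this step, here is a rough sketch that pulls the IPv4 address out of the ip addr show docker0 output and builds the matching Listen line. The function names are made up for illustration, port 180 is simply the one used in the question, and it assumes the usual "inet a.b.c.d/nn" line format of iproute2.

```python
import re
import subprocess
from typing import Optional


def first_ipv4(ip_addr_output: str) -> Optional[str]:
    """Extract the first IPv4 address from `ip addr show <iface>` output."""
    # "inet " followed by a digit matches IPv4 lines only; IPv6 lines
    # start with "inet6" and are skipped.
    match = re.search(r"^\s*inet (\d+\.\d+\.\d+\.\d+)/\d+", ip_addr_output, re.MULTILINE)
    return match.group(1) if match else None


def docker0_listen_directive(port: int = 180) -> Optional[str]:
    """Build an Apache Listen line for the docker0 bridge address."""
    result = subprocess.run(
        ["ip", "addr", "show", "docker0"],
        capture_output=True, text=True, check=True,
    )
    addr = first_ipv4(result.stdout)
    return f"Listen {addr}:{port}" if addr else None
```

The parsing is kept separate from the subprocess call so it can be tried on pasted output from another machine.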
We have set up an OKD 3.11 cluster with 100+ nodes. Everything was working fine, but then a worker node stopped resolving the internal URL of the registry service. As a result, new pods scheduled to that node fail with an ImagePullBackOff error:
Failed to pull image "docker-registry.default.svc:5000/app-name/app-name:latest": rpc error: code = Unknown desc = Get https://docker-registry.default.svc:5000/v1/_ping: dial tcp: lookup docker-registry.default.svc on 10.*.*.71:53: server misbehaving
We ran nslookup on the worker node, with the following results.
This doesn't work (although it works on other nodes):
[root#worker22 ~]# nslookup docker-registry.default.svc.cluster.local
Server: 10.*.*.71
Address: 10.*.*.71#53
** server can't find docker-registry.default.svc.cluster.local: SERVFAIL
This works just fine:
[root#worker22 ~]# nslookup docker-registry.default.svc.cluster.local 127.0.0.1
Server: 127.0.0.1
Address: 127.0.0.1#53
Name: docker-registry.default.svc.cluster.local
Address: 172.*.*.212
Adding server=/cluster.local/172.30.0.1 to the dnsmasq conf file /etc/dnsmasq.d/origin-upstream-dns.conf works as a workaround, but I can't find what is causing this.
I tried adding -q to the dnsmasq service's ExecStart, and the query log shows that dnsmasq doesn't query the OpenShift DNS running locally at 127.0.0.1:53.
The dnsmasq config and resolv.conf are in order on the node.
I have tried restarting dnsmasq, NetworkManager, and Docker, and respawning the ovs/sdn pods, but nothing helped.
I found some documented evidence that dnsmasq can behave like this.
Some Red Hat articles suggest that a long-running dnsmasq service may misbehave and stop resolving names. Similar cases have been reported for OpenShift environments as well.
The links below suggest that restarting the service solves the problem for a while, after which the issue may resurface. As stated earlier, in my case a service restart didn't help, but the oldest remedy in IT worked: rebooting the node solved the problem.
Reference:
https://access.redhat.com/solutions/3393141
https://bugzilla.redhat.com/show_bug.cgi?id=1560489
We do frequent deployments using udeploy, and the last step there automatically restarts the Apache HTTP server using sudo ./apachectl -k restart.
But sometimes the server fails to restart with the error below:
(98)Address already in use: make_sock: could not bind to address [::]:80
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
Unable to open logs
Please note that this happens only sometimes, not every time. I have verified that there are no duplicate Listen directives for port 80 in the httpd conf files and no password prompt issues with the SSL key files. I don't have root access to the server, so I can't verify whether any other process binds port 80 before the main Apache server starts. Is there anything else that could be causing the issue?
Any help or suggestions here would be greatly appreciated.
Cheers,
Ashley
I'm not sure of the timing, but perhaps add a second start attempt when the first one fails; that might allow enough time for any resources still in use to be freed.
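A rough sketch of what that retry step could look like follows. The helper names, the number of attempts, and the delay are all made up for illustration, and it assumes apachectl's exit status actually reflects the bind failure, which is worth confirming on your setup before relying on it.

```python
import subprocess
import time
from typing import Callable


def retry(action: Callable[[], bool], attempts: int = 2,
          delay_seconds: float = 5.0, sleep: Callable[[float], None] = time.sleep) -> bool:
    """Run action until it returns True, waiting between failed attempts."""
    for attempt in range(1, attempts + 1):
        if action():
            return True
        if attempt < attempts:
            # Give the old process time to release port 80 before retrying.
            sleep(delay_seconds)
    return False


def restart_apache() -> bool:
    """Run the restart command from the question; True on exit status 0."""
    # Assumes the current directory contains apachectl and sudo needs no prompt.
    result = subprocess.run(["sudo", "./apachectl", "-k", "restart"])
    return result.returncode == 0
```

The deployment step would then call retry(restart_apache) and fail the deployment only if that returns False.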
I downloaded the Tardis branch of RedQueryBuilder and did an mvn clean install.
It runs through things for a bit, then it gets to this part:
[INFO] Running com.redspr.redquerybuilder.core.client.GwtTestDom
[INFO] logging for HtmlUnit thread
[INFO] [ERROR] I/O error on HTTP request
[INFO] org.apache.http.conn.HttpHostConnectException: Connection to http://50.19.99.237:53655 refused
Just wondering if there's a quick answer, like "oh, your GWT is out of date", or some other easy-to-solve issue.
So here was the problem; it had nothing to do with GAE.
The problem was that the name of my host, in /etc/hostname, had no corresponding host entry in /etc/hosts. It was complicated by the fact that I had "search mydomain.com" in my resolv.conf, which was further complicated by the fact that mydomain.com is wildcarded, so any unknown hostnames resolve to a particular IP address.
So what happened was this: the test suite would look for myhost. Since it didn't find myhost in DNS or in /etc/hosts, it looked up myhost.mydomain.com because of the search line in my resolv.conf, and that returned a valid IP address because of the wildcard. The test suite then got a connection refused, because it was connecting to a totally different host. The solution was to add 127.0.0.1 myhost myhost.mydomain.com to my /etc/hosts, after which it built and ran fine.
Long story short, if the host defined in /etc/hostname does not have a valid DNS or /etc/hosts entry, the build will fail, as it will get either a host unknown or, in my case, a connection refused because of my DNS jiggery-pokery.
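For anyone who wants to check for this condition before running the build, here is a small sketch. The function name is made up; the idea is to resolve the machine's own hostname and then try to bind a socket to the result, which on a typical Linux setup only succeeds when the address belongs to a local interface.

```python
import socket
from typing import Callable, Optional


def resolves_to_local_address(
    hostname: Optional[str] = None,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> Optional[bool]:
    """True if hostname resolves to an address on this machine,
    False if it resolves elsewhere, None if it does not resolve at all."""
    name = hostname or socket.gethostname()
    try:
        ip = resolve(name)
    except OSError:
        # socket.gaierror (host unknown) is a subclass of OSError.
        return None
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 lets the OS pick any free port; only the address matters.
        probe.bind((ip, 0))
        return True
    except OSError:
        # Typically "cannot assign requested address": the IP is not ours.
        return False
    finally:
        probe.close()
```

In my situation this would have returned False, because the wildcard DNS answer pointed at some other machine.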
My localhost IIS sites recently stopped working and I can't figure out why. If I try to browse to http://localhost, after a while I get the error "Oops! Google Chrome could not connect to localhost". If I open Fiddler and try again, I get a 502 error that states: System.Net.Sockets.SocketException A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 127.0.0.1:80
I've tried using netstat -a -b to see if any other applications are blocking port 80, but there doesn't appear to be anything obvious.
I've disabled proxy servers, and that doesn't have any effect.
As a last resort, I even tried reinstalling IIS.
Everything has been working fine and I can't think of any configuration changes that would've stopped localhost from working. Any ideas?