Cannot reach services exposed by Docker containers on Ubuntu 18.04

I've been struggling with a strange problem on Ubuntu 18.04: I cannot reach services exposed by containers. Here is an example with nginx.
Starting the container:
sudo docker run -it --rm -d -p 8080:80 --name web nginx
docker ps shows:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f09c71db299a nginx "/docker-entrypoint.…" 10 minutes ago Up 10 minutes 0.0.0.0:8080->80/tcp web
The port is published on 0.0.0.0:8080, as expected.
But curl fails with "Connection reset by peer":
$ curl -4 -i -v localhost:8080
* Rebuilt URL to: localhost:8080/
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET / HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.58.0
> Accept: */*
>
* Recv failure: Connection reset by peer
* stopped the pause stream!
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer
I used tshark to inspect the network traffic:
$ sudo tshark -i any
66 7.442606878 127.0.0.1 → 127.0.0.1 TCP 68 8080 → 47430 [ACK] Seq=1 Ack=79 Win=65408 Len=0 TSval=4125875840 TSecr=4125875840
67 7.442679088 172.16.100.2 → 172.16.100.1 TCP 56 80 → 37906 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
68 7.442784223 127.0.0.1 → 127.0.0.1 TCP 68 8080 → 47430 [RST, ACK] Seq=1 Ack=79 Win=65536 Len=0 TSval=4125875840 TSecr=4125875840
I see an RST coming from the container(?). I have never had such an issue and I'm not sure how to solve it. Can someone help me out?
UPDATE: I used docker inspect f09c71db299a and it shows that:
"Gateway": "172.16.100.1"
"IPAddress": "172.16.100.2"
172.16.100.1 is my docker0 IP address. It looks like it rejects traffic from the container, right?
UPDATE 2: Following NightsWatch's suggestion, I checked whether the host accepts connections on port 8080. Telnet says:
~$ telnet localhost 8080
Trying ::1...
Connected to localhost.
Escape character is '^]'.
So it looks like the port is open but the request is blocked :/
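The curl trace above shows a connection that is accepted and then reset once the request has been sent. As a rough way to reproduce that check from a script, here is a minimal Python sketch using only the standard library. The function name probe_http and its result labels are made up for illustration, and it assumes the TCP connect itself succeeds, as it does in the question.

```python
import socket


def probe_http(host: str, port: int, timeout: float = 3.0) -> str:
    """Connect, send a minimal GET, and report what comes back.

    Returns "reset" if the peer resets the connection, "closed" if it
    closes without sending anything, or the first response line otherwise.
    Connect errors and timeouts are not handled here and will propagate.
    """
    request = f"GET / HTTP/1.1\r\nHost: {host}:{port}\r\nConnection: close\r\n\r\n"
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            sock.sendall(request.encode("ascii"))
            data = sock.recv(4096)
        except (ConnectionResetError, BrokenPipeError):
            return "reset"
    if not data:
        return "closed"
    # Keep only the status line, e.g. "HTTP/1.1 200 OK".
    return data.split(b"\r\n", 1)[0].decode("latin-1")
```

Calling probe_http("127.0.0.1", 8080) against the published port would be expected to return "reset" in the situation described, and a status line once the container answers normally.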

Related

Spark in Kubernetes Connection Refused

I am trying to deploy a Spark job in a Kubernetes cluster (running on AWS EKS). I deploy a pod that executes spark-submit in client mode. The pod becomes the driver pod and then begins to launch executor pods. The executor pods try to connect to the driver but fail, causing the executors to crash. Here is the error message from the executor log:
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: data-loom-stats/10.135.131.239:9902
Caused by: java.net.ConnectException: Connection refused
The driver pod is exposed through a headless Kubernetes service (as recommended by Spark: https://spark.apache.org/docs/latest/running-on-kubernetes.html#client-mode-networking). The service exposes the driver under the DNS name data-loom-stats. Based on the error message, DNS resolution appears to be working, since the name is correctly translated to the pod IP address 10.135.131.239. To see what is happening on the driver end, I opened a shell in the running driver container and ran netstat to list the listening ports:
[root@data-loom-stats-7496b69994-9t8zs work-dir]# netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:4040 0.0.0.0:* LISTEN 673/java
tcp 0 0 127.0.0.1:40077 0.0.0.0:* LISTEN 673/java
tcp 0 0 127.0.0.1:9902 0.0.0.0:* LISTEN 673/java
tcp 0 0 0.0.0.0:41267 0.0.0.0:* LISTEN 673/java
As you can see, port 9902 is bound to the loopback IP address. Port 4040 is the Spark UI, and it is bound to 0.0.0.0. Since the executor pods are not stable, I did some testing from another pod that is. I was able to curl port 4040:
/merida/src # curl -v http://10.135.131.239:4040
* Trying 10.135.131.239:4040...
* TCP_NODELAY set
* Connected to 10.135.131.239 (10.135.131.239) port 4040 (#0)
> GET / HTTP/1.1
> Host: 10.135.131.239:4040
> User-Agent: curl/7.67.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 302 Found
< Date: Fri, 29 May 2020 22:50:46 GMT
< Location: http://10.135.131.239:4040/jobs/
< Content-Length: 0
< Server: Jetty(9.3.z-SNAPSHOT)
<
* Connection #0 to host 10.135.131.239 left intact
But trying to connect to port 9902 gives the connection refused error, just like the executor log.
/merida/src # curl -v http://10.135.131.239:9902
* Trying 10.135.131.239:9902...
* TCP_NODELAY set
* connect to 10.135.131.239 port 9902 failed: Connection refused
* Failed to connect to 10.135.131.239 port 9902: Connection refused
* Closing connection 0
curl: (7) Failed to connect to 10.135.131.239 port 9902: Connection refused
So it appears that my address/port binding needs to be fixed. Is this conclusion correct? If so, is this something I can fix in the k8s manifest, or is it caused by something in the Spark configuration?
I can supply more information to help identify the root cause.
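The observation above (a listener bound to 127.0.0.1 is not reachable from other pods, while one bound to 0.0.0.0 is) can be checked mechanically. The following is a minimal Python sketch that scans netstat-style output for listeners bound only to a loopback address. The function name loopback_only_listeners is hypothetical, and the sample rows are copied from the netstat output in the question.

```python
import ipaddress


def loopback_only_listeners(netstat_output: str) -> list[int]:
    """Return the ports of TCP listeners that are bound to a loopback address."""
    ports = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state, [pid/program]
        if len(fields) < 6 or not fields[0].startswith("tcp") or fields[5] != "LISTEN":
            continue
        address, _, port = fields[3].rpartition(":")
        try:
            ip = ipaddress.ip_address(address.strip("[]"))
        except ValueError:
            continue  # skip rows whose local address cannot be parsed
        if ip.is_loopback:
            ports.append(int(port))
    return ports


SAMPLE = """\
tcp 0 0 0.0.0.0:4040 0.0.0.0:* LISTEN 673/java
tcp 0 0 127.0.0.1:40077 0.0.0.0:* LISTEN 673/java
tcp 0 0 127.0.0.1:9902 0.0.0.0:* LISTEN 673/java
tcp 0 0 0.0.0.0:41267 0.0.0.0:* LISTEN 673/java
"""

print(loopback_only_listeners(SAMPLE))  # [40077, 9902]
```

Any port this reports can only be reached from inside the same network namespace, which matches the refused connection to 9902 from the other pod.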

On Ubuntu 18.04, I cannot access Tomcat from a browser using the IP address

I've installed Tomcat 9 on an Ubuntu 18.04 VM. I cannot access Tomcat by IP address from a browser (or curl).
On the VM, Tomcat is running and curl http://1.2.3.4:8080 works.
But the same request from outside the VM does not:
l-OSX: hal$ curl https://10.51.253.163:8080 -v
* Rebuilt URL to: https://10.51.253.163:8080/
* Trying 10.51.253.163...
* connect to 10.51.253.163 port 8080 failed: Operation timed out
* Failed to connect to 10.51.253.163 port 8080: Operation timed out
Tomcat's server.xml
<Engine name="Catalina" defaultHost="10.51.253.163">
...
<Host name="10.51.253.163" appBase="webapps"
unpackWARs="true" autoDeploy="true">
UFW is inactive
sudo ufw status verbose
Status: inactive
Ping to the VM works
l-OSX: hal$ ping 10.51.253.163
PING 10.51.253.163 (10.51.253.163): 56 data bytes
64 bytes from 10.51.253.163: icmp_seq=0 ttl=58 time=111.914 ms
64 bytes from 10.51.253.163: icmp_seq=1 ttl=58 time=93.793 ms
Appreciate any help on this!
After some research and help from the IT Support team, I was able to resolve this as follows:
VM > Manage Security
Add Security Rule
Allow Port: 8080 on Protocol: TCP
After that, I was able to access Tomcat from the browser.
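The "Operation timed out" in the curl output is a useful hint in itself: a timeout usually means packets are being dropped somewhere on the path (for example by a security rule), whereas "Connection refused" usually means the host was reached but nothing accepted the connection. Here is a minimal Python sketch that tells these cases apart. The function name classify_connect and its labels are illustrative only.

```python
import socket


def classify_connect(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP connection attempt as "open", "refused", "timeout", or "error"."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing accepted the connection on this port.
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer at all: packets were probably dropped along the way.
        return "timeout"
    except OSError:
        # Anything else, such as an unreachable network or a failed name lookup.
        return "error"
```

With the addresses from the question, classify_connect("10.51.253.163", 8080) would be expected to return "timeout" before the security rule was added and "open" afterwards.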

Why does Node.js/Express not accept connections from localhost?

I encountered this strange behavior today and could not find a cause for it. I am using macOS Sierra.
I have this code (Express):
app.server.listen(config.port, config.address, function () {
  logger.info('app is listening on', config.address + ':' + config.port);
});
And it prints
app is listening on 127.0.0.1:5000
However, if I try to curl it, it fails.
$ curl http://localhost:5000/api/ping
curl: (56) Recv failure: Connection reset by peer
I checked my hosts file:
$ cat /etc/hosts
##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
##
127.0.0.1 localhost
255.255.255.255 broadcasthost
::1 localhost
So I ping localhost to make sure it resolves to 127.0.0.1:
$ ping localhost
PING localhost (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.061 ms
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.126 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.135 ms
^C
--- localhost ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.061/0.107/0.135/0.033 ms
I try again, but it fails
$ curl http://localhost:5000/api/ping
curl: (56) Recv failure: Connection reset by peer
Now I try to use 127.0.0.1 instead and voila, it works?
$ curl http://127.0.0.1:5000/api/ping
pong
What's wrong?
cURL is trying to connect via IPv6, but your Express server is listening on 127.0.0.1, which is IPv4.
You can force cURL to connect via IPv4 with the -4 option:
curl -4 http://127.0.0.1:5000/api/ping
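To see which addresses a client may try for a given name, and in which order, one option is to ask the resolver directly. This is a minimal Python sketch using the standard library's socket.getaddrinfo. The function name resolve_addresses is made up for illustration, and the result for "localhost" depends on the machine's hosts file and resolver configuration.

```python
import socket


def resolve_addresses(host: str, port: int) -> list[tuple[str, str]]:
    """List (family, address) pairs in the order the resolver returns them."""
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    results = []
    for family, _, _, _, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        results.append((names.get(family, str(family)), sockaddr[0]))
    return results


# "localhost" may map to both ::1 and 127.0.0.1, depending on the hosts file.
print(resolve_addresses("localhost", 5000))
# A literal IPv4 address only ever resolves to itself.
print(resolve_addresses("127.0.0.1", 5000))
```

If the first call lists an IPv6 entry, a client connecting to "localhost" may try ::1, which a server bound only to 127.0.0.1 is not listening on.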

AWS ec2 Node.js Application - Forbidden Port 80 (403 Response)

When I run the open source application "atwork" (https://github.com/ritenv/atwork) on an EC2 instance on port 80, the server responds with 403 (Forbidden) codes:
AtWork running at 0.0.0.0:80
GET / 304 3.802 ms - -
GET /users/notifications 403 3.972 ms - 9
GET /posts?limitComments=true 403 0.956 ms - 9
GET /chats 403 1.289 ms - 9
GET /streams?subscribed=true 403 0.708 ms - 9
GET /streams?unsubscribed=true 403 0.859 ms - 9
GET /users/me 403 0.847 ms - 9
GET /system-settings 304 4.803 ms - -
GET /favicon.ico 304 0.453 ms - -
GET /system-settings 304 2.766 ms - -
GET /favicon.ico 304 0.322 ms - -
However, when I run it on another port (8080), I get the following 200 responses from the server:
AtWork running at 0.0.0.0:8080
GET / 200 4.219 ms - 6412
GET /users/notifications 304 12.189 ms - -
GET /posts?limitComments=true 304 5.162 ms - -
GET /chats 304 4.344 ms - -
GET /streams?unsubscribed=true 304 5.429 ms - -
GET /streams?subscribed=true 304 5.495 ms - -
GET /users/me 200 3.478 ms - 882
GET /system-settings 304 4.809 ms - -
Kirill A Novik is online.
GET /favicon.ico 304 0.795 ms - -
I have tried the following (none of it worked):
Modified the firewall options in the security groups on the AWS console to allow all TCP traffic on all ports.
Ran iptables like this:
iptables -F
iptables -X
iptables -t nat -F
iptables -t nat -X
iptables -t mangle -F
iptables -t mangle -X
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT
iptables -A INPUT -p tcp --dport 80 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -p tcp --sport 80 -m state --state ESTABLISHED -j ACCEPT
Please help me understand what I am doing wrong and how I can make port 80 behave like the other ports.
Thank you.
There are two ways to fix this issue.
The first is to give root permissions to the EC2 user who runs the application. But running an application as root can be a security risk.
The second, which I prefer, is to run the Node.js application as a limited user behind a reverse proxy.
The application can listen on an unprivileged port (1024 or above), such as 8080.
NGINX can then run as the reverse proxy: it listens on port 80 or 443 and forwards requests to your Node.js application.
You can use an nginx config like this one: https://github.com/vodolaz095/hunt/blob/master/examples/serverConfigsExamples/nginx.conf
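The answer relies on the fact that, on Linux with default settings, binding a port below 1024 requires elevated privileges. A quick way to see how a given user and machine behave is to attempt the bind and look at the error. This is a minimal Python sketch with a hypothetical helper name try_bind. It is not part of the atwork application.

```python
import errno
import socket


def try_bind(port: int, host: str = "0.0.0.0") -> str:
    """Try to bind a TCP socket and report "ok", "permission denied", or "in use"."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except PermissionError:
            # Typical for an unprivileged user binding a port below 1024.
            return "permission denied"
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return "in use"
            raise
    return "ok"
```

Running try_bind(80) and try_bind(8080) as the limited user shows whether port 80 is available to that user at all. The outcome depends on the system's configuration, so no result is assumed here.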

TCP listening socket is not created

I'm developing a Qt application and am experiencing a rather weird network issue.
Let me show how it looks from an end-user perspective.
First I start up my server and verify that it's listening on the target port:
[user@host server]$ sudo netstat -anp | grep 30004
tcp 0 0 0.0.0.0:30004 0.0.0.0:* LISTEN 11113/./server
Then I connect to the server with telnet:
[user@host server]$ telnet localhost 30004
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Connection closed by foreign host.
Netstat shows that the connection is now established. Nothing fancy so far:
[user@host server]$ sudo netstat -anp | grep 30004
tcp 0 0 0.0.0.0:30004 0.0.0.0:* LISTEN 11113/./server
tcp 0 0 127.0.0.1:30004 127.0.0.1:34608 ESTABLISHED 11113/./server
tcp 0 0 127.0.0.1:34608 127.0.0.1:30004 ESTABLISHED 12657/telnet
Then the server drops the connection based on an application-specific timeout, which is set to 10 seconds at the moment:
[user@host server]$ sudo netstat -anp | grep 30004
tcp 0 0 0.0.0.0:30004 0.0.0.0:* LISTEN 11113/./server
tcp 0 0 127.0.0.1:30004 127.0.0.1:34608 TIME_WAIT -
I then shut down the server and verify that the listening socket is destroyed:
[user@host server]$ sudo netstat -anp | grep 30004
tcp 0 0 127.0.0.1:30004 127.0.0.1:34608 TIME_WAIT -
Finally I start up the server again, but the listening port doesn't show up anymore:
[user@host server]$ sudo netstat -anp | grep 30004
tcp 0 0 127.0.0.1:30004 127.0.0.1:34608 TIME_WAIT -
As a result, the client cannot connect to the server:
[user@host server]$ telnet localhost 30004
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
What am I doing wrong here? I'm inclined to think that this is a configuration issue, not a bug in the application.
This scenario seems to work on my laptop running Ubuntu. The output above was also produced on a Linux box.
UPDATE: One more thing that differs between these two environments is the Qt version. On my notebook I have 4.8.6; on the Linux box it's 4.6.2. Not sure if it matters.
Apparently there was an issue with the version of the Qt libraries. We upgraded to the latest 4.x.x and the problem now seems to be resolved.
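The source attributes the fix to the Qt upgrade and does not say what changed. In general, one common cause of this symptom is a listening socket that is bound without SO_REUSEADDR while an old connection on the same port is still in TIME_WAIT. That is an assumption here, not something confirmed for this case. A minimal Python sketch of a listener that sets the option before binding (make_listener is a hypothetical helper name):

```python
import socket


def make_listener(port: int, host: str = "0.0.0.0") -> socket.socket:
    """Create a TCP listener that can rebind while old connections sit in TIME_WAIT."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Must be set before bind() to have any effect on the bind.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen(5)
    except OSError:
        sock.close()
        raise
    return sock
```

With the port from the question this would be make_listener(30004). Whether the older Qt version behaved differently in this respect is not stated in the source.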
