When application containers talk to each other, they usually do so via the docker0 interface of the host, either with other containers running on the same host or with containers running in the same cluster (Kubernetes or Swarm, for example). In the latter case the cluster nodes use a trick/workaround with a third-party communication channel, e.g. IPSec tunnels or a CNI plugin with a dedicated subnet, to allow container communication across nodes.
This cross-node communication (usually) works very well and is (normally) hassle-free, as long as the nodes run in the same LAN.
But what if application containers are physically separated into different locations and the hosts are not part of the same (Kubernetes) cluster?
In this particular scenario the same application is deployed into two clusters in two different locations. Because the application is cluster-aware, it knows the other cluster peers from a central storage and tries to communicate with them. And because the application is started inside the container with its own internal address, it tries to connect to the other containers using these internal addresses, e.g. 10.42.93.71. This fails, of course.
There are a couple of possibilities to achieve this cross-location communication. But depending on how the application inside the container works, not all of them lead to a solution.
The first obvious way would be to use public DNS records. Once the application is started, it announces its own public DNS record into the cluster storage. The public DNS record obviously needs to be known to the container during startup. This could be a shell script which determines the own public DNS record based on the public IP (using curl ifconfig.co for example), or this information could come from an environment variable set during startup of the container.
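A minimal sketch of such a startup snippet could look like this (the curl ifconfig.co part is mentioned above, the reverse lookup via dig is an assumption for illustration):

# Determine the host's public IP and try to reverse-resolve it to a DNS name
PUBLIC_IP=$(curl -s ifconfig.co)
PUBLIC_DNS=$(dig +short -x "$PUBLIC_IP" | sed 's/\.$//')
# Fall back to the bare public IP if no PTR record exists
echo "Would announce ${PUBLIC_DNS:-$PUBLIC_IP} to the cluster storage"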
The problem with this potential solution? If you run this application in multiple containers on each location, you might end up with a lot of DNS entries and even more Ingress rules. And you might even lose deployment flexibility and scalability, depending on your cluster configuration.
This would have been a solution if the application were only deployed once on each location. In our case, this would not work out.
Rancher recently announced a new project called Submariner. The goal of this project is to enable communication between Kubernetes clusters.
Image Source: submariner.io.
However, Submariner is not yet considered production-ready, so one should be cautious. That's also the reason why this potential solution was not chosen.
The architecture drawing shows a VPN tunnel between the two locations. This means that the Docker hosts on each location are able to communicate with each other, however the containers can't. Sure, the hosts of both locations could all be part of the same cluster, but to avoid cluster issues with a flaky VPN connection, separate location-based clusters were built (and this proved to be a good choice!).
So what if the application wouldn't announce its own container IP address, but the primary IP address of the Docker host instead? This, in combination with exposing the application port on the host (and not using an HTTP Ingress), allows communication between the containers across locations through the VPN tunnel. Because the Docker host addresses are used, all traffic passes through the VPN tunnel and is translated on each host to forward the traffic to the corresponding container.
This scenario even works when the application is deployed several times on each location, as long as it is not deployed more than once on the same host.
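The reason for this limitation is the published host port: a second container on the same host cannot publish the same port again. A sketch, using the port mapping from further below in this article:

# The first container already publishes host port 8888, so starting a second
# container with the same mapping on the same host fails (the port is already in use)
docker run -it -p 8888:80 nginx /bin/bash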
But how can the container dynamically know the main IP address of its host? Especially if it is not known which interface is the primary one (there may be eth0, ens160, eno16777984, bond1, etc)?
One way we came up with is to use the hostname of the host. The primary IP address can be found in /etc/hosts:
ckadm@dockerhost1:~$ grep $(hostname) /etc/hosts
192.168.252.201 dockerhost1.example.com dockerhost1
Note: Of course this requires a correctly set up hostname and /etc/hosts!
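If /etc/hosts cannot be relied on, an alternative sketch is to ask the routing table which source address would be used for outbound traffic (1.1.1.1 is just an arbitrary reachable destination here):

# Determine the primary IP from the route an outbound packet would take
DOCKER_HOST_IP=$(ip route get 1.1.1.1 | awk '{for (i=1; i<=NF; i++) if ($i == "src") {print $(i+1); exit}}')
echo $DOCKER_HOST_IP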
Using the /etc/hosts method, the host's primary IP address can be read and saved into an environment variable when starting a container:
root@dockerhost1:~# docker run -it -e "DOCKER_HOST_IP=$(grep $(hostname) /etc/hosts | awk '{print $1}')" ubuntu /bin/bash
Inside the container, the host's primary IP can be shown using the environment variable:
root@fcbb308c580c:/# echo $DOCKER_HOST_IP
192.168.252.201
Of course the application port for cluster communication needs to be exposed, too.
The containers are started on both locations with port 8888 exposed on the host, mapped to port 80 inside the container.
root@dockerhost1:~# docker run -it -p8888:80 -e "DOCKER_HOST_IP=$(grep $(hostname) /etc/hosts | awk '{print $1}')" nginx /bin/bash
root@dockerhost2:~# docker run -it -p8888:80 -e "DOCKER_HOST_IP=$(grep $(hostname) /etc/hosts | awk '{print $1}')" nginx /bin/bash
curl and ping were installed in both containers to test the connection across the locations:
root@0e73605363b6:/# apt-get update && apt-get install curl iputils-ping net-tools
root@3d7551f8ed7f:/# apt-get update && apt-get install curl iputils-ping net-tools
The host IP address is verified:
root@0e73605363b6:/# echo $DOCKER_HOST_IP
192.168.252.201
root@3d7551f8ed7f:/# echo $DOCKER_HOST_IP
10.10.1.112
Note: The end application with adjusted settings would now announce this IP address and the exposed listener port to the central storage.
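Purely as an illustration (the central storage and its API are not covered in this article, the endpoint below is hypothetical), such an announcement could look like this:

# Hypothetical announcement of the Docker host IP and the exposed port to a central storage
curl -s -X PUT "http://cluster-storage.example.com/peers/$(hostname)" \
  -d "{\"address\": \"${DOCKER_HOST_IP}\", \"port\": 8888}"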
Now the communication using the host IPs can be tested:
root@0e73605363b6:/# ping 10.10.1.112
PING 10.10.1.112 (10.10.1.112) 56(84) bytes of data.
64 bytes from 10.10.1.112: icmp_seq=1 ttl=61 time=10.4 ms
64 bytes from 10.10.1.112: icmp_seq=2 ttl=61 time=9.47 ms
64 bytes from 10.10.1.112: icmp_seq=3 ttl=61 time=9.34 ms
^C
root@3d7551f8ed7f:/# ping 192.168.252.201
PING 192.168.252.201 (192.168.252.201) 56(84) bytes of data.
64 bytes from 192.168.252.201: icmp_seq=1 ttl=61 time=10.4 ms
64 bytes from 192.168.252.201: icmp_seq=2 ttl=61 time=9.45 ms
64 bytes from 192.168.252.201: icmp_seq=3 ttl=61 time=9.38 ms
^C
Checking that nginx is running and listening:
root@0e73605363b6:/# nginx
root@0e73605363b6:/# netstat -lntup
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 2852/nginx: master
root@3d7551f8ed7f:/# nginx
root@3d7551f8ed7f:/# netstat -lntup
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 2847/nginx: master
And testing communication via the exposed port, through the VPN tunnel, using curl:
root@0e73605363b6:/# curl http://10.10.1.112:8888 -I
HTTP/1.1 200 OK
Server: nginx/1.17.4
Date: Wed, 25 Sep 2019 10:22:23 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 24 Sep 2019 14:49:10 GMT
Connection: keep-alive
ETag: "5d8a2ce6-264"
Accept-Ranges: bytes
root@3d7551f8ed7f:/# curl http://192.168.252.201:8888 -I
HTTP/1.1 200 OK
Server: nginx/1.17.4
Date: Wed, 25 Sep 2019 10:26:12 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 24 Sep 2019 14:49:10 GMT
Connection: keep-alive
ETag: "5d8a2ce6-264"
Accept-Ranges: bytes
Using docker run, environment variables can be set to dynamic values, for example from a command output (as shown before). Unfortunately this does not work when images are built using a Dockerfile, because everything in the Dockerfile is evaluated at build time and cannot capture the runtime Docker host. A workaround (for the workaround) needs to be found.
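One way to do this is to defer the lookup to an entrypoint script which only runs when the container starts (combined with the host network setting described next). In Dockerfile terms this could look like the following sketch (the file name entrypoint.sh is just an example):

# Dockerfile sketch: the IP lookup cannot be done at build time,
# so it is deferred to an entrypoint script executed at container start
FROM nginx
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]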
By default a Docker container runs in a "managed" network, in a subnet shared by all the containers. But there is also the possibility to use the host's network directly. In our infrastructure we use Rancher, and the service settings allow selecting the Host network (Service -> Networking -> Network: Host).
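With plain Docker (outside of Rancher), the equivalent would be the --network host flag:

# Run the container directly in the host's network namespace;
# published ports (-p) are not needed (and are ignored) in this mode
docker run -it --network host nginx /bin/bash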
Inside the container this will now look like this:
root@dockerhost2:/# ip a sh dev eth0
2: eth0:
link/ether 02:6d:d2:e4:74:83 brd ff:ff:ff:ff:ff:ff
inet 10.10.1.112/24 brd 10.10.1.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::6d:d2ff:fee4:7483/64 scope link
valid_lft forever preferred_lft forever
root@dockerhost2:/# hostname
dockerhost2
So both the IP address and the hostname are taken from the host. This way the information can be used in the entrypoint script:
# Use Docker host hostname to find primary IP address of Docker host
# note: requires Docker container to use host network (Rancher -> Service -> Networking -> Network: Host)
DOCKER_HOST_IP=$(grep $(hostname) /etc/hosts | awk '{print $1}')
export DOCKER_HOST_IP
This variable $DOCKER_HOST_IP can then be used for further actions.
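For example (purely illustrative, the placeholder, config path and application name are hypothetical), the entrypoint script could substitute the announce address into the application's configuration before starting it:

# Substitute the announce address (host IP and exposed port) into the application config
sed -i "s/__ANNOUNCE_ADDRESS__/${DOCKER_HOST_IP}:8888/" /etc/myapp/config.yml
# Start the application as PID 1
exec /usr/local/bin/myapp --config /etc/myapp/config.yml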
Using the Docker host's primary IP address and exposed ports can serve as a workaround to enable cross-cluster communication between containers. In this example a VPN tunnel was used, but this could also be replaced by a public IP (and/or DNAT) to the Docker host. However, the most important point is the application itself: although the application was built cluster-aware, it did not account for the possibility of running a cluster across multiple networks, at least not inside an application container.
Either the application needs to allow such scenarios, or a workaround needs to be implemented on the system side. We chose the latter, ergo this article ;-).
If the whole Docker/Kubernetes infrastructure is too much hassle and you simply want to enjoy an available Kubernetes cluster to deploy your applications, check out the Private Kubernetes Cloud Infrastructure at Infiniroot!