Recently, Chaperone got an exciting new feature that makes it possible to dynamically launch scripts when a port connection occurs. This is an old idea, and in fact was the purpose of the original inetd “super server” that was part of BSD 4.3 in 1986!
It turns out that this feature is really ideal for lean Docker microservice containers because no processes are started unless a port connection occurs. It makes it possible to publish one or more ports that perform auxiliary functions without consuming a daemon process to do so.
One extremely useful application is putting monitoring scripts inside the container so they can use knowledge of the application itself to perform better service monitoring. The monitoring scripts are also part of the container codebase rather than external to it, so as the software evolves, the monitoring scripts can be kept up-to-date by the application developer.
One of our clients shared some code to do just that with a container which implements a single-process Python directory microservice. It relies upon a RethinkDB database which lives in another container, but latency issues often caused the service to fail unexpectedly.
I’ll start with a simple example that’s similar to theirs, then show you the script they ended up with.
The Simple and Easy Solution
With just a few lines of Python, here is an internal service monitor which returns a consistent JSON response to describe the container’s health:
#!/usr/bin/python3
import json
import urllib.request
import urllib.error

status = {'status': 'OK'}   # status if no problems

try:
    with urllib.request.urlopen("http://localhost:5001/version") as response:
        jresp = json.loads(response.read().decode())
except Exception as err:
    status = {'status': 'ERROR', 'message': "Service problem: " + str(err)}

print(json.dumps(status))   # send the result over the socket port
The script above checks to see if the directory service (running on port 5001) responds normally with its version number (a JSON response).
Then, here is the Chaperone configuration which “connects” the port to the above script (which was /serv_app/monitor inside their container):
# Define monitor service on port 7101
monitor-port.service: {
  type: inetd,
  port: 7101,
  command: "/serv_app/monitor",
}
The above configuration tells Chaperone to listen on TCP port 7101 for any connections, and when one is received, to launch the /serv_app/monitor script with the socket connected to its stdin and stdout. That makes it easy for simple scripts to act as genuine TCP services.
Once the container is running, any TCP client can check its status. For example, assume that the container exposes port 7101 at service.example.com.
Accessing the port yields a simple JSON result if the service is ok:
$ nc service.example.com 7101
{"status": "OK"}
and a structured error response if there is a problem:
{"status": "ERROR", "message": "Service does not respond: <urlopen error [Errno 111] Connection refused>"}
That’s all there is to it! Now, port 7101 can be used by a wide variety of service monitoring tools, and it’s possible to support consistent monitoring for all containers.
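Monitoring tools don’t have to shell out to nc, either. As a sketch of how a tool might probe the port and interpret the reply (the `parse_health` and `check` helper names are hypothetical, not part of Chaperone):

```python
import json
import socket

def parse_health(raw):
    """Interpret the monitor's one-line JSON reply as (ok, message)."""
    status = json.loads(raw)
    return status.get("status") == "OK", status.get("message", "")

def check(host, port, timeout=5.0):
    # One-shot TCP probe, equivalent to `nc host port`.
    with socket.create_connection((host, port), timeout=timeout) as conn:
        raw = conn.makefile().readline()
    return parse_health(raw)

# Parsing the two example replies shown above:
print(parse_health('{"status": "OK"}'))
print(parse_health('{"status": "ERROR", "message": "Service does not respond"}'))
```

Because every container returns the same JSON shape, one small client like this can sweep an entire fleet of services.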
The Final Monitoring Script
The above is quite simplified, and our client wanted to do a bit more. They wanted not only to check that the service was running in the container, but also to verify that the RethinkDB instance was reachable, and to provide diagnostic information to the monitoring tool.
The final script ended up looking like this:
#!/usr/bin/python3
import os
import json
import rethinkdb
import urllib.request
import urllib.error

def result(err = None):
    # returns an OK-result or error in JSON
    r = {'monitor-version': 1.0, 'status': 'OK'}
    if err:
        r['status'] = 'ERROR'
        r['message'] = err
    print(json.dumps(r))
    exit(0)

def check_services():
    # Try the service INSIDE the container first
    try:
        with urllib.request.urlopen("http://localhost:5000/version") as response:
            jresp = json.loads(response.read().decode())
            version = jresp['dirserv-version']
            if version != 1.0:
                result("dirserv-version has unexpected value: " + str(version))
    except urllib.error.URLError as err:
        result("Service does not respond: " + str(err))

    HOST = os.environ["HOST_IP"]
    PORT = 28015

    # The local service is OK, make sure we are also seeing the
    # RethinkDB instance.
    try:
        db = rethinkdb.connect(HOST, PORT)
    except rethinkdb.errors.ReqlDriverError as derr:
        result("RethinkDB endpoint error: " + str(derr))

    result()

try:
    check_services()
except Exception as ex:
    result("Unexpected error: " + str(ex))

result()
Their entire Chaperone configuration for both the app and the monitoring service is here:
settings: {
  env_set: {
    # Derive the IP of our docker host from the default route
    HOST_IP: "`ip route | awk '/default/ {print $3}'`",
  }
}

# Define cluster directory service
clusterdir.service: {
  type: simple,
  command: "python3 /serv_app/cluster-directory.py",
  service_groups: "IDLE",
}

# Define monitor service on port 7101
monitor-port.service: {
  type: inetd,
  port: 7101,
  command: "/serv_app/monitor",
}
Note above how HOST_IP is derived; it points to the interface of the Docker host. In their actual setup, HOST_IP is provided on the docker run command line, since the RethinkDB instance is not always running on the same host.
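For illustration, supplying HOST_IP on the docker run command line might look like this (the image name and address are hypothetical placeholders):

```shell
# Publish the monitor port and tell the container where RethinkDB lives;
# "example/dirserv" and 10.1.2.3 are placeholders, not from the article.
docker run -d -p 7101:7101 -e HOST_IP=10.1.2.3 example/dirserv
```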
Further Reading
- If you’re not familiar with Chaperone, here is a good starting point.
- Here is a similar monitoring script by Mike Peters which uses xinetd. The same technique works great with Chaperone.
- If you’re really feeling crazy, you can take advantage of many services which were written specifically to work with inetd, like git! Here is an overview of how you can run git as a dynamic port-connected service. (Write me and tell me if you did this!)
- Here’s an article that gives a good overview of some of the external monitoring options for Docker containers. The problem with these is that they require external knowledge of what the container is doing, and offer only limited ability to monitor application specifics the way the solution above does.