I needed to ping an HTTPS endpoint of our monitoring service from a small internet router that runs a basic UNIX system. I was building a monitor for the WAN connection on that router. The problem is that this particular router has a very limited toolbox in its shell, and specifically lacks wget with SSL support.
Fortunately it has openssl, which comes with the embedded s_client command. This is a very simple low-level SSL connection tool. It connects to an endpoint, performs the SSL handshake and establishes a transparent SSL tunnel that you can write to and read from via standard shell pipes.
That allows me to send a simple HTTP request to ping the service and close the connection. This is the shell function that does the trick:
openssl s_client -host $1 -port 443 -ign_eof << HTTP
GET $2 HTTP/1.1
It expects two arguments: the hostname to connect to and the path to make the GET request to.
There are two important things that you may not notice. The first is the -ign_eof parameter, which keeps openssl running even after the input ends. Because I use a direct shell pipe, the input is sent to the command and ends immediately. Without -ign_eof, openssl would end as well, long before it even sends the first packet. And it needs to send a lot before the SSL tunnel is ready.
The second is the Connection: close header. Because I asked openssl to ignore the end of input, the only other way for it to stop is for the remote side to close the connection. By default, however, our HTTP server keeps the connection open in keep-alive mode. This HTTP header tells the server to close the connection as soon as the request is done, so openssl stops and returns control to the monitoring bash script, which checks the connection in a loop and calls the function.
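Putting those two details together, a complete version of the function could look like the sketch below. This is my reconstruction under a few assumptions, not necessarily the exact script from the router: the helper name build_request is made up for illustration, the Host header is included because HTTP/1.1 requires one, and openssl is assumed to accept the same -host, -port and -ign_eof flags used above.

```shell
# Print a minimal HTTP/1.1 request for host $1 and path $2.
# HTTP uses CRLF line endings and an empty line to end the headers.
# (build_request is a hypothetical helper name.)
build_request() {
    printf 'GET %s HTTP/1.1\r\n' "$2"
    printf 'Host: %s\r\n' "$1"
    # Ask the server to close the connection after the response,
    # which is what finally lets openssl exit.
    printf 'Connection: close\r\n'
    printf '\r\n'
}

# Send the request through the SSL tunnel opened by openssl s_client.
# -ign_eof keeps openssl running after its input has ended.
https_get() {
    build_request "$1" "$2" | openssl s_client -host "$1" -port 443 -ign_eof
}
```

Calling it as https_get monitor.example.com /ping (a made-up host and path) would write whatever openssl prints, including the server response, to standard output.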
I set up the above script in the router’s startup scripts and ran it for a few days. But then the monitoring service suddenly stopped receiving pings from the router.
After the initial shock of receiving an “Internet is down” message and then verifying that it was a false alarm, I opened an SSH session to the router once more to see what was going on.
After a quick look, the openssl s_client command seemed to be stuck. It turns out that it doesn’t have any timeout on the connection. So when the monitoring service was unreachable or the connection was otherwise interrupted before the SSL handshake finished, the command hung indefinitely. That resulted in a long stripe of downtime in my monitor even though the connection was actually fine.
Although a false down alarm is better than a false up state, too many false alarms lead to the monitor being ignored and becoming useless. So I needed to find a way to make the script bulletproof and recover from a stuck openssl request.
Normally you would use the timeout command, which is usually part of standard distributions. But on this router, like curl, it is not included. So I had to implement it directly in the shell script again.
A simple solution would be to run the command in the background and just add sleep X && kill $PID after that. This would wait X seconds and then kill the process. If it was still running, that is exactly what we want. If the command had already exited, kill would just do nothing. This would solve the problem, but it would also always wait X seconds, even if the command is not stuck.
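To make the drawback visible, here is a minimal sketch of that naive approach. The function name naive_timeout and its argument order are my own choices for illustration.

```shell
# naive_timeout SECONDS COMMAND [ARGS...]   (hypothetical helper)
# Runs the command in the background, then always sleeps for the
# full limit before trying to kill it.
naive_timeout() {
    limit=$1
    shift
    "$@" &
    pid=$!
    # If the command has already exited, kill fails; the error is ignored.
    sleep "$limit" && kill "$pid" 2>/dev/null || true
}
```

Even naive_timeout 10 true takes the full ten seconds to return, although true finishes immediately.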
The wait command is a built-in in most shells that takes a PID as its first argument and waits until that process is done. We can use it by putting the sleep and kill in the background as well and then waiting for the process to end. If the process ends by itself, wait returns and the script continues. If the process is stuck and gets killed by the kill command after the sleep, wait returns after that and the script continues as well.
The final function takes the number of seconds to time out after as its first argument, followed by the arguments for https_get. Its body:
https_get $2 $3 &
PID=$!
( sleep $1 && kill $PID 2>/dev/null ) &
wait $PID
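The same pattern can be sketched as a self-contained helper that wraps any command. The name with_timeout and the generic command arguments are assumptions of mine for illustration; only the background job, the sleep and kill watchdog, and the wait come from the approach described above.

```shell
# with_timeout SECONDS COMMAND [ARGS...]   (hypothetical helper)
# Returns as soon as the command finishes, or once it has been killed
# after SECONDS, whichever comes first.
with_timeout() {
    limit=$1
    shift
    # Start the command in the background and remember its PID.
    "$@" &
    cmd_pid=$!
    # Watchdog: kill the command if it is still running after the limit.
    # If it has already exited, kill fails and the error is discarded.
    ( sleep "$limit" && kill "$cmd_pid" 2>/dev/null ) &
    # Block until the command exits by itself or is killed.
    wait "$cmd_pid"
}
```

A call such as with_timeout 10 sleep 60 should return after roughly ten seconds with a non-zero status, while a command that finishes early returns right away; the watchdog keeps sleeping in the background and its later kill finds nothing to do. One caveat I would expect: when the wrapped command is a shell function, the saved PID belongs to the subshell running it, so a child process such as openssl may outlive the kill.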
Hopefully this will now run forever.