Dynamic NGINX Upstreams from Consul via lua-nginx-module

In 2016, I delivered a short talk at DevOps Days Austin contrasting hacking with engineering, using the problem of dynamically resolving NGINX upstreams from Consul as the example. This article presents the technical solution and spares you the rhetoric.

Those interested may review the slides from the talk here: Hacking vs Engineering.

Requirements

  • Must be able to present various service functionality via a single hostname to the user
  • Must be able to expand backend service capacity without running configuration management on load balancers
  • Must resolve backend services for associated URI paths from HashiCorp Consul
  • Must avoid writing to intermediate files and reloading NGINX if possible
  • Must not add more overhead by adding more services/config management
  • While a watch might be preferred in some cases, it is acceptable for the first release if upstreams are simply refreshed at a regular interval

Solution

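After researching pre-existing work and deciding we did not want to start with a C implementation, we created a solution composed of the following components:

  • NGINX with lua-nginx-module
  • lua-resty-http for cosocket-based HTTP requests
  • lua-cjson for JSON processing
  • A local Consul agent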

How it Works

A Consul agent resides on every system. This presents us with a known local URI which we can rely upon to resolve upstreams cluster wide.
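
As a rough illustration of what that local URI provides, the sketch below builds the agent URL for a service and flattens an already-decoded health response into host/port pairs. It assumes the agent's default HTTP address (127.0.0.1:8500) and the /v1/health/service endpoint with the passing filter. The helper names agent_url and peers_from_health are hypothetical and are not taken from the lab code.

```lua
-- Assumption: the local agent listens on Consul's default HTTP address.
local AGENT = "http://127.0.0.1:8500"

-- Build the health-endpoint URL for one service. "?passing" asks Consul
-- to return only instances whose health checks are passing.
local function agent_url(service)
  return AGENT .. "/v1/health/service/" .. service .. "?passing"
end

-- Turn an already-decoded health response (a Lua array of entries)
-- into a list of { host = ..., port = ... } peers.
local function peers_from_health(entries)
  local peers = {}
  for _, entry in ipairs(entries) do
    local svc = entry.Service
    -- Service.Address may be empty, in which case the node address applies.
    local host = svc.Address
    if host == nil or host == "" then
      host = entry.Node.Address
    end
    peers[#peers + 1] = { host = host, port = svc.Port }
  end
  return peers
end
```

Because the same URL is valid on every machine running an agent, nothing about the load balancer's configuration has to change when backends are added or removed.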

Traffic Flow

NGINX workers are single-threaded event loops, which allows us to safely maintain state inside each worker via the process described below.
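
To make the point about state concrete, here is a minimal sketch of per-worker upstream state kept in an ordinary Lua table. It assumes simple round-robin selection, which the article does not specify. The names upstreams, set_peers, and next_peer are hypothetical. No locking is needed because only one piece of Lua code runs at a time within a worker.

```lua
-- Per-worker state: service name -> { peers = {...}, index = n }.
local upstreams = {}

-- Replace the peer list for a service (called by the refresh step).
-- The rotation index is kept so a refresh does not restart the cycle.
local function set_peers(service, peers)
  local up = upstreams[service]
  if not up then
    up = { index = 0 }
    upstreams[service] = up
  end
  up.peers = peers
end

-- Return the next peer in round-robin order, or nil if none are known.
local function next_peer(service)
  local up = upstreams[service]
  if not up or #up.peers == 0 then
    return nil
  end
  -- Wrap around the list; this also stays in range if the list shrank.
  up.index = up.index % #up.peers + 1
  return up.peers[up.index]
end
```

Each worker holds its own copy of this table, so workers rotate independently of one another.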

Resolution Flow

On NGINX startup, init_worker_by_lua is called to load our consul.balancer.lua module. We cannot use any potentially blocking APIs at worker init, so within the balancer module we kick off the Consul upstream refresh process using ngx.timer.at. HTTP requests to the local Consul agent are handled with the lua-resty-http cosocket driver, and JSON responses are processed with lua-cjson. After this bit of setup, we have a cosocket-backed request refreshing our upstreams from Consul at the defined interval.
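
The refresh loop described above might look roughly like the following sketch. It is not the lab's actual module: the file name consul_refresh.lua, the service names, the interval, and the agent address are all assumptions, and it uses cjson.safe (shipped with lua-cjson) so that a malformed response returns an error instead of raising one.

```lua
-- consul_refresh.lua (hypothetical module name)
-- Load from nginx.conf with:
--   init_worker_by_lua_block { require("consul_refresh").start() }
local http = require "resty.http"
local cjson = require "cjson.safe"

local M = { health = {} }  -- service name -> decoded health entries

local AGENT = "http://127.0.0.1:8500"   -- assumption: default agent address
local INTERVAL = 5                       -- assumption: seconds between refreshes
local SERVICES = { "users", "orders" }   -- hypothetical service names

-- Fetch and decode the passing instances of one service.
local function fetch(service)
  local httpc, new_err = http.new()
  if not httpc then
    return nil, new_err
  end
  httpc:set_timeout(2000)  -- milliseconds
  local uri = AGENT .. "/v1/health/service/" .. service .. "?passing"
  local res, req_err = httpc:request_uri(uri, { method = "GET" })
  if not res then
    return nil, req_err
  end
  if res.status ~= 200 then
    return nil, "unexpected status " .. res.status
  end
  return cjson.decode(res.body)  -- returns nil, err on bad JSON
end

-- Timer callback: cosockets are permitted here, unlike in init_worker itself.
local function refresh(premature)
  if premature then
    return  -- the worker is shutting down
  end
  for _, service in ipairs(SERVICES) do
    local entries, err = fetch(service)
    if entries then
      M.health[service] = entries
    else
      -- Keep the previous entries so a failed poll does not empty the pool.
      ngx.log(ngx.ERR, "consul refresh failed for ", service, ": ", err)
    end
  end
  local ok, err = ngx.timer.at(INTERVAL, refresh)  -- re-arm for the next cycle
  if not ok then
    ngx.log(ngx.ERR, "failed to re-arm refresh timer: ", err)
  end
end

-- Schedule the first refresh immediately after worker start.
function M.start()
  local ok, err = ngx.timer.at(0, refresh)
  if not ok then
    ngx.log(ngx.ERR, "failed to start refresh timer: ", err)
  end
end

return M
```

Re-arming the timer at the end of each run, rather than scheduling all runs up front, keeps at most one refresh in flight per worker.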

The rest is handled at the request and upstream processing phases. We have defined our URI fragments via standard location blocks and use ngx.ctx to pass the service name to the upstream. The upstream is handled by the lovely balancer_by_lua_block directive. This allows us to grab a reference to our balancer module, as it is already cached, and retrieve the current upstream, saving state for the next request.
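
A minimal sketch of those two phases follows, assuming round-robin selection and a hypothetical module named consul_balancer whose upstreams table is filled by the refresh step. The nginx directives that would call each function appear in comments. The service name and the choice of rewrite_by_lua_block for the location are illustrative assumptions, not the lab's configuration.

```lua
-- consul_balancer.lua (hypothetical module name)
local ngx_balancer = require "ngx.balancer"

-- service name -> { peers = { { host = "10.0.0.1", port = 8080 }, ... }, index = 0 }
-- Assumed to be populated by the periodic Consul refresh.
local M = { upstreams = {} }

-- Request phase. In a location block, for example:
--   location /users/ {
--     rewrite_by_lua_block { require("consul_balancer").route("users") }
--     proxy_pass http://consul_upstream;
--   }
function M.route(service)
  ngx.ctx.service = service
end

-- Upstream phase. In the upstream block:
--   upstream consul_upstream {
--     server 0.0.0.1;  # placeholder; the peer is chosen in Lua
--     balancer_by_lua_block { require("consul_balancer").balance() }
--   }
function M.balance()
  local service = ngx.ctx.service
  local up = M.upstreams[service]
  if not up or #up.peers == 0 then
    ngx.log(ngx.ERR, "no peers known for service: ", tostring(service))
    return ngx.exit(ngx.HTTP_BAD_GATEWAY)
  end
  -- Advance the rotation and remember the position for the next request.
  up.index = (up.index or 0) % #up.peers + 1
  local peer = up.peers[up.index]
  -- set_current_peer expects an IP address, not a hostname.
  local ok, err = ngx_balancer.set_current_peer(peer.host, peer.port)
  if not ok then
    ngx.log(ngx.ERR, "failed to set current peer: ", err)
    return ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
  end
end

return M
```

Because require caches loaded modules per worker, both phases see the same table that the refresh timer updates.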

Getting Started

“That sounds great!” you say, but... that seems like a lot of stuff to put together. Well, sir or madam, you have come to the right place, for I have done the leg work and put together a working example that will help you explore the possibilities of dynamic load balancing with NGINX, Consul, and Lua. All of this from the safety of your own local laptop!

Fortunately for us, Ubuntu 16.04 (Xenial) ships a fairly fresh NGINX package that includes a Lua module new enough to support our endeavor.

The NGINX Consul Lua Vagrant Lab is available on GitHub at https://github.com/sigil66/nginx-consul-vagrant

Future Work

What has been presented here is just a start. Some of the items we are planning on going over:

  • A watch based upstream push/refresh
  • Triggering refreshes or deregistration when failures exceed a threshold to a given upstream
  • Consul prepared query integration
  • More load balancing methods
  • Sticky sessions via Lua
  • Health based routing and upstream backoff

Special Thanks

  • https://twitter.com/agentzh for all his work on OpenResty and NGINX
  • CloudFlare for supporting OpenResty
  • The NGINX team for such a great web server
  • The Consul team at HashiCorp
  • BMC Software, for supporting this work

Note: this article was previously posted on Medium.