How to Expose gRPC Services Behind (Almost) Any Load Balancer

Malte Isberner
StackRox Engineering
4 min read · Sep 30, 2019

When we at StackRox decided to do a major architectural overhaul of our product offering almost two years ago, one of the first decisions we made was to rely exclusively on gRPC for service-to-service communication. While this aligned well with our cloud-native approach (gRPC being a CNCF incubating project), it was primarily motivated by practical concerns: gRPC’s emphasis on performance, its declarative way of defining interfaces via Protocol Buffers, the ability to generate type-safe code in various languages, and the ability to automatically generate REST server bindings.

However, we soon realized that gRPC being implemented on top of HTTP/2 poses a major hurdle when it comes to exposing services on the internet. While HTTP/2 adoption is growing, HTTP/1 is not going away in the near future, and many reverse proxies/load balancers (such as Amazon’s ELB and ALB) still do not support HTTP/2 on the backend. Part of the reason is that one of the major benefits of HTTP/2 — multiplexing several transport streams on a single TCP connection — is at odds with the very idea of load balancing. The growing popularity of gRPC has prompted reverse proxy/LB vendors to support HTTP/2 backends specifically for the gRPC use case (nginx ≥1.13.10, HAProxy ≥1.9.2), but the presence of such a setup in our customers’ environments (we are not a SaaS company fully in control of our deployment infrastructure, after all) cannot be relied upon.

Issue with gRPC service exposed behind a reverse proxy/load balancer. Solid lines are requests, dashed lines are responses

The easy solution: TLS passthrough

We initially solved this problem by instructing our customers to configure their load balancers to perform TLS passthrough (in the case of Amazon, requiring the use of an NLB). This ensures that the entire byte stream of a TCP connection between our services is received just as it was sent, and since the communication is encrypted, no (reverse) proxy or load balancer can interfere with the application-level protocol (HTTP/2 or HTTP/1).

While this approach does ensure an end-to-end HTTP/2 connection, it comes with a number of drawbacks. Since the load balancer does not terminate TLS, it also cannot take care of TLS certificate management and automated provisioning, e.g., via Let’s Encrypt. Sure, functionality like this could be added to our product, but having to configure certificate provisioning in our product, rather than alongside the load balancer configuration, makes for a less than satisfying user experience.

A better solution: gRPC-web “downgrades”

We thus needed a solution that would allow talking to gRPC services exposed behind virtually any kind of HTTP load balancer, regardless of whether it supported HTTP/2 on the frontend, on the backend, or not at all.

We came across gRPC-Web, a variant of gRPC designed to be used from within a browser. Among other things, gRPC-Web explicitly does not rely on specifics of the HTTP version that is used for transport. This made it a perfect fit for the problem we needed to solve.

(Side note: HTTP/2 is important for two aspects of the gRPC specification: trailers and client streaming. Using gRPC-Web only solves the issue around trailers; it cannot help with client streaming, which, without end-to-end HTTP/2, is only possible via WebSockets. This, however, is beyond the scope of this post and will be dealt with in a follow-up article.)

While there exists a Go implementation of a gRPC-Web reverse proxy, client libraries for gRPC-Web only exist for languages like JavaScript that are executed in the web browser. We hence decided to implement “autosensing” logic in the server, causing it to send gRPC-Web responses only when doing so is both necessary and likely to succeed:

  1. The client injects an Accept: application/grpc, application/grpc-web header into the request, indicating that it can handle both regular gRPC and gRPC-Web encoded responses.
  2. The server checks whether the incoming request is an HTTP/2 request. If it is not, and the gRPC method being invoked does not use client-side streaming, it checks for the presence of an Accept: application/grpc-web header.
  3. If so, the server sends the response with the Content-Type header set to application/grpc-web instead of the usual application/grpc, transcoding the response data on the fly to the gRPC-Web protocol.
  4. The client checks the Content-Type header of the response. If it is application/grpc, the response data is forwarded as-is; if it is application/grpc-web, the response data is transcoded on the fly back to the regular gRPC protocol.
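
To make the server-side part of this concrete, here is a minimal sketch of the detection logic in Go. The handler names (grpcHandler, grpcWebHandler) and the wiring are assumptions for illustration, not our actual implementation; the hypothetical grpcWebHandler is where the Content-Type rewrite and the on-the-fly body/trailer transcoding would happen.

```go
package grpcdowngrade

import (
	"net/http"
	"strings"
)

// acceptsGRPCWeb reports whether the client advertised gRPC-Web support
// via an "Accept: application/grpc, application/grpc-web" header (step 1).
func acceptsGRPCWeb(r *http.Request) bool {
	for _, accept := range r.Header.Values("Accept") {
		for _, part := range strings.Split(accept, ",") {
			if strings.TrimSpace(part) == "application/grpc-web" {
				return true
			}
		}
	}
	return false
}

// autosensingHandler serves regular gRPC whenever the request arrives via
// HTTP/2, and downgrades to gRPC-Web only when necessary and likely to
// succeed (steps 2 and 3). A real implementation would additionally check
// that the invoked method does not use client-side streaming.
func autosensingHandler(grpcHandler, grpcWebHandler http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.ProtoMajor == 2 {
			grpcHandler.ServeHTTP(w, r) // end-to-end HTTP/2: no downgrade needed
			return
		}
		if acceptsGRPCWeb(r) {
			// Assumed to rewrite the Content-Type to application/grpc-web
			// and transcode the response body and trailers on the fly.
			grpcWebHandler.ServeHTTP(w, r)
			return
		}
		http.Error(w, "gRPC requires HTTP/2 or a gRPC-Web capable client",
			http.StatusHTTPVersionNotSupported)
	})
}
```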

While the server-side part of the above protocol can easily be implemented (the Go gRPC server implements the standard http.Handler interface), modifying the client behavior directly would require changes to the gRPC library. We therefore opted for a less invasive approach: spawning a local HTTP/2 proxy that performs the above client-side transcoding steps, using Go’s awesome httputil.ReverseProxy type.
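
A rough sketch of that local proxy, under the same assumptions as above, could look as follows. The transcodeGRPCWebToGRPC helper is hypothetical and only hinted at here; in particular, converting the gRPC-Web trailers frame back into real HTTP trailers, and serving the proxy to the local gRPC client over h2c, take more plumbing than shown.

```go
package grpcdowngrade

import (
	"io"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

// newDowngradingProxy returns a local reverse proxy that injects the Accept
// header on the way out (step 1) and undoes a gRPC-Web downgrade on the way
// back (step 4).
func newDowngradingProxy(upstream *url.URL) *httputil.ReverseProxy {
	proxy := httputil.NewSingleHostReverseProxy(upstream)

	origDirector := proxy.Director
	proxy.Director = func(req *http.Request) {
		origDirector(req)
		// Step 1: advertise that we can handle both response formats.
		req.Header.Set("Accept", "application/grpc, application/grpc-web")
	}

	proxy.ModifyResponse = func(resp *http.Response) error {
		contentType := resp.Header.Get("Content-Type")
		if !strings.HasPrefix(contentType, "application/grpc-web") {
			return nil // regular gRPC response: forward as-is
		}
		// Step 4: the server downgraded the response; restore the regular
		// gRPC Content-Type (preserving any "+proto"-style suffix) and
		// framing before handing it to the gRPC client.
		resp.Header.Set("Content-Type",
			"application/grpc"+strings.TrimPrefix(contentType, "application/grpc-web"))
		resp.Body = transcodeGRPCWebToGRPC(resp.Body)
		return nil
	}
	return proxy
}

// transcodeGRPCWebToGRPC stands in for the on-the-fly conversion of a
// gRPC-Web response body (with its trailers frame) back into regular gRPC
// framing plus HTTP trailers. The actual conversion is omitted in this sketch.
func transcodeGRPCWebToGRPC(body io.ReadCloser) io.ReadCloser {
	return body
}
```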

Setup with automatic downgrading of responses to gRPC-Web, based on client request format (non-HTTP/2 and with relevant “Accept” header). Solid lines are requests, dashed lines are responses.

Conclusion

It should be noted that the solution we presented is not perfect: it adds an additional hop (albeit only via the local loopback interface), and it doesn’t support client-streaming RPC calls. Nevertheless, it greatly reduces deployment headaches by allowing our services to be exposed like any other service in our customers’ environments, without needing any special treatment due to being gRPC-based.
