Service Mesh & Proxies · Study Guide

Envoy Proxy, from listeners to TLS 1.3

A structured walk through Hussein Nasser's Envoy crash course: the architecture and mental model first (downstream/upstream, clusters, listeners, filters, connection pools, threading), then a hands-on build of Envoy as an L7 and L4 proxy with HTTPS, HTTP/2, and a hardened TLS configuration.

~1h 07m runtime · L7 & L4 proxying · TLS 1.2 / 1.3 · HTTP/2

Source video: Envoy Proxy Crash Course by Hussein Nasser · youtu.be/40gKzHQWgP0
Demo configs referenced in the video live in his javascript_playground repo.

◈ Why this matters for your stack

Envoy is the data plane under Istio — every sidecar in your mesh is an Envoy process, and Pilot/istiod simply pushes it configuration over the xDS APIs. Listeners, clusters, routes, and filter chains are exactly the objects Istio generates from your VirtualService, DestinationRule, and AuthorizationPolicy resources. Knowing raw Envoy makes the output of istioctl proxy-config legible.

01

What is Envoy?

▸ 0:00

Envoy is a high-performance, open-source L7 proxy and communication bus for service-oriented architectures. It was originally written in C++ at Lyft to help move the company off a monolith, and is now a CNCF-graduated project that underpins most modern service meshes.

The defining idea is that the network should be transparent to applications. Rather than every microservice re-implementing retries, timeouts, load balancing, TLS, circuit breaking, and observability — once per language, with subtle differences — Envoy runs out of process next to the app and absorbs all of that responsibility.

  • Out-of-process / sidecar: Envoy is a separate process. Your app talks to localhost and Envoy handles the network. This means it works with any language and can be upgraded independently.
  • L7-aware: it understands HTTP/1.1, HTTP/2, gRPC, and more, so it can route, retry, and observe based on the actual content of requests — not just IP/port.
  • Dynamically configurable: the xDS family of APIs (LDS, CDS, EDS, RDS, SDS) lets a control plane push config at runtime with no restarts.
  • Observable by design: rich stats, access logs, and distributed tracing are first-class, not bolted on.
02

Current vs. desired architecture

▸ 0:48

The motivating story is the monolith-to-microservices migration. In a monolith, "calling another component" is a function call — there is no network, no partial failure, no per-service retry logic. As you split that monolith into services, every one of those in-process calls becomes a network call, and suddenly each service needs to handle service discovery, load balancing, timeouts, retries, TLS, and metrics.

Doing that inside each application is painful: the logic is duplicated across languages, evolves inconsistently, and ties networking concerns to your business code. The desired architecture pushes all of it down into a uniform proxy layer. Each service gets an Envoy alongside it; the services speak plain local HTTP, and the mesh of Envoys forms a consistent, observable, controllable communication fabric.

◆ Key takeaway

Envoy's value proposition is consistency and separation of concerns: networking behavior is defined once, in one place, in one language-agnostic process — instead of N times across your services.

03

Envoy architecture

▸ 3:00

At a high level, traffic flows through Envoy along a fixed pipeline. A client connection lands on a listener, passes through a filter chain that can inspect and act on it, and is finally routed to an endpoint inside a cluster via a connection pool. Holding this shape in your head makes every later config block obvious.

[Figure: client (downstream) → Envoy (listener :443 → filter chain → cluster + connection pool) → endpoint A / endpoint B (upstream)]
The Envoy request path: listener → filter chain → cluster (load-balanced over endpoints)

Each of these objects can be supplied statically in a YAML file or dynamically from a management server through the matching xDS API:

LDS
Listener Discovery Service — dynamic listeners
CDS
Cluster Discovery Service — dynamic clusters
EDS
Endpoint Discovery Service — dynamic endpoints within a cluster
RDS
Route Discovery Service — dynamic HTTP route tables
SDS
Secret Discovery Service — dynamic TLS certificates/keys
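As a rough sketch of how the dynamic side is wired up, the bootstrap below asks a management server for listeners (LDS) and clusters (CDS) over gRPC. The node id, the xds_cluster name, and the xds-server:18000 address are assumptions made for illustration; none of them come from the video.

```yaml
# Minimal dynamic bootstrap sketch (Envoy v3 API).
node:
  id: demo-node            # hypothetical identity reported to the control plane
  cluster: demo-cluster    # hypothetical
dynamic_resources:
  lds_config:              # listeners arrive via LDS
    resource_api_version: V3
    api_config_source:
      api_type: GRPC
      transport_api_version: V3
      grpc_services:
        - envoy_grpc: { cluster_name: xds_cluster }
  cds_config:              # clusters arrive via CDS
    resource_api_version: V3
    api_config_source:
      api_type: GRPC
      transport_api_version: V3
      grpc_services:
        - envoy_grpc: { cluster_name: xds_cluster }
static_resources:
  clusters:
    # The management server itself has to be a static cluster, and it
    # must speak HTTP/2 because xDS runs over gRPC.
    - name: xds_cluster
      type: STRICT_DNS
      connect_timeout: 1s
      typed_extension_protocol_options:
        envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
          "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
          explicit_http_config:
            http2_protocol_options: {}
      load_assignment:
        cluster_name: xds_cluster
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address: { address: xds-server, port_value: 18000 }  # hypothetical
```

Everything else (listeners, routes, endpoints) would then be pushed at runtime, which is the mode Istio sidecars run in.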
04

Downstream & upstream

▸ 7:30

This terminology trips up newcomers, so anchor it firmly — it appears everywhere in the docs and config. Both terms are relative to Envoy itself:

Downstream
The host that connects to Envoy — the client. It sends requests and receives responses. (Think: closer to the user.)
Upstream
The host that Envoy connects to on the client's behalf — the backend. It receives requests and sends responses. (Think: closer to the data.)
⚠ Common confusion

"Upstream" feels backwards because requests flow toward the upstream while responses flow down from it. Don't tie it to request direction — tie it to who initiates the connection. Envoy accepts downstream connections and originates upstream connections.

05

Clusters

▸ 9:19

A cluster is a named group of logically equivalent upstream hosts that Envoy load-balances across — e.g. "all replicas of the checkout service." A cluster owns several responsibilities:

  • Service discovery: how the member endpoints are found. STATIC (hard-coded IPs), STRICT_DNS (resolve a name; every A/AAAA record becomes an endpoint), LOGICAL_DNS (re-resolve in the background, but use only the first returned address whenever a new connection is opened), or EDS (pushed by a control plane).
  • Load balancing — round robin, least request, ring hash, Maglev, random.
  • Active health checking — probe endpoints and eject unhealthy ones.
  • Circuit breaking — cap concurrent connections/requests/retries to protect the upstream.
clusters.yaml
clusters:
  - name: app1_cluster
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: app1_cluster
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: app1   # resolvable hostname
                    port_value: 8080
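The cluster above covers discovery and load balancing only. A minimal sketch of the other two responsibilities, active health checking and circuit breaking, might look like the following. The /healthz path and every threshold value are assumptions chosen for illustration, not settings from the video.

```yaml
clusters:
  - name: app1_cluster
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: LEAST_REQUEST
    health_checks:
      - timeout: 1s
        interval: 5s
        unhealthy_threshold: 3     # consecutive failures before ejection
        healthy_threshold: 2       # consecutive successes before re-admission
        http_health_check:
          path: /healthz           # hypothetical health endpoint on the backend
    circuit_breakers:
      thresholds:
        - priority: DEFAULT
          max_connections: 100       # cap on upstream connections
          max_pending_requests: 50   # cap on queued requests
          max_requests: 200          # cap on concurrent requests
          max_retries: 3             # cap on concurrent retries
    load_assignment:
      cluster_name: app1_cluster
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address: { address: app1, port_value: 8080 }
```

When a threshold is exceeded Envoy fails fast instead of queueing more work onto a struggling upstream.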
◈ Istio mapping

In a mesh, Istio generates one Envoy cluster per Service + subset. istioctl proxy-config clusters <pod> dumps exactly these objects, and a DestinationRule is what sets the lb_policy, circuit breakers, and TLS mode on them.

06

Listeners

▸ 10:50

A listener is a named network location, typically an IP and port (or a Unix domain socket), where Envoy accepts downstream connections. A single Envoy can expose many listeners at once (e.g. :80 for plaintext and :443 for TLS). The admin interface, commonly bound to :9901, is not a listener; it is configured separately under the top-level admin block.

Each listener carries one or more filter chains. Incoming connections are matched to a chain (by SNI, source IP, ALPN, etc.) and then handed through that chain's filters. The listener is the entry point; the filter chain decides what actually happens to the bytes.

listener.yaml
listeners:
  - name: http_listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 80
    filter_chains:
      - filters: []   # placeholder: network filters go here
07

Network filters

▸ 11:50

Filters are where Envoy's behavior actually lives. They are arranged in chains and come in three kinds, applied at different depths:

Listener filters
Run first, on the raw connection before a filter chain is chosen — e.g. TLS inspector (peek at SNI), original-destination, proxy protocol.
Network (L4) filters
Operate on the raw byte stream of a connection. Examples: tcp_proxy, and crucially the http_connection_manager.
HTTP (L7) filters
Run inside the HTTP connection manager, on parsed HTTP messages — e.g. router, CORS, JWT auth, rate limiting, ext_authz.

The pivotal one is the HTTP Connection Manager (HCM). It's technically a network (L4) filter, but its job is to translate the raw TCP stream into HTTP messages and then run an L7 HTTP filter chain on them. The L7 chain always terminates in the router filter, which performs the actual route match and forwards to a cluster.

[Figure: listener accepts connection → HTTP Connection Manager (L4 network filter): parse bytes into HTTP → L7 filters (auth, cors…) → router → cluster]
HCM is an L4 filter that hosts an L7 filter chain ending in the terminal router filter

The filter programming model

Network (L4) filters see a connection as a bidirectional stream of bytes, and they come in three flavours depending on which direction they touch:

Read filter
Processes data arriving from the downstream (the request side). Implements onNewConnection() and onData(buffer, end_stream).
Write filter
Processes data Envoy is about to send back to the downstream (the response side). Implements onWrite(buffer, end_stream).
Read/Write filter
Both. Terminal filters such as the HCM and tcp_proxy own the whole connection and handle traffic in both directions.

Every filter callback returns a filter status that controls the chain. This is the mechanism behind async authorization, buffering, and rate limiting:

Continue
Pass the (possibly mutated) data to the next filter in the chain.
StopIteration
Halt the chain here — e.g. while buffering more bytes or awaiting an async call (an external auth check). The filter later calls continueReading() / continueDecoding() to resume.
◆ Terminal filter rule

Every network filter chain must end in a terminal filter — one that actually forwards the connection somewhere (http_connection_manager or tcp_proxy). Filters before it (RBAC, rate limit, a protocol sniffer) inspect/act and then Continue; the terminal filter consumes the stream and never continues past itself.
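A minimal sketch of that rule: a non-terminal rbac filter screens the connection, then the terminal tcp_proxy forwards it. The policy name and the 10.0.0.0/8 range are assumptions for illustration.

```yaml
filter_chains:
  - filters:
      # Non-terminal: allow or deny the connection, then Continue.
      - name: envoy.filters.network.rbac
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.rbac.v3.RBAC
          stat_prefix: tcp_rbac
          rules:
            action: ALLOW
            policies:
              internal-clients:            # hypothetical policy name
                permissions:
                  - any: true
                principals:
                  - direct_remote_ip:
                      address_prefix: 10.0.0.0   # hypothetical internal range
                      prefix_len: 8
      # Terminal: consumes the stream and forwards it to the cluster.
      - name: envoy.filters.network.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: tcp
          cluster: app1_cluster
```

Swapping the two entries would not make sense, because nothing placed after a terminal filter ever sees the connection.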

Filter-chain matching & listener filters

A single listener can hold many filter chains, and FilterChainMatch selects exactly one per incoming connection. This is how one :443 listener serves multiple certificates and routes TLS by hostname at L4, without decrypting. Matching can key on:

  • Destination — port and IP/CIDR the connection arrived on.
  • SNI (server_names) and ALPN (application_protocols) — but only if a listener filter extracted them first.
  • Transport protocol: tls vs raw_buffer.
  • Source — type (local/external), IP/CIDR, and port range.

The criteria are evaluated by a fixed specificity order (destination port → destination IP → server name → transport → application protocol → source), and the most specific match wins. The SNI/ALPN data only exists because listener filters ran first on the raw socket:

tls_inspector
Peeks at the TLS ClientHello to extract SNI and ALPN before any chain is chosen.
http_inspector
Sniffs whether the connection is HTTP/1.x or HTTP/2 (h2c).
original_dst
Recovers the pre-NAT destination (the SO_ORIGINAL_DST socket option) — the backbone of transparent/sidecar interception.
proxy_protocol
Parses the PROXY protocol v1/v2 header to recover the real client IP behind another LB.
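Putting the two halves together, here is a sketch of SNI-based routing at L4: tls_inspector reads the ClientHello, and filter_chain_match picks a chain per hostname without decrypting anything. The hostnames and the app2_cluster name are assumptions for illustration.

```yaml
listeners:
  - name: tls_passthrough
    address:
      socket_address: { address: 0.0.0.0, port_value: 443 }
    listener_filters:
      # Must run first, otherwise server_names has nothing to match on.
      - name: envoy.filters.listener.tls_inspector
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector
    filter_chains:
      - filter_chain_match:
          server_names: ["app1.example.com"]   # hypothetical hostname
        filters:
          - name: envoy.filters.network.tcp_proxy
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
              stat_prefix: app1_passthrough
              cluster: app1_cluster
      - filter_chain_match:
          server_names: ["app2.example.com"]   # hypothetical hostname
        filters:
          - name: envoy.filters.network.tcp_proxy
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
              stat_prefix: app2_passthrough
              cluster: app2_cluster
```

Because neither chain has a transport_socket, the backends terminate TLS themselves and Envoy only ever sees ciphertext.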

Catalog of network (L4) filters

Beyond the two stars (HCM and tcp_proxy), Envoy ships protocol-aware L4 filters that give you stats, routing, and fault injection for non-HTTP wire protocols:

http_connection_manager
The L7 HTTP engine — codec + HTTP filter chain. (Detailed below.)
tcp_proxy
Blind L4 stream forwarding to one cluster, or to a weighted set of clusters; supports idle timeouts.
redis_proxy
Protocol-aware Redis proxy: command splitting, hash-slot routing across a shard cluster, per-command stats.
mongo_proxy
Parses the Mongo wire protocol for per-op stats, slow-query logging, and fault injection.
mysql / postgres_proxy
Decode the DB protocols to emit query/transaction metrics (no routing logic).
thrift_proxy
Full L7 routing for Apache Thrift RPC (method-based routes, retries).
dubbo_proxy
L7 routing for the Dubbo RPC protocol.
rate_limit
Connection-level global rate limiting via an external RLS service.
local_ratelimit
In-process token-bucket limiting on L4 connections.
rbac
Allow/deny connections by source/destination IP, SNI, or mTLS principal — authz before HTTP exists.
sni_cluster / sni_dynamic_forward_proxy
Route to an upstream cluster chosen directly from the TLS SNI value.
echo
A trivial filter that echoes bytes back — handy for debugging the chain itself.
wasm
Run a custom L4 filter compiled to WebAssembly.

HTTP Connection Manager, in depth

The HCM is the most important filter in Envoy — it is what turns a byte-pushing L4 proxy into a request-aware L7 proxy. When a connection matches a chain whose terminal filter is the HCM, the HCM takes ownership and performs a long list of jobs:

  1. Instantiates a codec. Based on codec_type (AUTO, HTTP1, HTTP2, HTTP3) it picks how to frame bytes. AUTO uses ALPN (or a magic-byte sniff) to detect the version. Whatever the wire version, the codec normalizes everything into a uniform internal model of headers → data → trailers — which is why an HTTP/1.1 request and an HTTP/2 stream flow through identical filter code.
  2. Demultiplexes streams. An HTTP/2 or HTTP/3 connection carries many concurrent streams; the HCM creates an independent per-stream HTTP filter chain instance for each one.
  3. Resolves routes. It picks a route configuration (inline, via RDS, or via scoped-RDS), matches a virtual host by the :authority/Host header, then matches a route within it.
  4. Manages request lifecycle headers. Generates x-request-id, makes the sampling/tracing decision, manipulates x-forwarded-for / x-forwarded-proto, adds via, and enforces header sanitization.
  5. Emits observability. Access logs at stream completion, tracing spans, and the rich http.<stat_prefix>.* stats family.
  6. Enforces connection & stream timeouts and handles graceful draining (HTTP/2 GOAWAY).

Decoder vs. encoder filters

Inside the HCM, HTTP (L7) filters are split by direction, mirroring the L4 read/write split:

Decoder filter
Acts on the request travelling toward the upstream — decodeHeaders / decodeData / decodeTrailers.
Encoder filter
Acts on the response travelling back to the downstream — encodeHeaders / encodeData / encodeTrailers.
Dual filter
Implements both (e.g. the router, faults, RBAC).

Crucially, decode runs in declared order; encode runs in reverse order. The request descends through the filters to the terminal router, which dispatches it upstream; the response then ascends back through the same filters, bottom to top. Header callbacks return a FilterHeadersStatus such as Continue, StopIteration (pause for an async call), or StopAllIterationAndBuffer (pause and buffer the whole body).

[Figure: request decode path, top to bottom: jwt_authn (decoder) → cors (dual) → compressor (encoder) → router (terminal) → upstream cluster; the response encode path returns bottom to top through the same filters]
Decode descends in order to the router; encode ascends back up in reverse order
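In configuration terms, the declared order is simply the order of the http_filters list. The sketch below is one plausible chain; the choice of filters and the compressor library name are assumptions for illustration.

```yaml
# Fragment of an HttpConnectionManager typed_config.
http_filters:
  # Requests (decode) visit filters top to bottom.
  - name: envoy.filters.http.cors
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.cors.v3.Cors
  - name: envoy.filters.http.compressor
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.compressor.v3.Compressor
      compressor_library:
        name: gzip_lib                     # hypothetical name
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.compression.gzip.compressor.v3.Gzip
  # The router must be last; responses (encode) then travel bottom to top.
  - name: envoy.filters.http.router
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
```

With this ordering a response is compressed first and passes through the CORS filter afterwards on its way back to the client.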

Catalog of HTTP (L7) filters

router
Terminal. Route match, retries, timeouts, request mirroring/shadowing, redirects, host rewrite, upstream dispatch.
ext_authz
Calls an external HTTP or gRPC authorization service; allows/denies per request.
jwt_authn
Verifies JWTs against JWKS, enforces issuer/audience, forwards claims as metadata.
rbac
Allow/deny on path, headers, method, or mTLS/JWT principal.
rate_limit / local_ratelimit
Global (RLS service) or in-process token-bucket limiting, keyed by descriptors.
cors
Handles CORS preflight and response headers.
fault
Injects delays and aborts (incl. gRPC status) for resilience testing.
compressor / decompressor
gzip / brotli / zstd of request or response bodies.
buffer
Buffers the full body before forwarding (needed by some auth flows; harmful to streaming).
lua / wasm
Custom logic in Lua scripts or sandboxed WebAssembly modules.
ext_proc
Streams headers/body to an external gRPC server that can mutate them mid-flight.
oauth2
Implements the OAuth2 auth-code flow at the proxy edge.
grpc_web · grpc_json_transcoder · grpc_stats · grpc_http1_bridge
The gRPC family — see below.

Connection & stream timeouts

The HCM exposes a layered set of timeouts; mixing them up is a frequent source of mysterious resets:

request_timeout
Time to receive the entire downstream request (headers+body). Off by default.
route timeout
Per-route end-to-end upstream timeout. Defaults to 15s — the classic killer of long gRPC/SSE streams. Set timeout: 0s to disable.
stream_idle_timeout
Resets a stream with no activity (default 5m). Must be raised/zeroed for long-lived streams.
idle_timeout
(in common_http_protocol_options) closes an idle connection.
drain_timeout / max_connection_duration
Graceful drain window and a hard cap on connection age (forces periodic reconnects/rebalancing).
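To see where each of these knobs lives, here is a sketch of an HCM fragment with the timeouts set explicitly. All durations and the /stream prefix are illustrative assumptions, not recommendations from the video.

```yaml
# Fragment of an HttpConnectionManager typed_config.
stat_prefix: ingress_http
request_timeout: 30s              # entire downstream request must arrive in time
stream_idle_timeout: 300s         # per-stream inactivity (same as the 5m default)
drain_timeout: 5s                 # graceful drain window
common_http_protocol_options:
  idle_timeout: 3600s             # close connections that have no active streams
  max_connection_duration: 900s   # hard cap on connection age
route_config:
  virtual_hosts:
    - name: backend
      domains: ["*"]
      routes:
        - match: { prefix: "/stream" }   # hypothetical long-lived endpoint
          route:
            cluster: app1_cluster
            timeout: 0s                  # disable the 15s route default
            idle_timeout: 0s             # disable the idle timeout for this route only
        - match: { prefix: "/" }
          route:
            cluster: app1_cluster
            timeout: 5s                  # ordinary requests keep a tight bound
```

Keeping the relaxed values on one route rather than on the whole HCM limits the blast radius of a stuck stream.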

gRPC: there is no separate "gRPC connection manager"

⚠ Clarify the mental model

Envoy does not have a distinct gRPC network filter. gRPC is just HTTP/2 with a particular framing (length-prefixed protobuf messages) and a reliance on HTTP trailers (grpc-status, grpc-message). So gRPC is handled by the same HTTP Connection Manager running in HTTP/2 mode — plus a set of gRPC-aware HTTP filters and routing features layered on top.

The HCM is a natural gRPC proxy precisely because it already speaks full HTTP/2: long-lived streams, bidirectional flow, and trailers. A gRPC call is an HTTP/2 POST with content-type: application/grpc, a path of /package.Service/Method, length-prefixed message frames in the body, and a final grpc-status trailer. Envoy understands all of this.

gRPC-aware features in the HCM

  • Routing: a route match can specify grpc: {} to match only gRPC traffic, and match on the method via the path prefix /pkg.Service/.
  • Retries on gRPC status: retry_policy.retry_on understands gRPC semantics — cancelled, deadline-exceeded, resource-exhausted, unavailable, internal — read from the trailing grpc-status, not the HTTP code (which is almost always 200).
  • Deadlines: the client's grpc-timeout header can be honoured as the stream deadline, capped via the route's max_stream_duration.grpc_timeout_header_max.
  • Stats: the grpc_stats filter emits per-service, per-method, per-status counters and can surface grpc-status in access logs.
  • Health checking: upstream clusters can use the standard grpc.health.v1.Health check.
grpc-route.yaml
routes:
  - match:
      prefix: "/helloworld.Greeter/"
      grpc: {}                 # match gRPC requests only
    route:
      cluster: greeter_cluster
      timeout: 0s               # disable 15s default for streaming!
      retry_policy:
        retry_on: "cancelled,deadline-exceeded,resource-exhausted,unavailable"
        num_retries: 3

The gRPC translation filters

These are all HTTP filters that sit in the HCM's L7 chain. They exist to bridge clients that can't speak native HTTP/2 gRPC:

grpc_web
Translates gRPC-Web (the browser-friendly variant — works over HTTP/1.1, carries trailers inside the body, base64-able) to/from standard gRPC, so a JS frontend can call a gRPC backend directly.
grpc_json_transcoder
Exposes a RESTful JSON HTTP/1.1 API and transcodes it to gRPC, driven by the compiled proto descriptor set and google.api.http annotations. Lets plain REST clients hit gRPC services.
grpc_http1_bridge
Lets an HTTP/1.1 client send a unary gRPC call (Envoy upgrades it to HTTP/2 upstream and folds the trailers into the response).
grpc_http1_reverse_bridge
The inverse: lets a gRPC client talk, through Envoy, to a backend that speaks only HTTP/1.1 and cannot handle HTTP/2 or trailers.
grpc_stats
Per-method/status gRPC telemetry.
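As a small sketch of how one of these bridges is enabled, the chain below places grpc_web ahead of the router. It assumes the route's target cluster already speaks HTTP/2 to the gRPC backend.

```yaml
# Fragment of an HttpConnectionManager typed_config.
http_filters:
  # Translates gRPC-Web requests into standard gRPC (and responses back).
  - name: envoy.filters.http.grpc_web
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_web.v3.GrpcWeb
  - name: envoy.filters.http.router
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
```

Browser clients calling from another origin would typically also need the cors filter in front of grpc_web.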

Streaming gotchas

gRPC has four call modes — unary, server-streaming, client-streaming, and bidirectional. Envoy maps each to an HTTP/2 stream, but long-lived streams expose default behaviours that silently break them:

  • Never put a buffer filter in front of a stream — buffering waits for end_stream, which for a streaming RPC may be minutes away (or never).
  • Set timeout: 0s on the route and raise/zero stream_idle_timeout; otherwise the 15s route default and 5m idle default kill the stream.
  • Enable HTTP/2 keepalive (connection_keepalive PINGs) on both the listener and upstream cluster so idle bidi streams survive NAT/idle reaping.
  • Tune flow control: initial_stream_window_size / initial_connection_window_size and max_concurrent_streams govern throughput and fairness for many concurrent streams.
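A sketch of the downstream half of those settings, as they might appear on the HCM. The numbers are illustrative assumptions; the upstream cluster accepts the same http2_protocol_options fields in its own protocol options.

```yaml
# Fragment of an HttpConnectionManager typed_config (downstream side).
stream_idle_timeout: 0s                     # do not reset quiet streams (or raise it instead)
http2_protocol_options:
  max_concurrent_streams: 100
  initial_stream_window_size: 1048576       # 1 MiB of flow-control window per stream
  initial_connection_window_size: 4194304   # 4 MiB per connection
  connection_keepalive:
    interval: 30s    # send an HTTP/2 PING every 30s
    timeout: 5s      # close the connection if the PING goes unanswered
```

Pair this with timeout: 0s on the streaming route, since the HCM settings alone do not lift the route-level default.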
◈ Istio mapping

Istio sidecars run the HCM in HTTP/2/gRPC mode automatically once a port is named grpc (or app protocol is detected). VirtualService exposes gRPC route matching and retries.retryOn: "unavailable,deadline-exceeded,...", while gRPC-Web, JSON transcoding, and ext_authz/jwt filters are wired in through EnvoyFilter patches that insert exactly these HTTP filters into the chain. When a long-lived gRPC stream dies at 15s in the mesh, it's this route-timeout default biting — fix it with a per-route timeout, not a cluster setting.

08

Connection pools

▸ 13:45

Opening a TCP (and TLS) connection per request is expensive — handshakes, slow start, certificate negotiation. So Envoy keeps a connection pool per upstream host, per worker thread, and reuses warm connections. The pool's shape depends on the protocol:

  • HTTP/1.1: one in-flight request per connection. To get concurrency the pool holds many connections and hands requests out to idle ones.
  • HTTP/2: requests are multiplexed as independent streams over a single connection, so a small number of connections (often one) can carry huge concurrency. The pool tracks max concurrent streams.
◆ Key takeaway

Pools are per worker thread (see next section). That has a subtle consequence: load-balancing decisions and connection reuse happen independently on each worker, which is why Envoy's LB is statistically even rather than globally coordinated.

09

Threading model

▸ 18:34

Envoy uses a single main thread plus N worker threads (typically one per CPU core). This design is what lets it stay fast and almost entirely lock-free.

Main thread
Owns startup, the xDS config machinery, stats flushing, and admin. It does not handle data-plane traffic.
Worker threads
Each runs its own non-blocking event loop (libevent) and handles a share of connections. They are "embarrassingly parallel."

The critical rule: a connection is pinned to one worker thread for its entire lifetime. Once a worker accepts a connection, every read, write, filter invocation, and upstream connection for it stays on that thread. No connection is ever handed between threads, so there's no locking on the hot path.

[Figure: main thread (xDS · stats · admin) pushes lock-free config snapshots through Thread-Local Storage to Worker 0, Worker 1, Worker 2 … Worker N, each running its own event loop and owning its connections]
One main thread coordinates; workers run independent event loops and own their connections

Config can change while traffic flows. Envoy handles this with Thread-Local Storage (abbreviated TLS in Envoy's docs, and unrelated to Transport Layer Security): the main thread builds a new immutable config snapshot and posts it to each worker, which swaps it in atomically at a safe point. Workers read their local copy with no locks; that's how you get hot config reloads without dropping connections.

10

Hands-on: Envoy as an L7 proxy

▸ 21:25 – 39:00 · setup, install, route to 4 backends

The demo runs four Node.js services and fronts them with one Envoy. On macOS it's installed with brew install envoy (or via the getenvoy.io packages), then run with envoy -c envoy.yaml.

Acting at L7 means using the HTTP Connection Manager with a route configuration: a set of virtual hosts, each holding routes that match on the request (usually a path prefix) and name a target cluster. Below, requests are routed by path prefix; two of the four backends are shown, and the other two follow the same pattern.

l7-route.yaml
filters:
  - name: envoy.filters.network.http_connection_manager
    typed_config:
      "@type": type.googleapis.com/...HttpConnectionManager
      stat_prefix: ingress_http
      http_filters:
        - name: envoy.filters.http.router   # terminal filter
      route_config:
        virtual_hosts:
          - name: backend
            domains: ["*"]
            routes:
              - match: { prefix: "/app1" }
                route: { cluster: app1_cluster }
              - match: { prefix: "/app2" }
                route: { cluster: app2_cluster }

Routes are evaluated top to bottom and the first match wins, so order your routes from most specific to least specific.
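For orientation, here is a sketch of how the fragments fit into one complete static bootstrap that could be passed to envoy -c. The full v3 type URLs are written out here, whereas the guide's own snippets abbreviate them. The listener port and the backend ports 1111 and 2222 are assumptions, not necessarily the values used in the demo.

```yaml
static_resources:
  listeners:
    - name: http_listener
      address:
        socket_address: { address: 0.0.0.0, port_value: 8080 }   # hypothetical port
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                http_filters:
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: backend
                      domains: ["*"]
                      routes:
                        - match: { prefix: "/app1" }
                          route: { cluster: app1_cluster }
                        - match: { prefix: "/app2" }
                          route: { cluster: app2_cluster }
  clusters:
    - name: app1_cluster
      connect_timeout: 1s
      type: STATIC
      load_assignment:
        cluster_name: app1_cluster
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address: { address: 127.0.0.1, port_value: 1111 }  # hypothetical
    - name: app2_cluster
      connect_timeout: 1s
      type: STATIC
      load_assignment:
        cluster_name: app2_cluster
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address: { address: 127.0.0.1, port_value: 2222 }  # hypothetical
admin:
  address:
    socket_address: { address: 127.0.0.1, port_value: 9901 }
```

Requests whose path matches neither prefix get a 404 from Envoy itself, since no route claims them.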

11

Splitting load across backends

▸ 40:00

There are two distinct ways to spread traffic, and it's worth keeping them separate:

  • Within a cluster — multiple endpoints behind one cluster name are balanced automatically by the cluster's lb_policy. This is plain load balancing across replicas.
  • Across clusters (traffic splitting) — a single route can fan out to weighted_clusters, sending a defined percentage to each. This is how you do canaries and A/B rollouts.
weighted.yaml
routes:
  - match: { prefix: "/" }
    route:
      weighted_clusters:
        clusters:
          - { name: app1_cluster, weight: 80 }
          - { name: app2_cluster, weight: 20 }
◈ Istio mapping

An Istio VirtualService with two destination + weight entries compiles to exactly this weighted_clusters block — the foundation of every Istio canary.
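For comparison, a sketch of the Istio side of that mapping. The resource name and all host names are hypothetical; only the 80/20 split mirrors the Envoy example.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: app-canary              # hypothetical
spec:
  hosts:
    - app.example.com           # hypothetical host the rule applies to
  http:
    - route:
        - destination:
            host: app1          # hypothetical Kubernetes Service
          weight: 80
        - destination:
            host: app2          # hypothetical Kubernetes Service
          weight: 20
```

Running istioctl proxy-config routes against a sidecar afterwards should show the corresponding weighted cluster entries.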

12

Blocking certain requests (/admin)

▸ 45:30

Because routing is L7, you can refuse requests by path before they ever touch a backend. Add a route that matches the sensitive prefix and returns a direct_response instead of forwarding:

block-admin.yaml
routes:
  - match: { prefix: "/admin" }
    direct_response:
      status: 403
      body: { inline_string: "Forbidden" }
  - match: { prefix: "/" }      # everything else proceeds
    route: { cluster: app1_cluster }

Keep the /admin rule above the catch-all / rule — first match wins, so a broad rule placed first would shadow it.

13

Envoy as an L4 proxy (TCP router)

▸ 47:50

Sometimes you don't want or need HTTP awareness — you just want to forward a raw TCP stream (databases, custom protocols, TLS passthrough). For that, swap the HTTP Connection Manager for the tcp_proxy network filter. There's no route table and no path matching — the whole connection goes to one cluster.

l4-tcp.yaml
filter_chains:
  - filters:
      - name: envoy.filters.network.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/...TcpProxy
          stat_prefix: tcp
          cluster: app1_cluster
◆ L4 vs L7 — the trade-off

L4 is cheaper and protocol-agnostic, but blind: no path routing, no HTTP retries, no per-request metrics. L7 costs CPU to parse but unlocks content-based routing, retries, header manipulation, and rich observability. Choose L4 for opaque streams, L7 when you need to reason about requests.

14

Enabling HTTPS / TLS termination

▸ 54:00 DNS · 55:30 Let's Encrypt

To serve HTTPS, the video points a DNS record at the host, obtains a certificate from Let's Encrypt, and configures Envoy to terminate TLS — meaning downstream clients connect over TLS, Envoy decrypts, and forwards plaintext to the local backends.

TLS is configured via a transport_socket on the listener's filter chain. The downstream context points at the fullchain certificate and private key:

tls.yaml
filter_chains:
  - transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/...DownstreamTlsContext
        common_tls_context:
          tls_certificates:
            - certificate_chain: { filename: /etc/letsencrypt/.../fullchain.pem }
              private_key:      { filename: /etc/letsencrypt/.../privkey.pem }
    filters: []   # placeholder: http_connection_manager as before

Terminating at Envoy is the common case (it also enables L7 features on encrypted traffic). If you instead want Envoy to forward encrypted bytes untouched, that's TLS passthrough — handled at L4 with tcp_proxy and the TLS inspector reading SNI without decrypting.

15

HTTP/2 and TLS hardening

▸ 1:03:00 HTTP/2 · 1:04:30 TLS 1.2/1.3 only · 1:06:40 SSL Labs

Enabling HTTP/2

HTTP/2 is negotiated per hop. On the downstream side, enable it on the HCM; on the upstream side, set it on the cluster. With TLS, the protocol is selected via ALPN, so advertise h2 there too.

http2-and-hardening.yaml
# downstream: HCM
http2_protocol_options: {}

# ALPN advertises h2 then http/1.1 fallback
common_tls_context:
  alpn_protocols: ["h2", "http/1.1"]
  # restrict protocol versions — drop legacy TLS
  tls_params:
    tls_minimum_protocol_version: TLSv1_2
    tls_maximum_protocol_version: TLSv1_3
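The snippet above covers the downstream hop only. A sketch of the upstream half, assuming the backend can actually speak HTTP/2, might look like this; the stream limit is an illustrative value.

```yaml
clusters:
  - name: app1_cluster
    connect_timeout: 0.25s
    type: STRICT_DNS
    # Tell Envoy to use HTTP/2 when talking to this cluster's endpoints.
    typed_extension_protocol_options:
      envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
        "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
        explicit_http_config:
          http2_protocol_options:
            max_concurrent_streams: 100   # illustrative limit
    load_assignment:
      cluster_name: app1_cluster
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address: { address: app1, port_value: 8080 }
```

The two hops are independent: Envoy can accept HTTP/2 downstream while speaking HTTP/1.1 upstream, or the other way round.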

Disabling TLS 1.0/1.1, allowing only 1.2 & 1.3

Old TLS versions (1.0, 1.1) are deprecated and insecure. Pinning tls_minimum_protocol_version to TLSv1_2 and the maximum to TLSv1_3 refuses anything weaker. You can also restrict cipher suites for an even tighter posture.
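A sketch of what such a cipher restriction could look like. The particular list is an assumption (a common set of forward-secret AEAD suites), not the one used in the video, and cipher_suites only affects TLS 1.2 and below; TLS 1.3 suites are not controlled by this field.

```yaml
common_tls_context:
  tls_params:
    tls_minimum_protocol_version: TLSv1_2
    tls_maximum_protocol_version: TLSv1_3
    # Applies to TLS 1.2 handshakes only.
    cipher_suites:
      - ECDHE-ECDSA-AES128-GCM-SHA256
      - ECDHE-RSA-AES128-GCM-SHA256
      - ECDHE-ECDSA-AES256-GCM-SHA384
      - ECDHE-RSA-AES256-GCM-SHA384
      - ECDHE-ECDSA-CHACHA20-POLY1305
      - ECDHE-RSA-CHACHA20-POLY1305
```

Tightening the list trades compatibility with older clients for a smaller attack surface, so it is worth re-running the SSL Labs check after any change.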

Verifying with SSL Labs

The course finishes by running the endpoint through Qualys SSL Labs, which grades the TLS configuration (protocol versions, ciphers, certificate chain, forward secrecy). A clean modern config — TLS 1.2/1.3 only, valid Let's Encrypt chain — lands an A/A+.

⚠ Gotcha

HTTP/2 has its own ALPN/cipher requirements. If you enable h2 but forget to advertise it in alpn_protocols, clients silently fall back to HTTP/1.1 over the same port — the connection works, but you never get the multiplexing you configured.

16

Summary & cheat sheet

▸ 1:07:24

The whole course reduces to one mental model and a handful of objects. Internalize the pipeline and the config writes itself:

The pipeline
listener → filter chain → (HCM → L7 filters → router) → cluster → connection pool → endpoint
Listener
where Envoy accepts downstream connections (IP:port)
Cluster
group of upstream endpoints + discovery + LB + health + circuit breaking
Filters
listener (raw) → network/L4 (e.g. tcp_proxy, HCM) → HTTP/L7 (router is terminal)
L7 vs L4
HCM = content-aware routing/observability; tcp_proxy = blind stream forwarding
Threading
1 main + N workers; a connection is pinned to one worker; thread-local storage for lock-free config swaps
Pools
per-host, per-worker; HTTP/1.1 = many conns, HTTP/2 = multiplexed streams
TLS
transport_socket on the filter chain; terminate to unlock L7; pin to 1.2/1.3
xDS
LDS / CDS / EDS / RDS / SDS — the dynamic-config APIs a control plane (e.g. istiod) drives

Self-test

▸ check your recall
Q1 Relative to Envoy, what is "downstream" vs "upstream"?
Downstream is the client that connects to Envoy (sends requests); upstream is the backend Envoy connects to. The split is defined by who originates the connection, not by request direction.
Q2 What is a cluster responsible for?
A named group of equivalent upstream endpoints, plus service discovery (STATIC/STRICT_DNS/LOGICAL_DNS/EDS), load balancing policy, active health checking, and circuit breaking.
Q3 Why is the HTTP Connection Manager described as an L4 filter that does L7 work?
It sits in the network (L4) filter chain operating on the raw byte stream, but its job is to parse those bytes into HTTP and then run an L7 HTTP filter chain — which always ends in the terminal router filter.
Q4 What guarantees Envoy stays lock-free on the data path?
A connection is pinned to a single worker thread for its lifetime, so no connection state is shared between threads. Config changes are distributed as immutable snapshots via Thread-Local Storage, read without locks.
Q5 How does an HTTP/2 connection pool differ from an HTTP/1.1 one?
HTTP/1.1 allows one in-flight request per connection, so the pool keeps many connections for concurrency. HTTP/2 multiplexes many concurrent streams over a single connection, so far fewer connections are needed.
Q6 Two ways to spread traffic, and when to use each?
Load-balance multiple endpoints within one cluster (plain replica balancing via lb_policy), or split across multiple clusters with weighted_clusters on a route (canaries / A-B).
Q7 How do you block /admin at L7, and what ordering pitfall applies?
Add a route matching the /admin prefix with a direct_response (e.g. 403). Routes match top-to-bottom, first match wins, so the specific /admin rule must sit above any catch-all / rule.
Q8 When would you pick the tcp_proxy (L4) filter over the HCM (L7)?
For opaque or non-HTTP traffic (databases, custom protocols, TLS passthrough) where you only need to forward a stream and don't need path routing, per-request retries, or HTTP-level metrics.
Q9 Where is TLS configured, and what two values pin you to modern protocols?
In a transport_socket on the filter chain. Set tls_minimum_protocol_version: TLSv1_2 and tls_maximum_protocol_version: TLSv1_3 to refuse TLS 1.0/1.1.
Q10 What silently breaks HTTP/2 even when you've enabled it?
Forgetting to advertise h2 in alpn_protocols on the TLS context — clients then negotiate HTTP/1.1 over the same port and you lose multiplexing without any error.