Performance Tuning
The Fusion gateway proxies every GraphQL operation to one or more subgraphs over HTTP. The defaults work well out of the box, but high-throughput or latency-sensitive deployments can benefit from tuning the transport layer.
This page covers:
- Configuring the HTTP transport
- Enabling HTTP/2 for multiplexed subgraph communication
- Deduplicating identical in-flight requests
- Limiting concurrent request processing to maximize throughput
Configure the HTTP Transport
Fusion uses a named HttpClient to communicate with subgraphs. The default client name is "fusion", and you configure it through the standard IHttpClientFactory pattern. This gives you full control over connection behavior, timeouts, and message handlers.
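As a rough illustration of that control, the fragment below sets a request timeout and a few connection-pool options on the "fusion" client. It is a sketch that assumes builder is the WebApplicationBuilder from Program.cs and uses only standard .NET members (HttpClient.Timeout and SocketsHttpHandler properties). The values are illustrative assumptions, not Fusion recommendations.

```csharp
// Program.cs (gateway). All values are assumptions; tune them for your workload.
builder.Services
    .AddHttpClient("fusion", httpClient =>
    {
        // Upper bound for a single subgraph request. Long-lived streaming
        // responses may need a larger value.
        httpClient.Timeout = TimeSpan.FromSeconds(30);
    })
    .ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
    {
        // Recycle pooled connections periodically so DNS changes are picked up.
        PooledConnectionLifetime = TimeSpan.FromMinutes(2),
        // Close connections that have been idle for this long.
        PooledConnectionIdleTimeout = TimeSpan.FromSeconds(30),
        // Fail fast when a subgraph cannot be reached.
        ConnectTimeout = TimeSpan.FromSeconds(5),
    });
```

Because the options sit on the named client, they apply to every subgraph request the gateway sends.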
A baseline Program.cs that registers the named client:
var builder = WebApplication.CreateBuilder(args);

// 1. Register the named HTTP client for subgraph communication
builder.Services.AddHttpClient("fusion");

// 2. Configure the Fusion gateway
builder
    .AddGraphQLGateway()
    .AddFileSystemConfiguration("./gateway.far");

var app = builder.Build();
app.MapGraphQLHttp();
app.Run();
- Named HTTP client "fusion": the client the gateway uses to call subgraphs. Any handler configuration you add here applies to all subgraph requests.
- Gateway registration: wires up the Fusion execution engine and loads the composed schema.
HTTP/2
HTTP/2 multiplexes multiple requests over a single TCP connection, which reduces connection overhead when the gateway sends many concurrent requests to a subgraph. This is especially beneficial when subgraphs are behind a load balancer that supports HTTP/2.
With TLS
When your subgraphs use TLS (HTTPS), HTTP/2 is negotiated automatically via ALPN. Set EnableMultipleHttp2Connections to true so the gateway can open additional HTTP/2 connections when a single connection reaches its concurrent stream limit:
builder.Services
    .AddHttpClient("fusion")
    .ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
    {
        EnableMultipleHttp2Connections = true,
    });
No additional version configuration is needed. .NET negotiates HTTP/2 over TLS by default.
Without TLS
In many Kubernetes deployments, services communicate over plaintext HTTP inside the cluster. HTTP/2 cleartext (h2c) requires explicit opt-in because .NET defaults to HTTP/1.1 for unencrypted connections.
To force HTTP/2 without TLS, set DefaultRequestVersion and DefaultVersionPolicy on the HttpClient:
using System.Net; // HttpVersion

builder.Services
    .AddHttpClient("fusion", httpClient =>
    {
        // Send HTTP/2 and never fall back to HTTP/1.1.
        httpClient.DefaultRequestVersion = HttpVersion.Version20;
        httpClient.DefaultVersionPolicy = HttpVersionPolicy.RequestVersionExact;
    })
    .ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
    {
        EnableMultipleHttp2Connections = true,
    });
The subgraph must also be configured to accept HTTP/2 over cleartext. By default, Kestrel only listens on HTTP/1.1 for non-TLS endpoints. Enable h2c in each subgraph's Program.cs:
using Microsoft.AspNetCore.Server.Kestrel.Core; // HttpProtocols

builder.WebHost.ConfigureKestrel(options =>
{
    options.ListenAnyIP(5001, listenOptions =>
    {
        listenOptions.Protocols = HttpProtocols.Http1AndHttp2;
    });
});
If you are unsure whether your infrastructure supports HTTP/2 cleartext end-to-end, HTTP/1.1 works well for most internal deployments. Switch to HTTP/2 only when you have confirmed support on both the gateway and all subgraphs.
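One way to confirm cleartext HTTP/2 support for a single subgraph is a small console check that uses the same version settings as the gateway client. This is a sketch: the endpoint address is a hypothetical placeholder, and it assumes the subgraph accepts a plain GraphQL POST without authentication.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Text;

// Hypothetical cleartext subgraph endpoint; replace with your own address.
var subgraphUri = new Uri("http://localhost:5001/graphql");

using var client = new HttpClient
{
    DefaultRequestVersion = HttpVersion.Version20,
    // RequestVersionExact fails the request instead of silently downgrading.
    DefaultVersionPolicy = HttpVersionPolicy.RequestVersionExact,
};

using var content = new StringContent(
    "{\"query\":\"{ __typename }\"}", Encoding.UTF8, "application/json");

try
{
    // PostAsync builds the request from the client's default version settings.
    using var response = await client.PostAsync(subgraphUri, content);
    Console.WriteLine($"Subgraph answered over HTTP/{response.Version}");
}
catch (HttpRequestException ex)
{
    // Thrown when the endpoint cannot be reached or does not speak HTTP/2.
    Console.WriteLine($"HTTP/2 cleartext request failed: {ex.Message}");
}
```

Run the check against each subgraph, and through any proxy or load balancer that sits between it and the gateway, before switching the gateway over.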
Request Deduplication
When multiple identical query requests are in flight to the same subgraph at the same time, request deduplication ensures only one HTTP request is actually sent. The first request becomes the "leader" and executes normally. Subsequent identical requests become "followers" that wait for the leader's response. Each caller receives an independent copy of the result.
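The leader and follower behavior can be pictured with a small in-memory coalescer. This is a simplified sketch, not the Fusion implementation: InFlightCoalescer, SendToSubgraph, and the string key are hypothetical names, and responses are modeled as immutable strings so sharing them is equivalent to handing out copies.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

var coalescer = new InFlightCoalescer();
var upstreamCalls = 0;
var subgraphReply = new TaskCompletionSource<string>();

// Stand-in for the real HTTP call to a subgraph.
async Task<string> SendToSubgraph()
{
    Interlocked.Increment(ref upstreamCalls);
    return await subgraphReply.Task;
}

// Two identical requests arrive while the first is still in flight.
var first = coalescer.SendAsync("hash-of-request", SendToSubgraph);
var second = coalescer.SendAsync("hash-of-request", SendToSubgraph);

subgraphReply.SetResult("{\"data\":{\"product\":{\"name\":\"Chair\"}}}");
var results = await Task.WhenAll(first, second);

Console.WriteLine($"upstream calls: {upstreamCalls}"); // upstream calls: 1

// Hypothetical helper that coalesces concurrent calls sharing the same key.
public sealed class InFlightCoalescer
{
    private readonly ConcurrentDictionary<string, Lazy<Task<string>>> _inFlight = new();

    public async Task<string> SendAsync(string key, Func<Task<string>> send)
    {
        // The first caller stores the entry and becomes the leader;
        // later callers receive the same entry and become followers.
        var entry = _inFlight.GetOrAdd(key, _ => new Lazy<Task<string>>(send));
        try
        {
            return await entry.Value;
        }
        finally
        {
            // Remove only this exact entry so a newer request is not evicted.
            _inFlight.TryRemove(new KeyValuePair<string, Lazy<Task<string>>>(key, entry));
        }
    }
}
```

Once the leader completes, the entry is removed, so a later identical request triggers a fresh call instead of reusing a stale result.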
When It Helps
Deduplication is most effective when:
- Burst traffic hits the gateway with the same query, for example when a popular product page refreshes across many clients simultaneously.
- Public APIs serve unauthenticated traffic where many users send the same queries.
- The same user sends identical concurrent requests (e.g., a UI that fires duplicate fetches).
Security Model
The deduplication hash includes the request body, URL, and the values of configurable hash headers. By default, Authorization and Cookie headers are included in the hash. This means:
- Unauthenticated or public queries: no auth headers, so identical queries from any client are deduplicated. High hit rate.
- Same user, same query, concurrent: identical tokens produce the same hash and are deduplicated.
- Different users, same query: different tokens produce different hashes. Never deduplicated. One user's response is never shared with another.
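To make the three cases concrete, the sketch below derives a key from the URL, the body, and the configured hash headers. The ComputeKey function and its input layout are assumptions for illustration only; the source does not describe the hash algorithm or encoding Fusion actually uses.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical key derivation: URL + body + the value of each hash header.
static string ComputeKey(
    string url,
    string body,
    IReadOnlyDictionary<string, string> headers,
    IEnumerable<string> hashHeaders)
{
    var input = new StringBuilder();
    input.Append(url).Append('\n').Append(body);
    foreach (var name in hashHeaders)
    {
        // A missing header contributes an empty value.
        headers.TryGetValue(name, out var value);
        input.Append('\n').Append(name).Append(':').Append(value ?? "");
    }
    return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(input.ToString())));
}

var hashHeaders = new[] { "Authorization", "Cookie" };
var url = "http://products/graphql";
var body = "{\"query\":\"{ topProducts { name } }\"}";

var noHeaders = new Dictionary<string, string>();
var alice = new Dictionary<string, string> { ["Authorization"] = "Bearer alice-token" };
var bob = new Dictionary<string, string> { ["Authorization"] = "Bearer bob-token" };

var anonA = ComputeKey(url, body, noHeaders, hashHeaders);
var anonB = ComputeKey(url, body, noHeaders, hashHeaders);
var aliceKey = ComputeKey(url, body, alice, hashHeaders);
var bobKey = ComputeKey(url, body, bob, hashHeaders);

Console.WriteLine(anonA == anonB);    // True: public queries share a key
Console.WriteLine(aliceKey == bobKey); // False: different tokens never share
```

Any header that changes the response but is left out of the hash would let two different responses share a key, which is why the list of hash headers matters.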
How to Enable
Add the request deduplication message handler with .AddRequestDeduplication() to the named HTTP client builder:
builder.Services
    .AddHttpClient("fusion")
    .AddRequestDeduplication();
Customizing Hash Headers
By default, the Authorization and Cookie headers are included in the deduplication hash, which covers most setups. If you need additional headers to be part of the hash, for instance a tenant identifier in a multi-tenant application, add them to HashHeaders:
.AddRequestDeduplication(options =>
{
    options.HashHeaders = ["Authorization", "Cookie", "X-Tenant-Id"];
});
For service-to-service communication where the gateway does not receive cookies, you can remove Cookie from the hash:
.AddRequestDeduplication(options =>
{
    options.HashHeaders = ["Authorization"];
});
What Gets Deduplicated
Only query operations are deduplicated. The following are not deduplicated:
- Mutations: not safe to coalesce because they have side effects.
- Subscriptions: long-lived connections that are inherently unique.
- File uploads: multipart requests bypass deduplication.
Concurrent Request Limiting
The gateway limits the number of simultaneous HTTP requests it processes using a concurrency gate. Capping concurrency keeps the gateway operating in its optimal throughput range. Too many requests competing for the same resources (thread pool, memory, connections) can reduce overall throughput rather than increase it.
The default limit is 64 concurrent HTTP requests. The limit does not reject requests: excess requests are queued, and the GraphQL executor processes at most 64 at a time by default. WebSocket and subscription requests bypass the limit. Depending on how many CPU cores your system has, you may need to raise or lower this value to find the best throughput for your hardware.
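The queue-instead-of-reject behavior resembles a semaphore-based gate. The sketch below is an illustration under stated assumptions, not the gateway's actual code: the limit of 4, the simulated work, and the HandleAsync name are all hypothetical.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

const int limit = 4; // stand-in for MaxConcurrentRequests
var gate = new SemaphoreSlim(limit, limit);
var sync = new object();
var running = 0;
var peak = 0;

async Task HandleAsync()
{
    // Callers beyond the limit wait here; nothing is rejected.
    await gate.WaitAsync();
    try
    {
        lock (sync)
        {
            running++;
            peak = Math.Max(peak, running);
        }

        await Task.Delay(20); // simulated request processing
    }
    finally
    {
        lock (sync)
        {
            running--;
        }
        gate.Release();
    }
}

// 20 requests arrive at once; at most `limit` are processed concurrently.
await Task.WhenAll(Enumerable.Range(0, 20).Select(_ => HandleAsync()));

Console.WriteLine($"limit={limit}, observed peak={peak}");
```

All 20 simulated requests complete, but the observed peak never exceeds the limit. The extra requests pay for that with time spent waiting at the gate.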
Set the limit through ModifyServerOptions on the gateway builder:
builder
    .AddGraphQLGateway()
    .AddFileSystemConfiguration("./gateway.far")
    .ModifyServerOptions(options =>
    {
        options.MaxConcurrentRequests = 128;
    });
You can override this limit for a specific HTTP endpoint using WithOptions:
app.MapGraphQLHttp()
    .WithOptions(options =>
    {
        options.MaxConcurrentRequests = 256;
    });
Tuning Guidance
- Too low: requests queue behind the concurrency gate, adding latency even when gateway and subgraphs have spare capacity.
- Too high: the gateway forwards more requests than it can efficiently process, leading to thread pool starvation and increased latency.
Start with the default of 64 and adjust based on your workload. Set MaxConcurrentRequests to null to disable the limit entirely.
Next Steps
- "I need CDN and HTTP response caching behavior": Cache Control covers @cacheControl, composition merge behavior, and gateway response headers.
- "I need to secure my gateway": Authentication and Authorization covers JWT validation, header propagation, and subgraph-level authorization.
- "I need to deploy this": Deployment & CI/CD covers production deployment patterns and CI pipeline setup.
- "I want to monitor performance": Observability and distributed tracing will be covered in future documentation.