HTTP2 PING frames over AWS ALB (gRPC keepalive ping)

I'm using an AWS Application Load Balancer (ALB) to expose ASP.NET Core gRPC services. The services run in Fargate containers and expose unsecured HTTP ports. The ALB terminates the outer TLS connection and forwards the unencrypted traffic to a target group based on the route. The gRPC application has several client streaming endpoints, and the client can pause streaming for several minutes. I know that HTTP2 PING frames can be used in such cases to keep alive a connection that carries no data for some time.

The gRPC server is configured to send HTTP2 pings every 20 seconds to keep the connection alive. I tested this approach without a load balancer and it works: the ping frames were sent by the server and acknowledged by the client. But the approach fails once the ALB is involved. During the transmission pauses, I don't see any packets from the server behind the load balancer (I use Wireshark). Then, after a timeout of 1 minute, the ALB resets the connection.

I tried client-sent HTTP2 pings as well, but the connection is also reset after 1 minute, and I have no evidence that these ping packets actually reached the server behind the ALB. My assumption is that the AWS ALB doesn't pass such frames through, but I couldn't find any documentation that confirms it.



Solution 1:[1]

ALB forwards requests based on HTTP protocol semantics, not raw HTTP/2 frames. Therefore something like a ping frame only applies to one of the hops (client to ALB, or ALB to server) and is not relayed end to end.
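
As a rough illustration of why a ping stays on one hop, here is a toy model in Go. It is not how the ALB is implemented; the `proxy`, `hop`, and `fromClient` names are invented for this sketch. It assumes the proxy keeps two separate connections, answers a PING on the connection where it arrived, and relays only message frames to the backend.

```go
package main

import "fmt"

// frameType names a few HTTP/2 frame types.
type frameType string

const (
	frameHeaders frameType = "HEADERS"
	frameData    frameType = "DATA"
	framePing    frameType = "PING"
)

// hop records the frames seen on one connection.
type hop struct{ seen []frameType }

// proxy is a toy model (an assumption, not ALB internals): it owns two
// independent connections and relays only request/response content.
type proxy struct{ front, back hop }

// fromClient handles one frame arriving on the client-facing connection and
// returns what the proxy sends back on that same connection, if anything.
func (p *proxy) fromClient(f frameType) string {
	p.front.seen = append(p.front.seen, f)
	if f == framePing {
		// Connection-level frame: answered locally, never relayed.
		return "PING ACK"
	}
	// Message content: re-sent on the separate backend connection.
	p.back.seen = append(p.back.seen, f)
	return ""
}

func main() {
	p := &proxy{}
	for _, f := range []frameType{frameHeaders, frameData, framePing, framePing} {
		p.fromClient(f)
	}
	fmt.Println("client-facing hop saw:", p.front.seen)
	fmt.Println("backend hop saw:      ", p.back.seen)
}
```

In this model the backend hop never sees a PING, which would match the Wireshark capture described in the question.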

If you want an end-to-end ping, you could define a gRPC API that performs the ping. For server-to-client pings you would need a server-side streaming API. But it might actually be preferable to let the clients initiate the pings, to reduce the work the server has to perform.

Solution 2:[2]

The AWS support team responded to my ticket, and the short answer is that ALB does not support HTTP2 ping frames. They suggested increasing the idle timeout on the load balancer, but this solution may not be applicable in some cases.

As Matthias247 already mentioned, a possible workaround is to define a gRPC API dedicated to pinging.

Solution 3:[3]

We mocked a gRPC client in Go with the following HTTP2 ping frame (keepalive) configuration:

    conn, err := grpc.Dial(ServerAddress,
        grpc.WithTransportCredentials(cred),
        grpc.WithBlock(),
        grpc.WithKeepaliveParams(keepalive.ClientParameters{
            Time:                time.Second * 11,
            Timeout:             time.Second * 3,
            PermitWithoutStream: true,
        }),
    )
    if err != nil {
        log.Fatalf("net.Connect err: %v", err)
    }
    defer conn.Close()

    grpcClient := protocol.NewChatClient(conn)
    ctx, cancel := context.WithTimeout(context.Background(), time.Millisecond*3000)
    defer cancel()
    stream, err := grpcClient.Stream(ctx)
    if err != nil {
        log.Fatalf("get Bidirectional stream err: %v", err)
    }

    for i := 0; i < 2; i++ {
        // send stream message
        err = stream.Send(&protocol.StreamRequest{
            // ...
        })
        if err != nil {
            log.Fatalf("stream request err: %v", err)
        }

        // wait for the server's reply; the message itself is not used here
        _, err = stream.Recv()
        if err != nil {
            log.Fatalf("Conversations get stream err: %v", err)
        }

        // stay idle for 63 seconds, just over the 1-minute timeout from the question
        time.Sleep(63 * time.Second)
    }

    select {}

We then ran the mock client with GODEBUG=http2debug=2.

After 3 seconds, the client sends RST_STREAM to the ALB:

    http2: Framer xxx: wrote RST_STREAM stream=1 len=4 ErrCode=CANCEL

After that, we noticed that the mock client sends PING frames every 11 seconds and receives an ACK for each:

    http2: Framer xxx: wrote PING len=8 ping="\x00\x00\x00\x00\x00\x00\x00\x00"
    http2: Framer xxx: read PING flags=ACK len=8 ping="\x00\x00\x00\x00\x00\x00\x00\x00"

Per the gRPC keepalive documentation:

"gRPC sends http2 pings on the transport to detect if the connection is down"

IMO, the ALB handles PING frames correctly and keeps the gRPC connection alive; the stream ended only because the client closed it. It also seems that the ALB idle timeout may close just the idle stream, while the PING frames keep the underlying gRPC connection alive.


To reuse the gRPC connection, we could try sending each request on its own stream:

    grpcClient := protocol.NewChatClient(conn)
    ctx, cancel := context.WithTimeout(context.Background(), time.Millisecond*3000)
    defer cancel()

    for i := 0; i < 2; i++ {
        stream, err := grpcClient.Stream(ctx)
        if err != nil {
            log.Fatalf("get BidirectionalHello stream err: %v", err)
        }

        // send stream message
        stream.Send(...)
    }

ALB controller version: aws-alb-ingress-controller:v2.1.3

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Matthias247
Solution 2 Akenolt
Solution 3