Google Pub/Sub ERROR com.google.cloud.pubsub.v1.StreamingSubscriberConnection
I have a Snowplow enricher application hosted in GKE that consumes messages from a Google Pub/Sub subscription, and the enricher application is throwing the error below.
I can see the num_undelivered_messages count spiking (going above 50,000) on the Pub/Sub subscription 3-4 times a day, and I presume these errors occur because the enricher application is unable to fetch messages from that subscription.
Why is the application unable to connect to the Pub/Sub subscription at times?
Any help is really appreciated.
Apr 12, 2022 12:30:32 PM com.google.cloud.pubsub.v1.StreamingSubscriberConnection$2 onFailure
WARNING: failed to send operations
com.google.api.gax.rpc.UnavailableException: io.grpc.StatusRuntimeException: UNAVAILABLE: 502:Bad Gateway
at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:69)
at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:72)
at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:60)
at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:68)
at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1050)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1176)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:969)
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:760)
at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:545)
at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:515)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:426)
at io.grpc.internal.ClientCallImpl.access$500(ClientCallImpl.java:66)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:689)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$900(ClientCallImpl.java:577)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:751)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:740)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: 502:Bad Gateway
at io.grpc.Status.asRuntimeException(Status.java:533)
... 15 more
Solution 1:[1]
The accumulation of messages in the subscription suggests that your subscribers are not keeping up with the flow of messages.
To monitor your subscribers, you can create a dashboard that contains the backlog metrics num_undelivered_messages and oldest_unacked_message_age (the age of the oldest unacknowledged message in the subscription's backlog), aggregated by resource for all your subscriptions.
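As a rough illustration (not part of the original answer), these backlog metrics can also be read programmatically with the Cloud Monitoring Java client; the project and subscription names below are placeholders:

```java
import com.google.cloud.monitoring.v3.MetricServiceClient;
import com.google.monitoring.v3.ListTimeSeriesRequest;
import com.google.monitoring.v3.ProjectName;
import com.google.monitoring.v3.TimeInterval;
import com.google.monitoring.v3.TimeSeries;
import com.google.protobuf.util.Timestamps;

public class BacklogMetrics {
  public static void main(String[] args) throws Exception {
    String projectId = "my-project";          // placeholder project
    String subscriptionId = "enricher-sub";   // placeholder subscription
    long now = System.currentTimeMillis();

    try (MetricServiceClient client = MetricServiceClient.create()) {
      ListTimeSeriesRequest request = ListTimeSeriesRequest.newBuilder()
          .setName(ProjectName.of(projectId).toString())
          .setFilter("metric.type=\"pubsub.googleapis.com/subscription/num_undelivered_messages\""
              + " AND resource.labels.subscription_id=\"" + subscriptionId + "\"")
          .setInterval(TimeInterval.newBuilder()
              .setStartTime(Timestamps.fromMillis(now - 3_600_000L)) // last hour
              .setEndTime(Timestamps.fromMillis(now))
              .build())
          .build();
      // Each point is the number of messages waiting to be delivered at that time.
      for (TimeSeries ts : client.listTimeSeries(request).iterateAll()) {
        ts.getPointsList().forEach(p ->
            System.out.println(Timestamps.toString(p.getInterval().getEndTime())
                + " -> " + p.getValue().getInt64Value()));
      }
    }
  }
}
```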
If both oldest_unacked_message_age and num_undelivered_messages are growing, the subscribers are not keeping up with the message volume. Solution: add more subscriber threads/machines and look for any bugs in your code that might prevent messages from being acknowledged.
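A minimal sketch of scaling out a single subscriber process, assuming the standard google-cloud-pubsub Java client that the stack trace above points to; the parallel pull count and thread count are illustrative values, not recommendations:

```java
import java.util.concurrent.TimeUnit;
import com.google.api.gax.core.InstantiatingExecutorProvider;
import com.google.cloud.pubsub.v1.AckReplyConsumer;
import com.google.cloud.pubsub.v1.MessageReceiver;
import com.google.cloud.pubsub.v1.Subscriber;
import com.google.pubsub.v1.ProjectSubscriptionName;
import com.google.pubsub.v1.PubsubMessage;

public class ScaledSubscriber {
  public static void main(String[] args) throws Exception {
    ProjectSubscriptionName subscription =
        ProjectSubscriptionName.of("my-project", "enricher-sub"); // placeholders

    MessageReceiver receiver = (PubsubMessage message, AckReplyConsumer consumer) -> {
      // The real enrichment step would process the message here.
      consumer.ack();
    };

    Subscriber subscriber = Subscriber.newBuilder(subscription, receiver)
        // Open several streaming pull connections instead of the default single stream.
        .setParallelPullCount(4)
        // Give the client more threads to run the MessageReceiver callbacks.
        .setExecutorProvider(
            InstantiatingExecutorProvider.newBuilder().setExecutorThreadCount(16).build())
        .build();

    subscriber.startAsync().awaitRunning();
    subscriber.awaitTerminated(30, TimeUnit.MINUTES); // keep the process alive for the demo
  }
}
```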
If there is a steady, small backlog with a steadily growing oldest_unacked_message_age, there may be a small number of messages that cannot be processed, i.e. the messages are getting stuck. Solution: check your application logs to understand whether some messages are causing your code to crash. It is unlikely, but possible, that the offending messages are stuck on Pub/Sub rather than in your client.
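One hedged way to make such stuck or poison messages visible is a MessageReceiver that logs and nacks any message whose processing throws; enrich() below is a hypothetical placeholder for the real processing step, not code from the original answer:

```java
import com.google.cloud.pubsub.v1.AckReplyConsumer;
import com.google.cloud.pubsub.v1.MessageReceiver;
import com.google.pubsub.v1.PubsubMessage;
import java.util.logging.Level;
import java.util.logging.Logger;

public class SafeReceiver implements MessageReceiver {
  private static final Logger LOG = Logger.getLogger(SafeReceiver.class.getName());

  @Override
  public void receiveMessage(PubsubMessage message, AckReplyConsumer consumer) {
    try {
      enrich(message);   // hypothetical processing step
      consumer.ack();
    } catch (Exception e) {
      // Log the offending message id so stuck/poison messages show up in the logs,
      // then nack so Pub/Sub redelivers (ideally routed to a dead-letter topic).
      LOG.log(Level.SEVERE, "Failed to process message " + message.getMessageId(), e);
      consumer.nack();
    }
  }

  private void enrich(PubsubMessage message) {
    // placeholder for the real enrichment logic
  }
}
```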
If the oldest_unacked_message_age exceeds the subscription's message retention duration, there is a high chance of data loss; in that case the best option is to set up alerts that fire before the subscription's message retention duration lapses.
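As an illustrative sketch only (the answer does not prescribe a particular API), such an alert could be created with the Cloud Monitoring Java client; the 6-day threshold assumes a 7-day retention period, and both values are assumptions:

```java
import com.google.cloud.monitoring.v3.AlertPolicyServiceClient;
import com.google.monitoring.v3.AlertPolicy;
import com.google.monitoring.v3.AlertPolicy.Condition;
import com.google.monitoring.v3.AlertPolicy.Condition.MetricThreshold;
import com.google.monitoring.v3.ComparisonType;
import com.google.monitoring.v3.ProjectName;
import com.google.protobuf.Duration;

public class BacklogAgeAlert {
  public static void main(String[] args) throws Exception {
    String projectId = "my-project"; // placeholder project

    try (AlertPolicyServiceClient client = AlertPolicyServiceClient.create()) {
      MetricThreshold threshold = MetricThreshold.newBuilder()
          .setFilter("metric.type=\"pubsub.googleapis.com/subscription/oldest_unacked_message_age\""
              + " AND resource.type=\"pubsub_subscription\"")
          .setComparison(ComparisonType.COMPARISON_GT)
          .setThresholdValue(6 * 24 * 3600)                          // 6 days, in seconds
          .setDuration(Duration.newBuilder().setSeconds(300).build()) // sustained for 5 minutes
          .build();

      Condition condition = Condition.newBuilder()
          .setDisplayName("Backlog age approaching retention")
          .setConditionThreshold(threshold)
          .build();

      AlertPolicy policy = AlertPolicy.newBuilder()
          .setDisplayName("Pub/Sub oldest unacked message age")
          .addConditions(condition)
          .setCombiner(AlertPolicy.ConditionCombinerType.OR)
          .build();

      AlertPolicy created = client.createAlertPolicy(ProjectName.of(projectId), policy);
      System.out.println("Created alert policy: " + created.getName());
    }
  }
}
```

A notification channel would still need to be attached to the policy for the alert to reach anyone.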
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Sakshi Gatyan |
