Gremlin Java client throws a java.util.concurrent.TimeoutException
We have an AWS Neptune db.r4.large instance which, with 2 vCPUs, can run 4 threads and hence at most 4 queries at a time. We use the Gremlin Java client to connect to AWS Neptune. We are seeing regular, persistent java.util.concurrent.TimeoutException exceptions while querying data.
According to this AWS doc: https://docs.aws.amazon.com/neptune/latest/userguide/best-practices-gremlin-java-exceptions.html
it looks to be a case of client-side throttling, since we see no utilization of the Neptune queue (MainRequestQueuePendingRequests = 0). We did as that doc recommends (code below). With maxConnectionPoolSize = maxSimultaneousUsagePerConnection = maxInProcessPerConnection all set to 64, we have maxParallelQueries = 4096. Given our db instance and query latency, this seems like overkill, but we hoped it would at least shift some of these errors onto AWS Neptune, so that we would see the queue being utilized and some throttling exceptions. However, all the requests that reach Neptune are processed correctly. The problem remains that almost half the requests never reach AWS Neptune and error out on the client side.
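For reference, the arithmetic behind the two capacity figures above can be sketched as follows. The `PoolSettings` class and both function names are hypothetical, introduced only for illustration. Taking the smaller of the two per-connection limits is an assumption about how they combine; with all three settings at 64, any reading gives 64 × 64.

```kotlin
// Hypothetical holder for the three client settings named in the question.
data class PoolSettings(
    val maxConnectionPoolSize: Int,
    val maxSimultaneousUsagePerConnection: Int,
    val maxInProcessPerConnection: Int
)

// Upper bound on queries the client can have in flight at once.
// Assumption: the effective per-connection limit is the smaller of the two
// per-connection settings. With all settings equal this choice does not matter.
fun maxParallelQueries(s: PoolSettings): Int {
    val perConnection = minOf(s.maxSimultaneousUsagePerConnection, s.maxInProcessPerConnection)
    return s.maxConnectionPoolSize * perConnection
}

// Server-side capacity as described in the question: 2 vCPUs -> 4 threads.
fun serverWorkerThreads(vCpus: Int): Int = 2 * vCpus

fun main() {
    println(maxParallelQueries(PoolSettings(64, 64, 64))) // 4096
    println(serverWorkerThreads(2))                       // 4
}
```

The gap between the two numbers (4096 client-side slots against 4 server threads) is why the configuration looks like overkill for this instance size.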
Other relevant metrics are maxWaitForConnection = 16s (default) and avg(aws.neptune.gremlin_web_socket_open_connections) = ~10 (pretty steady). Does anyone have suggestions on how we could resolve these exceptions?
    @Bean("gremlinClusterRW", destroyMethod = "close")
    fun gremlinClusterRW(): Cluster {
        return Cluster.build()
            .addContactPoint(endpointRW)
            .port(port)
            .channelizer(SigV4WebSocketChannelizer::class.java)
            .maxConnectionPoolSize(64)
            .maxInProcessPerConnection(64)
            .maxSimultaneousUsagePerConnection(64)
            .enableSsl(true)
            .create()
    }
Exception Stacktrace:
org.springframework.kafka.listener.ListenerExecutionFailedException: Listener method 'public void com.listeners.a.b(org.apache.kafka.clients.consumer.ConsumerRecord<java.lang.Long, a>)' threw exception; nested exception is java.lang.IllegalStateException: org.apache.tinkerpop.gremlin.process.remote.RemoteConnectionException: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timed out while waiting for an available host - check the client configuration and connectivity to the server if this message persists; nested exception is java.lang.IllegalStateException: org.apache.tinkerpop.gremlin.process.remote.RemoteConnectionException: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timed out while waiting for an available host - check the client configuration and connectivity to the server if this message persists
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.decorateException(KafkaMessageListenerContainer.java:2114)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeErrorHandler(KafkaMessageListenerContainer.java:2106)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doInvokeRecordListener(KafkaMessageListenerContainer.java:2001)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doInvokeWithRecords(KafkaMessageListenerContainer.java:1928)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeRecordListener(KafkaMessageListenerContainer.java:1814)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeListener(KafkaMessageListenerContainer.java:1531)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1178)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1075)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.IllegalStateException: org.apache.tinkerpop.gremlin.process.remote.RemoteConnectionException: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timed out while waiting for an available host - check the client configuration and connectivity to the server if this message persists
at
...
(redacted)
...
at jdk.internal.reflect.GeneratedMethodAccessor251.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.springframework.messaging.handler.invocation.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:171)
at org.springframework.messaging.handler.invocation.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:120)
at org.springframework.kafka.listener.adapter.HandlerAdapter.invoke(HandlerAdapter.java:48)
at org.springframework.kafka.listener.adapter.MessagingMessageListenerAdapter.invokeHandler(MessagingMessageListenerAdapter.java:330)
at org.springframework.kafka.listener.adapter.RecordMessagingMessageListenerAdapter.onMessage(RecordMessagingMessageListenerAdapter.java:86)
at org.springframework.kafka.listener.adapter.RecordMessagingMessageListenerAdapter.onMessage(RecordMessagingMessageListenerAdapter.java:51)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doInvokeOnMessage(KafkaMessageListenerContainer.java:2069)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeOnMessage(KafkaMessageListenerContainer.java:2051)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doInvokeRecordListener(KafkaMessageListenerContainer.java:1988)
... 8 common frames omitted
Caused by: org.apache.tinkerpop.gremlin.process.remote.RemoteConnectionException: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timed out while waiting for an available host - check the client configuration and connectivity to the server if this message persists
at org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection.submitAsync(DriverRemoteConnection.java:227)
at org.apache.tinkerpop.gremlin.process.remote.traversal.step.map.RemoteStep.promise(RemoteStep.java:89)
... 36 common frames omitted
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timed out while waiting for an available host - check the client configuration and connectivity to the server if this message persists
at org.apache.tinkerpop.gremlin.driver.Client$AliasClusteredClient.submitAsync(Client.java:573)
at org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection.submitAsync(DriverRemoteConnection.java:225)
... 37 common frames omitted
Caused by: java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timed out while waiting for an available host - check the client configuration and connectivity to the server if this message persists
at org.apache.tinkerpop.gremlin.driver.Client.submitAsync(Client.java:371)
at org.apache.tinkerpop.gremlin.driver.Client$AliasClusteredClient.submitAsync(Client.java:591)
at org.apache.tinkerpop.gremlin.driver.Client$AliasClusteredClient.submitAsync(Client.java:571)
... 38 common frames omitted
Caused by: java.util.concurrent.TimeoutException: Timed out while waiting for an available host - check the client configuration and connectivity to the server if this message persists
at org.apache.tinkerpop.gremlin.driver.Client$ClusteredClient.chooseConnection(Client.java:495)
at org.apache.tinkerpop.gremlin.driver.Client$AliasClusteredClient.chooseConnection(Client.java:630)
at org.apache.tinkerpop.gremlin.driver.Client.submitAsync(Client.java:366)
... 40 common frames omitted
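
The innermost cause in the trace is raised in `chooseConnection`, called from `Client.submitAsync`, which is consistent with the request failing on the client before anything is sent to Neptune. A toy model of that kind of failure is sketched below using a plain `java.util.concurrent.Semaphore`. `ToyPool` and `secondCallerTimesOut` are hypothetical names, and this is not the TinkerPop driver's implementation; it only illustrates a bounded wait for a client-side slot that ends in a `TimeoutException`.

```kotlin
import java.util.concurrent.Semaphore
import java.util.concurrent.TimeUnit
import java.util.concurrent.TimeoutException

// Hypothetical stand-in for a client-side connection pool (not the driver's code).
class ToyPool(slots: Int, private val maxWaitMillis: Long) {
    private val permits = Semaphore(slots)

    // Borrow a slot for the duration of `work`, or fail locally after the bounded wait.
    fun <T> withSlot(work: () -> T): T {
        if (!permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS)) {
            throw TimeoutException("Timed out while waiting for a free slot")
        }
        try {
            return work()
        } finally {
            permits.release()
        }
    }
}

// Returns true when a second borrow times out while the only slot is still held.
fun secondCallerTimesOut(): Boolean {
    val pool = ToyPool(slots = 1, maxWaitMillis = 50)
    return pool.withSlot {
        // The single slot is held here, so the nested borrow cannot succeed.
        try {
            pool.withSlot { "unreachable" }
            false
        } catch (e: TimeoutException) {
            true
        }
    }
}

fun main() {
    println(secondCallerTimesOut()) // prints true: the failure never leaves the client
}
```

In this model the server is never contacted, which would match the symptom described in the question: an idle server-side queue alongside client-side timeouts.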
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow