Getting exception com.hazelcast.spi.exception.TargetNotMemberException: Not Member! target
A single Hazelcast node is running, which means the cluster has only one member. We get the exception below once in a while: the system runs continuously for, say, 10 days, and from the 10th day onward we start receiving this exception.
Hazelcast version is 3.12.10
The call that results in this exception is: return resolveExecutor(task).submitToMember(new HazelcastAdapterTask(key, task), cluster.getLocalMember());
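For context, the surrounding code boils down to roughly the following (a simplified sketch only; HazelcastAdapterTask is our own class, and the executor name, field names and constructor here are placeholders):

    import java.io.Serializable;
    import java.util.concurrent.Callable;
    import java.util.concurrent.Future;

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.IExecutorService;

    public class SubmitToLocalMemberSketch {

        // Placeholder for our real task; the real HazelcastAdapterTask wraps a key and a task.
        static class HazelcastAdapterTask implements Callable<Void>, Serializable {
            private final String key;
            HazelcastAdapterTask(String key) { this.key = key; }
            @Override
            public Void call() {
                // real work happens here
                return null;
            }
        }

        public static void main(String[] args) throws Exception {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            // resolveExecutor(task) in our code ultimately returns an IExecutorService like this one.
            IExecutorService executor = hz.getExecutorService("task-executor");

            // The call from the question: submit the task to the local (and only) member.
            Future<Void> future = executor.submitToMember(
                    new HazelcastAdapterTask("some-key"),
                    hz.getCluster().getLocalMember());

            future.get();
            hz.shutdown();
        }
    }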
If the Hazelcast instance were down, the system would instead throw com.hazelcast.core.HazelcastInstanceNotActiveException: Hazelcast instance is not active!.
We have no clue why we are getting com.hazelcast.spi.exception.TargetNotMemberException: Not Member! target:
After a restart, everything works fine again.
1. Does anyone know the reason for this?
2. With multiple nodes it would be understandable: the member to which the task is being submitted has somehow gone down. Why does this exception occur when only a single node is running?
3. Are there any debug logs we can enable in Hazelcast? (See the sketch below for the kind of configuration we have in mind.)
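Regarding question 3, this is the kind of configuration we are thinking of enabling (a minimal sketch, assuming programmatic configuration; the property names are the logging and diagnostics knobs from Hazelcast 3.x, the directory path is a placeholder):

    import com.hazelcast.config.Config;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;

    public class DiagnosticsConfigSketch {

        public static void main(String[] args) {
            Config config = new Config();

            // Route Hazelcast logging through our logging framework so we can raise
            // the com.hazelcast logger to DEBUG when the problem window approaches.
            config.setProperty("hazelcast.logging.type", "slf4j");

            // Hazelcast's built-in diagnostics subsystem periodically writes member
            // and invocation state to plain-text files for later analysis.
            config.setProperty("hazelcast.diagnostics.enabled", "true");
            config.setProperty("hazelcast.diagnostics.directory", "/var/log/hazelcast-diagnostics");

            HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
            // ... application code ...
        }
    }

The same properties can also be passed as -D system properties at startup instead of via Config.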
We tried to reproduce the issue but were not able to.
Caused by: com.hazelcast.spi.exception.TargetNotMemberException: Not Member! target: [10.232.104.29]:44536, partitionId: -1, operation: com.hazelcast.executor.impl.operations.MemberCallableTaskOperation, service: hz:impl:executorService
    at com.hazelcast.spi.impl.operationservice.impl.Invocation.initInvocationTarget(Invocation.java:307)
    at com.hazelcast.spi.impl.operationservice.impl.Invocation.doInvoke(Invocation.java:614)
    at com.hazelcast.spi.impl.operationservice.impl.Invocation.invoke0(Invocation.java:592)
    at com.hazelcast.spi.impl.operationservice.impl.Invocation.invoke(Invocation.java:256)
    at com.hazelcast.spi.impl.operationservice.impl.OperationServiceImpl.invokeOnTarget(OperationServiceImpl.java:326)
    at com.hazelcast.executor.impl.ExecutorServiceProxy.submitToMember(ExecutorServiceProxy.java:319)
    at com.hazelcast.executor.impl.ExecutorServiceProxy.submitToMember(ExecutorServiceProxy.java:308)
    at com.tpt.atlant.grid.task.hazelcast.HazelcastTaskExecutor.submit(HazelcastTaskExecutor.java:166)
    at com.tpt.valuation.ion.service.GenerateUpdateObligationTalkFunctionImpl.delegateTask(GenerateUpdateObligationTalkFunctionImpl.java:98)
    at com.tpt.valuation.ion.service.GenerateUpdateObligationTalkFunctionImpl.lambda$generateUpdateObligation$0(GenerateUpdateObligationTalkFunctionImpl.java:72)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
    at com.iontrading.isf.executors.impl.monitoring.e.run(MonitoredRunnable.java:16)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    at ------ submitted from ------.(Unknown Source)
    at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.resolve(InvocationFuture.java:126)
    at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.resolveAndThrowIfException(InvocationFuture.java:79)
    ... 12 more
============ Update ============
I analyzed JVM memory usage; it never exceeds 55%.
Then I went through two months of logs. The application had started successfully and had been running continuously for 5-6 days when I observed the following in the logs:
2021-11-04:-
2021-11-04 14:11:50,566 [hz._hzInstance_1_tpt-valuation_TEST02.cached.thread-73] WARN NioChannelOptions (log:46) - The configured tcp receive buffer size conflicts with the value actually being used by the socket and can lead to sub-optimal performance. Configured 1048576 bytes, actual 212992 bytes. On Linux look for kernel parameters 'net.ipv4.tcp_rmem' and 'net.core.rmem_max'.This warning will only be shown once.
2021-11-04 14:11:50,609 [hz._hzInstance_1_tpt-valuation_TEST02.migration] WARN MigrationManager (log:51) - [10.232.104.29]:44536 [tpt-valuation_TEST02] [3.12.10] partitionId=0 is completely lost!
2021-11-04 14:11:50,610 [hz._hzInstance_1_tpt-valuation_TEST02.migration] WARN MigrationManager (log:51) - [10.232.104.29]:44536 [tpt-valuation_TEST02] [3.12.10] partitionId=1 is completely lost!
2021-11-04 14:11:50,610 [hz._hzInstance_1_tpt-valuation_TEST02.migration] WARN MigrationManager (log:51) - [10.232.104.29]:44536 [tpt-valuation_TEST02] [3.12.10] partitionId=2 is completely lost!
2021-11-18:-
2021-11-18 19:13:20,434 [hz._hzInstance_1_tpt-valuation_TEST02.cached.thread-52] WARN NioChannelOptions (log:46) - The configured tcp receive buffer size conflicts with the value actually being used by the socket and can lead to sub-optimal performance. Configured 1048576 bytes, actual 212992 bytes. On Linux look for kernel parameters 'net.ipv4.tcp_rmem' and 'net.core.rmem_max'.This warning will only be shown once.
2021-11-18 19:13:20,478 [hz._hzInstance_1_tpt-valuation_TEST02.migration] WARN MigrationManager (log:51) - [10.232.104.29]:44536 [tpt-valuation_TEST02] [3.12.10] partitionId=0 is completely lost!
2021-11-18 19:13:20,479 [hz._hzInstance_1_tpt-valuation_TEST02.migration] WARN MigrationManager (log:51) - [10.232.104.29]:44536 [tpt-valuation_TEST02] [3.12.10] partitionId=1 is completely lost!
2021-11-18 19:13:20,479 [hz._hzInstance_1_tpt-valuation_TEST02.migration] WARN MigrationManager (log:51) - [10.232.104.29]:44536 [tpt-valuation_TEST02] [3.12.10] partitionId=2 is completely lost!
2021-11-18 19:13:20,479 [hz._hzInstance_1_tpt-valuation_TEST02.migration] WARN MigrationManager (log:51) - [10.232.104.29]:44536 [tpt-valuation_TEST02] [3.12.10] partitionId=3 is completely lost!
Only a single Hazelcast node is running.
Is there any specific reason for partitions being reported lost when only a single node is running?
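To capture more context the next time this happens, we are considering registering cluster and partition listeners so the logs show exactly when the member view changes or a partition is reported lost (a minimal sketch against the 3.12 API; the println calls are placeholders for our real logging):

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.MemberAttributeEvent;
    import com.hazelcast.core.MembershipEvent;
    import com.hazelcast.core.MembershipListener;
    import com.hazelcast.partition.PartitionLostEvent;
    import com.hazelcast.partition.PartitionLostListener;

    public class ClusterDiagnosticsListeners {

        public static void main(String[] args) {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            // Logs every change in the cluster's member view; with a single node this
            // should stay silent, so any event here would point at the root cause.
            hz.getCluster().addMembershipListener(new MembershipListener() {
                @Override
                public void memberAdded(MembershipEvent event) {
                    System.out.println("Member added: " + event.getMember());
                }

                @Override
                public void memberRemoved(MembershipEvent event) {
                    System.out.println("Member removed: " + event.getMember());
                }

                @Override
                public void memberAttributeChanged(MemberAttributeEvent event) {
                    System.out.println("Member attribute changed: " + event);
                }
            });

            // Fires when a partition and all of its backups are lost, matching the
            // "partitionId=N is completely lost!" warnings in the logs above.
            hz.getPartitionService().addPartitionLostListener(new PartitionLostListener() {
                @Override
                public void partitionLost(PartitionLostEvent event) {
                    System.out.println("Partition lost: id=" + event.getPartitionId()
                            + ", lostBackupCount=" + event.getLostBackupCount()
                            + ", source=" + event.getEventSource());
                }
            });
        }
    }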