Flink cluster with ZooKeeper HA always shuts down: [RECEIVED SIGNAL 15: SIGTERM]
Environment:
Flink 1.14.4, standalone application mode on Kubernetes,
set up according to the official guides:
Flink cluster: https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/resource-providers/standalone/kubernetes/#application-mode
ZooKeeper HA: https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/ha/zookeeper_ha/
The problem:
The jobmanager shuts down and restarts every three minutes, and eventually the pod quits.
-- There is no timer task, and the program logic is just a simple word count.
-- The problem also occurs every three minutes when the cluster is running with no input and nothing to do.
-- Without ZooKeeper HA, the jobmanager does not have this problem.
The question:
Why does the jobmanager always shut down when ZooKeeper HA is enabled, and how can I solve it?
I used the same steps and YAML from the official site, so I have no idea what causes this problem.
The code:
It is just a word count; other programs have the same problem.
public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment executionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment();
    // Read lines of text from a socket (HOST and PORT are defined elsewhere in the class)
    DataStreamSource<String> dataStreamSource = executionEnvironment.socketTextStream(HOST, PORT);
    // Split lines into (word, 1) pairs, key by the word (field 0), and sum the counts (field 1)
    DataStream<Tuple2<String, Integer>> sum = dataStreamSource.flatMap(new WordCount.MyFlatMapper()).keyBy(0).sum(1);
    sum.print();
    executionEnvironment.execute();
}
Jobmanager pod restarts and then quits:
NAMESPACE NAME READY STATUS RESTARTS AGE
default flink-jobmanager-8jn6x 1/1 Running 1 (118s ago) 5m38s
default flink-jobmanager-8jn6x 1/1 Running 2 (106s ago) 8m26s
default flink-jobmanager-8jn6x 1/1 Running 3 (1s ago) 9m41s
default flink-jobmanager-8jn6x 1/1 Running 4 (1s ago) 12m
default flink-jobmanager-8jn6x 1/1 Running 5 (0s ago) 15m
default flink-jobmanager-8jn6x 1/1 Running 6 (1s ago) 18m
default flink-jobmanager-8jn6x 1/1 Terminating 6 (1s ago) 18m
default flink-jobmanager-8jn6x 1/1 Terminating 6 (1s ago) 18m
Jobmanager logs:
-1--
2022-04-23 09:48:21,970 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Triggering checkpoint 33 (type=CHECKPOINT) @ 1650707301963 for job 00000000000000000000000000000000.
2022-04-23 09:48:22,010 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Completed checkpoint 33 for job 00000000000000000000000000000000 (4917 bytes, checkpointDuration=23 ms, finalizationTime=24 ms).
2022-04-23 09:48:26,627 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2022-04-23 09:48:26,795 WARN akka.actor.CoordinatedShutdown [] - Could not addJvmShutdownHook, due to: Shutdown in progress
2022-04-23 09:48:26,822 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Shutting down remote daemon.
2022-04-23 09:48:26,824 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Shutting down remote daemon.
2022-04-23 09:48:26,824 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Remote daemon shut down; proceeding with flushing remote transports.
2022-04-23 09:48:26,838 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Remote daemon shut down; proceeding with flushing remote transports.
2022-04-23 09:48:26,887 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Remoting shut down.
2022-04-23 09:48:26,894 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Remoting shut down.
---
-2--
2022-04-23 09:51:24,903 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Triggering checkpoint 67 (type=CHECKPOINT) @ 1650707484897 for job 00000000000000000000000000000000.
2022-04-23 09:51:24,943 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Completed checkpoint 67 for job 00000000000000000000000000000000 (4982 bytes, checkpointDuration=21 ms, finalizationTime=25 ms).
2022-04-23 09:51:26,626 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2022-04-23 09:51:26,840 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Shutting down remote daemon.
2022-04-23 09:51:26,845 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Shutting down remote daemon.
2022-04-23 09:51:26,847 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Remote daemon shut down; proceeding with flushing remote transports.
2022-04-23 09:51:26,848 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Remote daemon shut down; proceeding with flushing remote transports.
2022-04-23 09:51:26,871 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Remoting shut down.
---
-3--
2022-04-23 09:54:26,625 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2022-04-23 09:54:26,838 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Shutting down remote daemon.
2022-04-23 09:54:26,840 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Remote daemon shut down; proceeding with flushing remote transports.
---
-4--
2022-04-23 09:57:26,627 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2022-04-23 09:57:26,632 INFO org.apache.flink.runtime.blob.BlobServer [] - Stopped BLOB server at 0.0.0.0:6124
2022-04-23 09:57:26,812 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Shutting down remote daemon.
2022-04-23 09:57:26,812 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Remote daemon shut down; proceeding with flushing remote transports.
---
-5--
2022-04-23 10:00:26,625 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2022-04-23 10:00:26,859 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Shutting down remote daemon.
2022-04-23 10:00:26,859 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator [] - Remote daemon shut down; proceeding with flushing remote transports.
2022-04-23 10:00:26,884 WARN akka.actor.CoordinatedShutdown [] - Could not addJvmShutdownHook, due to: Shutdown in progress
---
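To check how regular the shutdowns are, the gaps between the SIGTERM lines in the five excerpts above can be computed directly from their timestamps. This is only a minimal sketch: the class name `SigtermIntervals` and the helper `gapsMillis` are made up for illustration, and the timestamps are copied by hand from the log lines above.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of Flink.
public class SigtermIntervals {
    // Timestamp layout used by the jobmanager log lines above.
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Returns the gap in milliseconds between each pair of consecutive timestamps.
    static List<Long> gapsMillis(List<String> timestamps) {
        List<Long> gaps = new ArrayList<>();
        for (int i = 1; i < timestamps.size(); i++) {
            LocalDateTime prev = LocalDateTime.parse(timestamps.get(i - 1), LOG_TS);
            LocalDateTime curr = LocalDateTime.parse(timestamps.get(i), LOG_TS);
            gaps.add(Duration.between(prev, curr).toMillis());
        }
        return gaps;
    }

    public static void main(String[] args) {
        // Timestamps of the "RECEIVED SIGNAL 15: SIGTERM" lines in excerpts 1 to 5.
        List<String> sigterms = List.of(
                "2022-04-23 09:48:26,627",
                "2022-04-23 09:51:26,626",
                "2022-04-23 09:54:26,625",
                "2022-04-23 09:57:26,627",
                "2022-04-23 10:00:26,625");
        for (long gap : gapsMillis(sigterms)) {
            System.out.println(gap + " ms");
        }
    }
}
```

Every gap comes out within a few milliseconds of 180000 ms, so the signal arrives on an almost exact three-minute schedule regardless of what the job is doing.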
-- updated 2022/04/30 --
Debug logs: https://www.mediafire.com/file/3q8vpzqfnmohgng/debug.log/file
Thanks, all!
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow