Spark 3.2: deserialization time of the first task is too large

Version: Spark Structured Streaming 3.2.0. Environment: YARN, 1 driver, 2 executors.

stage img

task img

This is a Spark Structured Streaming program: the source is Kafka, followed by a group-by operator, and the sink is MongoDB.
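A pipeline of this shape can be sketched roughly as follows. This is an illustrative PySpark outline, not the original program: the function names, the grouping key, the `demo` database and the `counts` collection are all hypothetical, and it assumes the Kafka source package and the MongoDB Spark Connector 10.x (format name `mongodb`) are on the classpath, with the connection URI supplied through `spark.mongodb.write.connection.uri`.

```python
# Illustrative sketch only: Kafka source -> group by -> MongoDB sink.
# All names (functions, database, collection) are hypothetical.


def build_counts(spark, bootstrap_servers, topic):
    """Read a Kafka topic as a stream and count records per key."""
    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .load()
    )
    # Kafka keys arrive as binary, so cast them before grouping.
    return events.selectExpr("CAST(key AS STRING) AS key").groupBy("key").count()


def write_to_mongo(batch_df, batch_id):
    """Write one micro-batch to MongoDB.

    Assumes MongoDB Spark Connector 10.x and that the session config
    sets spark.mongodb.write.connection.uri.
    """
    (
        batch_df.write.format("mongodb")
        .mode("append")
        .option("database", "demo")        # hypothetical database name
        .option("collection", "counts")    # hypothetical collection name
        .save()
    )


def start_query(spark, bootstrap_servers, topic):
    """Start the streaming query; each micro-batch goes through write_to_mongo."""
    counts = build_counts(spark, bootstrap_servers, topic)
    return (
        counts.writeStream.outputMode("update")
        .foreachBatch(write_to_mongo)
        .start()
    )
```

With a real `SparkSession` named `spark`, the query would be started with something like `start_query(spark, "broker:9092", "events")`, where the broker address and topic are placeholders.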

I don't know why the deserialization time of the first task in the first stage is so large; that task runs for a few minutes.

After that batch finishes, the job recovers and runs normally.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
