Does spark.sql.adaptive.enabled work for Spark Structured Streaming?

I work with Apache Spark Structured Streaming. Structured Streaming is a scalable and fault-tolerant stream processing engine built on the Spark SQL engine. Since it is built on the Spark SQL engine, does that mean spark.sql.adaptive.enabled works for Spark Structured Streaming?



Solution 1:[1]

It is disabled in the Spark code itself. See StreamExecution:

// Adaptive execution can change num shuffle partitions, disallow
sparkSessionForStream.conf.set(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key, "false")

The reason is that adaptive execution can change the number of shuffle partitions, which might cause issues when the stream keeps state (more details in the ticket that added this restriction, SPARK-19873).

If you still want to enable it for Spark Structured Streaming (e.g. if you are sure that it won't cause any harm in your use case), you can do so inside the foreachBatch method by calling batchDF.sparkSession.conf.set(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key, "true"). This overrides the value that the Spark code set when it disabled the feature.

Solution 2:[2]

No. As stated in this doc, https://docs.databricks.com/spark/latest/spark-sql/aqe.html, AQE applies only to non-streaming queries, so it does not apply to Structured Streaming.

Think of statefulness, and of micro-batches that are ideally small datasets. Spark Structured Streaming comes with many restrictions.

Solution 3:[3]

Here:

void consume_ptr( std::shared_ptr<int>&& ptr )
{
    std::shared_ptr<int> new_ptr { ptr }; // calling copy ctor

    std::cout << "Consumed " << ( void* ) new_ptr.get( ) << '\n';
}

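No move happens there: neither a move constructor nor a move assignment operator is called. The argument is simply passed by rvalue reference, and inside the function the named parameter ptr is an lvalue, so new_ptr is copy-constructed from it.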
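To make the copy visible, here is a minimal sketch that reports use_count() instead of the address. The names owners_after_copy and demo_copy_from_rvalue_reference are hypothetical and only serve the illustration. The counts in the comments assume a single thread and no other owners.

```cpp
#include <cassert>
#include <iostream>
#include <memory>
#include <utility>

// Same shape as consume_ptr above, but returns the owner count instead of printing.
// `ptr` has a name, so the expression `ptr` is an lvalue and the copy ctor is chosen.
long owners_after_copy( std::shared_ptr<int>&& ptr )
{
    std::shared_ptr<int> new_ptr { ptr }; // copy: new_ptr and the caller's pointer share the int
    return new_ptr.use_count( );
}

// Hypothetical driver showing what the caller observes.
void demo_copy_from_rvalue_reference( )
{
    auto ptr = std::make_shared<int>( 42 );

    // std::move is only a cast to an rvalue; by itself it transfers nothing.
    const long owners = owners_after_copy( std::move( ptr ) );

    assert( owners == 2 );     // the caller's ptr plus the callee's new_ptr
    assert( ptr != nullptr );  // the caller still owns the int

    std::cout << "owners inside callee: " << owners << '\n';
    std::cout << "caller still owns: " << std::boolalpha << ( ptr != nullptr ) << '\n';
}
```

Calling demo_copy_from_rvalue_reference( ) from main should pass both assertions, which is consistent with the if's body running in the output shown next.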

Output:

Consumed 0x20d77ebe6a0
ptr should be moved?

Now take a look at this:

#include <iostream>
#include <memory>
#include <utility>

void consume_ptr( std::shared_ptr<int>&& ptr )
{
    std::shared_ptr<int> new_ptr { std::move(ptr) }; // calling move ctor

    std::cout << "Consumed " << ( void* ) new_ptr.get( ) << '\n';
}

int main( )
{
    std::shared_ptr<int> ptr { std::make_shared<int>( ) };
    consume_ptr( std::move(ptr) );

    if ( ptr )
    {
        std::cout << "ptr should be moved?" << std::endl;
    }
}

Output:

Consumed 0x21160bbdf40

This time the move happened: new_ptr took over ownership, ptr in main was left empty, and the body of the if did not run.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 yishaiz
Solution 2
Solution 3 digito_evo