'Why does a cell with multiple PySpark "save" operations fail on Zeppelin Notebook?

I have a Zepellin (PySpark) notebook which writes two dataframes to Google Storage buckets in the end, in a single cell, which runs into the following error:

Py4JJavaError: An error occurred while calling o1190.save.
: java.lang.StackOverflowError
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:241)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:406)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:359)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.rewrite$1(QueryPlan.scala:192)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformUpWithNewOutput$1(QueryPlan.scala:193)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:408)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:244)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:406)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:359)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.rewrite$1(QueryPlan.scala:192)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformUpWithNewOutput$1(QueryPlan.scala:193)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:408)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:244)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:406)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:359)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.rewrite$1(QueryPlan.scala:192)
    

But I can get it to work if I split the writes to two different cells. Is there a limit in terms of resources available to a single cell in Zeppelin? It'd be great if there's some documentation on how the notebook utilizes available resources.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source