Do Spark encoders respect Java's rules of inheritance?

My understanding: If I have a model class that extends a second model class, I shouldn't be able to access the private members of the parent class in the child class (unless I use reflection).

Extending this, I expect that when a Spark dataframe is encoded as a dataset of the child model class, it shouldn't have columns that include private members of the parent model class. (But this is not what I observe.)

More concretely, my parent class:

public class Foo {
    private int one;
    protected String two;
    protected double three;
}

The child class:

public class Bar extends Foo {
    private int four;
    protected String five;
}

I have a couple of Bar objects that I use to create a Spark dataframe, i.e., a Dataset<Row>, like so:

Dataset<Row> barDF = session.createDataFrame(barList, Bar.class);

When, at a later point, I want to encode this as a dataset,

Dataset<Bar> barDS = barDF.as(Encoders.bean(Bar.class));

I expect barDS to have four columns (excluding one, the private member of Foo). But the result of barDS.show() is instead:

+------+------+-----+-------+-----+
| five | four | one | three | two |
+------+------+-----+-------+-----+
| 9    | 9    | 0   | 3.0   | 3   |
| 16   | 16   | 0   | 4.0   | 4   |
+------+------+-----+-------+-----+

What am I missing in expecting one not to be present in the dataset? Also, what encoding can I use instead of bean encoding so that Java's rules of inheritance are obeyed?
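A likely explanation is that bean encoding works from JavaBean properties, that is, public getters and setters, rather than from field visibility. A public getOne() declared on Foo is inherited by Bar even though the field one is private, so it is a property of Bar. The sketch below uses only the JDK's java.beans.Introspector, which, as far as I know, is what Encoders.bean relies on to discover columns. The accessors are an assumption, since the classes in the question omit them, and the classes are trimmed to one field each. BeanPropertyDemo and readableProperties are hypothetical names used for illustration.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BeanPropertyDemo {

    // Parent bean: the field is private, but its accessors are public
    // (assumed; the question's snippet does not show them).
    public static class Foo {
        private int one;
        public int getOne() { return one; }
        public void setOne(int one) { this.one = one; }
    }

    // Child bean: cannot touch Foo.one directly, yet inherits getOne()/setOne().
    public static class Bar extends Foo {
        private int four;
        public int getFour() { return four; }
        public void setFour(int four) { this.four = four; }
    }

    // Lists the readable JavaBean properties of a class, sorted by name,
    // skipping the "class" property contributed by Object.getClass().
    static List<String> readableProperties(Class<?> beanClass) {
        List<String> names = new ArrayList<>();
        try {
            for (PropertyDescriptor pd
                    : Introspector.getBeanInfo(beanClass).getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && !"class".equals(pd.getName())) {
                    names.add(pd.getName());
                }
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Prints [four, one]: the inherited property "one" is part of Bar's bean view.
        System.out.println(readableProperties(Bar.class));
    }
}
```

Under that assumption, the column one does not break Java's access rules: the encoder never reads the private field, only the inherited public getter. Removing or narrowing the accessors for one on Foo should keep it out of the bean view, though whether that suits the model is a separate design question.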



Solution 1:[1]

You can clone the pipeline and customize it to your requirements with the following steps:

  1. Edit the pipeline.
  2. Copy its YAML and use it for the new pipeline.
  3. Customize the new pipeline accordingly.

Refer to the Microsoft documentation on customizing a pipeline and on cloning a pipeline.

Solution 2:[2]

I am afraid that you cannot copy the pipelines together with all their individual builds.

Azure DevOps currently only supports copying pipelines from one project to another. You can follow the steps in this doc: Clone or import a pipeline

I fully understand your requirement. I suggest that you create a suggestion ticket on this site.

Thanks for your understanding.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 SaiKarri-MT
Solution 2 Kevin Lu-MSFT