Best way to create a custom Transformer in Java Spark ML

I am learning big data with Apache Spark, and I want to create a custom transformer for Spark ML so that I can execute some aggregate functions or perform other operations on the data.



Solution 1:[1]

You need to extend the org.apache.spark.ml.Transformer class. It is an abstract class, so you have to provide implementations of its abstract methods.
From what I have seen, in most cases you need to provide an implementation of the transform(Dataset<?> dataset) method and of String uid().
Example:

import org.apache.spark.ml.Transformer;
import org.apache.spark.ml.param.ParamMap;
import org.apache.spark.ml.util.Identifiable;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.types.StructType;

public class CustomTransformer extends Transformer {

  private final String uid_;

  public CustomTransformer() {
    this(Identifiable.randomUID("CustomTransformer"));
  }

  public CustomTransformer(String uid) {
    this.uid_ = uid;
  }

  @Override
  public String uid() {
    return uid_;
  }

  @Override
  public Transformer copy(ParamMap extra) {
    return defaultCopy(extra);
  }

  @Override
  public Dataset<Row> transform(Dataset<?> dataset) {
    // do your work here (aggregations, added columns, filtering, ...) and return the resulting Dataset
    return dataset.toDF();
  }

  @Override
  public StructType transformSchema(StructType schema) {
    // return the schema of the transformed Dataset; here the schema is unchanged
    return schema;
  }
}
I am also new to this, so I suggest you learn what each of these abstract methods is used for.
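
To give an idea of how the class above could be exercised, here is a minimal sketch; the class name CustomTransformerExample, the local SparkSession setup, and the input path "path/to/input.json" are my own assumptions for illustration, not part of the original answer. It calls transform directly and also plugs the transformer into an ML Pipeline, which works because Transformer extends PipelineStage.

import org.apache.spark.ml.Pipeline;
import org.apache.spark.ml.PipelineModel;
import org.apache.spark.ml.PipelineStage;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class CustomTransformerExample {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("CustomTransformerExample")
        .master("local[*]")
        .getOrCreate();

    // Hypothetical input data; replace with your own Dataset.
    Dataset<Row> df = spark.read().json("path/to/input.json");

    // Use the transformer directly...
    CustomTransformer transformer = new CustomTransformer();
    Dataset<Row> transformed = transformer.transform(df);
    transformed.show();

    // ...or as a stage in an ML Pipeline.
    Pipeline pipeline = new Pipeline().setStages(new PipelineStage[]{transformer});
    PipelineModel model = pipeline.fit(df);
    model.transform(df).show();

    spark.stop();
  }
}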

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

[1] Solution 1 - Source: Stack Overflow