Optimal design patterns for reading and restructuring datasets with different formats?
Problems:
1. (I want to solve this) Homogenize image datasets that come in different formats: HDF5, image folders (with different structures), etc.
2. (Just to give you context) The datasets are then concatenated, preprocessed by client code, and stored in an HDF5 file with a fixed structure.
My solution to 1:
Use the template pattern as the following pseudo-UML shows:
Noticed drawbacks of this solution to 1:
- The client code has to change every time a new dataset comes into play, because it doesn't know which ConcreteStructurizer to use for a given dataset. The client does something like this:
if dataset_0 use ConcreteStructurizerFolder
ConcreteStructurizerFolder(cfg_dataset_0).reorganize()
.
.
.
if dataset_n use ConcreteStructurizerHDF5
ConcreteStructurizerHDF5(cfg_dataset_n).reorganize()
Could you propose a better/optimal approach/design pattern?
PS: I am learning software design (physics background), so I'd be grateful for a pedagogical, well-explained answer. Thanks.
Solution 1:[1]
You can apply the chain-of-responsibility pattern to your design.
You already have several concrete implementations of Structurizer, which is a good start. Your design defines what a Structurizer does with a certain type of dataset. Let's make it more powerful by adding logic that defines which types of dataset each Structurizer can handle, and then connect all the Structurizers into a chain.
We pass a dataset to the chain. If the first Structurizer can handle it, it processes it. If it cannot, it passes the dataset to the next link in the chain, and so on.
My example introduces a new interface, DatasetHandler, which adds the behavior to
- set its next piece: setNextHandler(DatasetHandler)
- define the types of dataset that it can handle: handle(Object dataset). The boolean return value indicates whether the dataset was handled successfully.
I am not good at PHP, but I think design patterns apply to any OO language. The example below is in Java.
Interfaces
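// Structurizer.java
public interface Structurizer {
    public void reorganize();
    public void createDest();
    public void makeMetaDataJson();
    public void moveToFiles();
}

// DatasetHandler.java
public interface DatasetHandler {
    // Appends a handler to the end of the chain.
    public void setNextHandler(DatasetHandler handler);

    // Returns true if some handler in the chain processed the dataset.
    public boolean handle(Object dataset);
}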
Structurizer implementation
// AbstractStructurizer.java
public abstract class AbstractStructurizer implements DatasetHandler, Structurizer {
    private String datasetPath;
    private String destPath;
    private boolean overwrite;
    private String metadataFn;
    private boolean possiblyOtherAttributes;
    private DatasetHandler nextHandler;

    // Appends the handler at the end of the chain.
    @Override
    public void setNextHandler(DatasetHandler handler) {
        if (getNextHandler() == null) {
            this.nextHandler = handler;
        } else {
            this.nextHandler.setNextHandler(handler);
        }
    }

    // Delegates to the next handler, or reports failure at the end of the chain.
    public boolean tryNextHandler(Object dataset) {
        if (getNextHandler() == null) {
            return false;
        } else {
            return getNextHandler().handle(dataset);
        }
    }

    public String getDatasetPath() {
        return datasetPath;
    }

    public void setDatasetPath(String datasetPath) {
        this.datasetPath = datasetPath;
    }

    public String getDestPath() {
        return destPath;
    }

    public void setDestPath(String destPath) {
        this.destPath = destPath;
    }

    public boolean isOverwrite() {
        return overwrite;
    }

    public void setOverwrite(boolean overwrite) {
        this.overwrite = overwrite;
    }

    public String getMetadataFn() {
        return metadataFn;
    }

    public void setMetadataFn(String metadataFn) {
        this.metadataFn = metadataFn;
    }

    public boolean isPossiblyOtherAttributes() {
        return possiblyOtherAttributes;
    }

    public void setPossiblyOtherAttributes(boolean possiblyOtherAttributes) {
        this.possiblyOtherAttributes = possiblyOtherAttributes;
    }

    public DatasetHandler getNextHandler() {
        return nextHandler;
    }
}
// ConcreteStructurizerFolder.java
import java.io.File;

public class ConcreteStructurizerFolder extends AbstractStructurizer {
    @Override
    public void reorganize() {
        System.out.println("reorganizing folder dataset...");
    }

    @Override
    public void createDest() {
        System.out.println("creating folder destination...");
    }

    @Override
    public void makeMetaDataJson() {
        System.out.println("making folder metadata json...");
    }

    @Override
    public void moveToFiles() {
        System.out.println("moving folders...");
    }

    // Handles the dataset only if it is a directory; otherwise passes it on.
    @Override
    public boolean handle(Object dataset) {
        if (dataset instanceof File) {
            File fileData = (File) dataset;
            if (fileData.isDirectory()) {
                reorganize();
                createDest();
                makeMetaDataJson();
                moveToFiles();
                return true;
            } else {
                return tryNextHandler(dataset);
            }
        } else {
            return tryNextHandler(dataset);
        }
    }
}
// ConcreteStructurizerHDF5.java
import java.io.File;

public class ConcreteStructurizerHDF5 extends AbstractStructurizer {
    // Handles the dataset only if it is a file with the .hdf5 extension
    // (case-insensitive); otherwise passes it on.
    @Override
    public boolean handle(Object dataset) {
        if (dataset instanceof File) {
            File datafile = (File) dataset;
            if (datafile.getName().toLowerCase().endsWith(".hdf5")) {
                reorganize();
                createDest();
                makeMetaDataJson();
                moveToFiles();
                return true;
            } else {
                return tryNextHandler(dataset);
            }
        } else {
            return tryNextHandler(dataset);
        }
    }

    @Override
    public void reorganize() {
        System.out.println("reorganizing HDF5 dataset...");
    }

    @Override
    public void createDest() {
        System.out.println("creating HDF5 destination...");
    }

    @Override
    public void makeMetaDataJson() {
        System.out.println("making HDF5 metadata json...");
    }

    @Override
    public void moveToFiles() {
        System.out.println("moving HDF5 files...");
    }
}
// ConcreteStructurizerUnknown.java
// Catch-all placed at the end of the chain: it reports the dataset and never handles it.
public class ConcreteStructurizerUnknown extends AbstractStructurizer {
    @Override
    public boolean handle(Object dataset) {
        System.out.println(String.format("unknown dataset :%s", dataset.getClass()));
        return false;
    }

    @Override
    public void reorganize() {
    }

    @Override
    public void createDest() {
    }

    @Override
    public void makeMetaDataJson() {
    }

    @Override
    public void moveToFiles() {
    }
}
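Both concrete handle methods above repeat the same four calls. Since the question already uses the template pattern, one possible refinement is to move that sequence into the abstract class and leave only the "can I handle this?" decision to the subclasses. The following is a minimal, self-contained sketch of that idea, not part of the original answer: the canHandle hook and the class names TemplateStructurizer, FolderStructurizer, Hdf5Structurizer and TemplateChainDemo are hypothetical.

```java
import java.io.File;

// Sketch only: canHandle and the class names below are illustrative assumptions.
interface DatasetHandler {
    void setNextHandler(DatasetHandler handler);
    boolean handle(Object dataset);
}

abstract class TemplateStructurizer implements DatasetHandler {
    private DatasetHandler nextHandler;

    @Override
    public void setNextHandler(DatasetHandler handler) {
        if (nextHandler == null) {
            nextHandler = handler;
        } else {
            nextHandler.setNextHandler(handler); // append at the end of the chain
        }
    }

    // Template method: the shared sequence is written once, and final keeps
    // subclasses from changing it.
    @Override
    public final boolean handle(Object dataset) {
        if (!canHandle(dataset)) {
            // Not our format: pass it on, or report failure at the end of the chain.
            return nextHandler != null && nextHandler.handle(dataset);
        }
        reorganize();
        createDest();
        makeMetaDataJson();
        moveToFiles();
        return true;
    }

    // Hypothetical hook: does this handler accept the dataset?
    protected abstract boolean canHandle(Object dataset);

    protected abstract void reorganize();
    protected abstract void createDest();
    protected abstract void makeMetaDataJson();
    protected abstract void moveToFiles();
}

class FolderStructurizer extends TemplateStructurizer {
    @Override
    protected boolean canHandle(Object dataset) {
        return dataset instanceof File && ((File) dataset).isDirectory();
    }

    @Override protected void reorganize() { System.out.println("reorganizing folder dataset..."); }
    @Override protected void createDest() { System.out.println("creating folder destination..."); }
    @Override protected void makeMetaDataJson() { System.out.println("making folder metadata json..."); }
    @Override protected void moveToFiles() { System.out.println("moving folders..."); }
}

class Hdf5Structurizer extends TemplateStructurizer {
    @Override
    protected boolean canHandle(Object dataset) {
        return dataset instanceof File
                && ((File) dataset).getName().toLowerCase().endsWith(".hdf5");
    }

    @Override protected void reorganize() { System.out.println("reorganizing HDF5 dataset..."); }
    @Override protected void createDest() { System.out.println("creating HDF5 destination..."); }
    @Override protected void makeMetaDataJson() { System.out.println("making HDF5 metadata json..."); }
    @Override protected void moveToFiles() { System.out.println("moving HDF5 files..."); }
}

class TemplateChainDemo {
    public static void main(String[] args) {
        DatasetHandler chain = new FolderStructurizer();
        chain.setNextHandler(new Hdf5Structurizer());
        System.out.println("handled: " + chain.handle(new File("dataset.HDF5")));
        System.out.println("handled: " + chain.handle(new File("Untitled.txt")));
    }
}
```

With this variant a subclass cannot forget a step or run the steps in a different order, at the cost of every dataset type having to fit the same four-step sequence.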
Client
// Client.java
import java.io.File;

public class Client {
    public static void main(String[] args) {
        // prepare the handler chain: Folder -> HDF5 -> Unknown
        DatasetHandler handlerChain = new ConcreteStructurizerFolder();
        handlerChain.setNextHandler(new ConcreteStructurizerHDF5());
        handlerChain.setNextHandler(new ConcreteStructurizerUnknown());

        // let the handler chain handle different types of dataset
        System.out.println("==== test HDF5 dataset ====");
        handlerChain.handle(new File("dataset.HDF5"));

        System.out.println("==== test txt dataset ====");
        handlerChain.handle(new File("Untitled.txt"));

        System.out.println("==== test folder dataset ====");
        // Windows drive root; on other systems use any existing directory
        handlerChain.handle(new File("C:\\"));

        System.out.println("==== test unknown type dataset ====");
        handlerChain.handle("this is an unknown type");
    }
}
Output
==== test HDF5 dataset ====
reorganizing HDF5 dataset...
creating HDF5 destination...
making HDF5 metadata json...
moving HDF5 files...
==== test txt dataset ====
unknown dataset :class java.io.File
==== test folder dataset ====
reorganizing folder dataset...
creating folder destination...
making folder metadata json...
moving folders...
==== test unknown type dataset ====
unknown dataset :class java.lang.String
Basically, what we are doing here is breaking down the n-way if chain: each if condition is encapsulated inside its own DatasetHandler. If a new dataset type is added in the future, you implement a new Structurizer for that type and add it to the chain. The client becomes much more manageable without the long if chain.
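To make the "add it to the chain" step concrete, here is a minimal, self-contained sketch of what supporting one more dataset type could look like. Everything in it is an assumption for illustration: the ZIP format, the class names ChainLink, Hdf5Link, ZipLink and ExtendChainDemo, and the helpers hasExtension and buildChain do not appear in the original answer, and each handler is reduced to a single print statement.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

interface DatasetHandler {
    void setNextHandler(DatasetHandler handler);
    boolean handle(Object dataset);
}

// Minimal chain link (hypothetical); mirrors the linking logic of AbstractStructurizer.
abstract class ChainLink implements DatasetHandler {
    private DatasetHandler next;

    @Override
    public void setNextHandler(DatasetHandler handler) {
        if (next == null) {
            next = handler;
        } else {
            next.setNextHandler(handler); // append at the end of the chain
        }
    }

    protected boolean tryNextHandler(Object dataset) {
        return next != null && next.handle(dataset);
    }

    // Hypothetical helper: true if dataset is a File whose name ends with suffix.
    protected static boolean hasExtension(Object dataset, String suffix) {
        return dataset instanceof File
                && ((File) dataset).getName().toLowerCase().endsWith(suffix);
    }
}

class Hdf5Link extends ChainLink {
    @Override
    public boolean handle(Object dataset) {
        if (hasExtension(dataset, ".hdf5")) {
            System.out.println("reorganizing HDF5 dataset...");
            return true;
        }
        return tryNextHandler(dataset);
    }
}

// The new dataset type: one new class, and no existing handler is modified.
class ZipLink extends ChainLink {
    @Override
    public boolean handle(Object dataset) {
        if (hasExtension(dataset, ".zip")) {
            System.out.println("reorganizing ZIP dataset...");
            return true;
        }
        return tryNextHandler(dataset);
    }
}

class ExtendChainDemo {
    // Links the handlers in list order and returns the head of the chain.
    static DatasetHandler buildChain(List<DatasetHandler> handlers) {
        DatasetHandler head = handlers.get(0);
        for (DatasetHandler handler : handlers.subList(1, handlers.size())) {
            head.setNextHandler(handler);
        }
        return head;
    }

    public static void main(String[] args) {
        // Registering the new type is a one-element change to this list.
        DatasetHandler chain = buildChain(
                Arrays.<DatasetHandler>asList(new Hdf5Link(), new ZipLink()));
        System.out.println("handled: " + chain.handle(new File("images.zip")));
        System.out.println("handled: " + chain.handle(new File("Untitled.txt")));
    }
}
```

Note that the client is not completely untouched: the place where the chain is assembled still has to mention the new handler, but that is a single registration line instead of a new if branch with its own processing logic.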
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |

