How to merge data (CSV) files from multiple branches (Git and DVC)?

Background: In my projects I'm using Git and DVC to keep track of versions:

  • Git - only for source code
  • DVC - for datasets, model objects and outputs

I'm testing different approaches in separate branches, e.g.:

  • random_forest
  • neural_network_1
  • ...

Typically, as an output I keep the predictions in a CSV file with a standardised name (e.g. pred_test.csv). As a consequence, each branch has its own version of pred_test.csv. The structure of the file is very simple; it contains two columns:

  • ID
  • Prediction

Question: What is the best way to merge those prediction files into a single big file?

I would like to obtain a file with the following structure:

  • ID
  • Prediction_random_forest
  • Prediction_neural_network_1
  • Prediction_...

My main issue: how do I access the prediction files that live in different branches, without checking out each branch in turn?
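
To make the goal concrete, here is a minimal sketch of what I have in mind, not a definitive solution. It assumes that pred_test.csv is tracked by DVC in every branch, that the data for each branch is available in the local cache or a configured remote, and that dvc.api.read can fetch a file at a given Git revision through its rev argument. The helper names (read_from_dvc, merge_predictions, write_merged) are my own and purely illustrative.

```python
import csv
import io


def read_from_dvc(path, rev, repo=None):
    """Return the text of a DVC-tracked file as it exists at a Git revision.

    Assumption: the file's data is in the local DVC cache or a configured remote.
    """
    import dvc.api  # imported lazily so the merge logic can be used without DVC

    return dvc.api.read(path, repo=repo, rev=rev, mode="r")


def merge_predictions(branches, path="pred_test.csv", reader=read_from_dvc):
    """Join the per-branch files on ID, one Prediction_<branch> column each."""
    merged = {}  # ID -> {column name: value}; dicts keep insertion order
    for branch in branches:
        text = reader(path, branch)
        for row in csv.DictReader(io.StringIO(text)):
            merged.setdefault(row["ID"], {})[f"Prediction_{branch}"] = row["Prediction"]

    header = ["ID"] + [f"Prediction_{branch}" for branch in branches]
    # IDs missing from a branch get an empty cell (an outer join on ID).
    rows = [
        [id_] + [columns.get(name, "") for name in header[1:]]
        for id_, columns in merged.items()
    ]
    return header, rows


def write_merged(header, rows, out_path):
    """Write the merged table to a CSV file."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
```

Calling merge_predictions(["random_forest", "neural_network_1"]) from inside the repository and passing the result to write_merged would then produce the combined file. The reader parameter is there so that a different access method (for example, shelling out to git show for files tracked directly in Git) can be swapped in without touching the merge logic.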



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
