How to Train Wav2Vec2 XLSR with a Local Custom Dataset

I want to train a speech-to-text model for Danish using Wav2Vec2 XLSR (a transformer-based model). Most tutorials fine-tune on Common Voice with the help of the `datasets` library, but Common Voice has very little Danish data, so I want to train the model on my own custom data instead. I have failed to find any clear documentation for this. Can anybody walk me through it step by step?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source