[D] Need help building a speech dataset
I have a long voice recording with a lot of silence, plus a transcript with the text, speaker IDs (there is more than one speaker), and timings.
Is there a pre-built library to split this file into pairs of audio phrases and their texts?
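To make the goal concrete, here is a minimal sketch of the split I have in mind, using only Python's standard `wave` module. It assumes the recording is an uncompressed WAV file and that the transcript has already been parsed into start/end times in seconds; the `Segment` class and `split_recording` function are hypothetical names for illustration, not part of any existing library.

```python
import wave
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Segment:
    # Hypothetical container for one transcript entry.
    start: float   # phrase start, in seconds
    end: float     # phrase end, in seconds
    speaker: str   # speaker ID from the transcript
    text: str      # transcribed text of the phrase


def split_recording(wav_path, segments, out_dir):
    """Cut one clip per segment; return a list of (clip_path, speaker, text)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    pairs = []
    with wave.open(str(wav_path), "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        for i, seg in enumerate(segments):
            # Convert times to frame indices, clamped to the file length.
            first = max(0, int(seg.start * rate))
            last = min(total, int(seg.end * rate))
            if last <= first:
                continue  # skip empty or out-of-range segments
            src.setpos(first)
            frames = src.readframes(last - first)
            clip_path = out_dir / f"{i:05d}_{seg.speaker}.wav"
            with wave.open(str(clip_path), "wb") as dst:
                # Reuse the source format; the frame count is fixed up on close.
                dst.setparams(params)
                dst.writeframes(frames)
            pairs.append((clip_path, seg.speaker, seg.text))
    return pairs
```

Since each clip is cut by the transcript timings, the silence between phrases is dropped without any separate silence detection. For compressed formats such as MP3, something like pydub's `AudioSegment` slicing would probably be needed instead of `wave`.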
submitted by /u/hadaev