Trim left tail of music in audio file

Question


David Harar

December 29, 2021, 23:39

I have audio files, most of which start with the same music before a conversation begins. I want to trim off the music, which varies in length. I have no labels. I can transcribe the whole file using off-the-shelf models, but the music itself contains words, which result in false positives. I do know how to extract features from the audio, such as the Mel spectrogram, pitch, etc. The music at the beginning of the file is easy to spot by looking at the spectrogram or just at the waveform (please see the following images).

I thought about using k-NN with a high number of neighbors and then filtering the audio based on its output. Is there a more obvious way?

Thanks!

Topic k-nn audio-recognition unsupervised-learning

Category Data Science

David Harar · Accepted Answer · December 29, 2021, 23:39

Eventually, since the data consists only of phone calls, I noticed that there is a beep ("BIP") that separates the music at the beginning from the conversation. So I convolved that beep over the files to locate it, which gave better results than k-means and GMMs.
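
For readers who want to try the same idea, here is a minimal sketch, not the exact code used above. It assumes you have a short recording of the beep to use as a template, and it uses normalised cross-correlation (a sliding dot product of the template with the audio, divided by the local energy) so that loud music does not outscore the beep. The function name `find_beep_end`, the `min_score` threshold, and the synthetic test signal are all hypothetical and only there for illustration.

```python
import numpy as np


def find_beep_end(audio, beep, min_score=0.5):
    """Return the sample index just after the best match of `beep`
    in `audio`, or None if no window matches well enough.

    `min_score` is an illustrative threshold on the normalised
    correlation (which lies in [-1, 1]); tune it on real data.
    """
    audio = np.asarray(audio, dtype=float)
    beep = np.asarray(beep, dtype=float)

    # Sliding dot product of the template with every window of the audio.
    corr = np.correlate(audio, beep, mode="valid")

    # Energy of each audio window with the same length as the template.
    window_energy = np.convolve(audio ** 2, np.ones(len(beep)), mode="valid")

    # Normalise so the score reflects shape similarity, not loudness.
    norm = np.sqrt(window_energy * np.sum(beep ** 2)) + 1e-12
    score = corr / norm

    start = int(np.argmax(score))
    if score[start] < min_score:
        return None
    return start + len(beep)


# Synthetic example (assumed signal, not real call data):
# 2 s of "music", a 0.2 s 1 kHz beep, then 1 s of noise as "conversation".
sr = 8000
rng = np.random.default_rng(0)
t_music = np.arange(2 * sr) / sr
music = sum(np.sin(2 * np.pi * f * t_music) for f in (220, 330, 440))
t_beep = np.arange(int(0.2 * sr)) / sr
beep = np.sin(2 * np.pi * 1000 * t_beep)
speech = 0.3 * rng.standard_normal(sr)

audio = np.concatenate([music, beep, speech])
cut = find_beep_end(audio, beep)
conversation = audio[cut:] if cut is not None else audio
```

On long recordings the direct correlation gets slow; an FFT-based correlation such as `scipy.signal.correlate` with `method="fft"` should give the same scores faster. If the beep varies between calls (pitch, length, volume), a single template may not be enough and the threshold will need tuning.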
