How should I process the output from this neural network?

Question


user135409

May 6, 2022 06:12

I have a neural network that takes as input an np.array holding the mel spectrogram of a 3-second audio clip from a song, and outputs a vector of prediction scores, one for each of 494 given artists.

At first, I took whole songs, split them into 3-second clips, ran each clip through the network, and averaged the outputs. But the averaged predictions turned out to be unreliable.
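
A minimal sketch of the split-and-average approach described above, to make the question concrete. It assumes the spectrogram is shaped (n_mels, n_frames) and that the network is wrapped in a callable that returns one score vector per clip; `predict_fn`, `frames_per_clip`, and the function names are hypothetical placeholders, not the actual code.

```python
import numpy as np

def split_into_clips(spectrogram, frames_per_clip):
    """Split a (n_mels, n_frames) spectrogram into non-overlapping clips.

    Trailing frames that do not fill a whole clip are dropped.
    """
    n_clips = spectrogram.shape[1] // frames_per_clip
    return [
        spectrogram[:, i * frames_per_clip:(i + 1) * frames_per_clip]
        for i in range(n_clips)
    ]

def average_clip_predictions(spectrogram, predict_fn, frames_per_clip):
    """Run predict_fn (hypothetical model wrapper) on every clip and average the outputs."""
    clips = split_into_clips(spectrogram, frames_per_clip)
    if not clips:
        raise ValueError("song is shorter than one clip")
    outputs = np.stack([predict_fn(clip) for clip in clips])  # (n_clips, n_artists)
    return outputs.mean(axis=0)  # one averaged score per artist
```

Here `frames_per_clip` would be whatever number of spectrogram frames corresponds to 3 seconds for the hop length in use.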

I was advised that a single 3-second clip should be enough, but the person who suggested it had not worked with audio before. If I go that route, which 3-second clip should I use? For many songs, the first or last 3 seconds are silent or do not sound like the rest of the song at all, which can throw off artist classification.
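
To illustrate the silence problem, here is a rough sketch of one possible way to skip near-silent clips before choosing or averaging. It assumes a power-scale (non-negative) mel spectrogram, so it would not apply as written to dB-scaled input; the function name and the `rel_threshold` value are hypothetical and untuned.

```python
import numpy as np

def non_silent_clip_indices(clips, rel_threshold=0.1):
    """Return indices of clips whose mean energy is at least
    rel_threshold times the mean energy of the loudest clip.

    Assumes each clip is a non-negative (power-scale) mel spectrogram.
    """
    energies = np.array([np.mean(clip) for clip in clips])
    cutoff = rel_threshold * energies.max()
    return [i for i, energy in enumerate(energies) if energy >= cutoff]
```

If every clip has zero energy, the cutoff is zero and all indices are returned, so the caller would still need to handle a fully silent file.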

What do you all advise?

Topic audio-recognition neural-network

Category Data Science
