Audio Classification with Counter

I'm trying to create a model that can identify one particular sound, and every time it hears that sound, it increases a counter by 1. So for example, if it hears a specific bird chirping ten times, the counter should display the number 10.

I'm looking for a bit of guidance here as to how to go about this. I know that I will need to use audio classification and for my data, I only have .wav files of that one particular sound since I recorded them with my iPhone. Currently, I'm just preprocessing the data and doing some EDA with librosa in Python.

Here are my questions:

1. Is there a way to train a model to only identify one particular sound based on the data I have? (and if it hears anything different, it's classified as 0/no)

2. How would I go about combining the audio classification model (given that it works) with a counter that can capture sound in real time, process it, and display a number if it's that particular sound?

I don't really know where to start when I think about these questions. Any help regarding what libraries I may need, terms I should research or know regarding sound/audio data or just your own personal opinion on how you would tackle this problem would be much appreciated.

Topic audio-recognition deep-learning neural-network classification python

Category Data Science


The kind of sound you are describing, one that has a well-defined duration and can be counted, is called a sound event. The task of detecting such events is called Sound Event Detection (SED). It is sometimes also called Audio Event Detection or Acoustic Event Detection (AED).

There are some resources for learning about it online, for example:

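The key to doing this in real time is to process the audio as many short, fixed-length analysis windows that overlap. Say that your model takes 1 second of audio as input: every 0.1 seconds (for example) you would give it the most recent 1 second of audio, make a classification, and process the result. Because consecutive windows overlap, a single event will usually be detected in several windows in a row, so the counter should go up only when the prediction changes from negative to positive, not once per positive window.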
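As a rough illustration of the windowing idea, here is a minimal sketch in Python using only NumPy. It assumes the audio is already available as a mono array of samples. The names sliding_windows, placeholder_classifier and count_events are made up for this example, and placeholder_classifier is only a loudness threshold standing in for a trained model. The 1 second window and 0.1 second hop are the example values from above, not recommended settings.

```python
import numpy as np


def sliding_windows(audio, sample_rate, window_s=1.0, hop_s=0.1):
    """Yield (start_time_in_seconds, window) for overlapping fixed-length windows."""
    window = int(round(window_s * sample_rate))
    hop = int(round(hop_s * sample_rate))
    for start in range(0, len(audio) - window + 1, hop):
        yield start / sample_rate, audio[start:start + window]


def placeholder_classifier(window):
    """Hypothetical stand-in for a trained model: scores loud windows as 1.0."""
    rms = np.sqrt(np.mean(window ** 2))
    return 1.0 if rms > 0.1 else 0.0


def count_events(scores, threshold=0.5):
    """Count runs of consecutive windows scoring at or above the threshold."""
    count = 0
    active = False
    for score in scores:
        detected = score >= threshold
        if detected and not active:
            count += 1  # a new event starts at this window
        active = detected
    return count


# Synthetic test signal: 6 s of silence containing two 0.5 s tone bursts.
sample_rate = 8000
t = np.arange(6 * sample_rate) / sample_rate
audio = np.zeros_like(t)
for burst_start in (1.0, 4.0):
    mask = (t >= burst_start) & (t < burst_start + 0.5)
    audio[mask] = 0.5 * np.sin(2 * np.pi * 440 * t[mask])

scores = [placeholder_classifier(w) for _, w in sliding_windows(audio, sample_rate)]
print(count_events(scores))  # prints 2
```

In a real setup, placeholder_classifier would be replaced by the trained model's prediction on features computed from each window, and the windows would come from a microphone buffer rather than a prerecorded array. Two events that fall within the same run of positive windows are counted as one here, so the window length and hop need to be chosen with the expected spacing between sounds in mind.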
