Are there voice or audio weights for VGG or Inception?
- I want to use VGG16 (or VGG19) for a voice clustering task.
- I read some articles which suggest using VGG (16 or 19) to build the embedding vector for the clustering algorithm.
- The process is to convert the wav file into MFCC features or a plot (amplitude vs. time) and use this as input to the VGG model.
- I tried it out with VGG19 (and weights='imagenet').
- I got bad results, and I assume this is because I'm using VGG with the wrong weights (weights trained on images (imagenet)).
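For reference, the pipeline described above can be sketched roughly as follows. This is only a minimal sketch, not the exact code that was run. It assumes librosa for loading the wav file and computing the MFCC, and the Keras VGG19 application for the embedding. The function names `features_to_vgg_input` and `vgg19_embedding`, the 16 kHz sample rate, the 40 coefficients, and the nearest-neighbour resize are all illustrative assumptions.

```python
import numpy as np


def features_to_vgg_input(features, size=224):
    """Turn a 2-D feature matrix (e.g. MFCC, shape [n_coeffs, n_frames])
    into a (size, size, 3) float32 array in the 0-255 range."""
    features = np.asarray(features, dtype=np.float32)
    lo, hi = features.min(), features.max()
    if hi > lo:
        # Min-max scale to 0..255, the range an 8-bit image would have.
        scaled = (features - lo) / (hi - lo) * 255.0
    else:
        # A constant matrix carries no contrast; map it to all zeros.
        scaled = np.zeros_like(features)
    # Nearest-neighbour resize to size x size (crude; an assumption of this sketch).
    rows = np.arange(size) * features.shape[0] // size
    cols = np.arange(size) * features.shape[1] // size
    resized = scaled[rows][:, cols]
    # VGG expects three channels, so the single channel is repeated.
    return np.repeat(resized[:, :, np.newaxis], 3, axis=2)


def vgg19_embedding(wav_path, n_mfcc=40):
    """Embed one wav file with ImageNet-trained VGG19.

    Heavy imports are kept local so the helper above works with NumPy alone.
    """
    import librosa
    from tensorflow.keras.applications.vgg19 import VGG19, preprocess_input

    signal, sr = librosa.load(wav_path, sr=16000)  # assumed sample rate
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    batch = preprocess_input(features_to_vgg_input(mfcc)[np.newaxis])
    # include_top=False with pooling="avg" yields one 512-dim vector per input.
    model = VGG19(weights="imagenet", include_top=False, pooling="avg")
    return model.predict(batch)[0]
```

Calling `vgg19_embedding("some_file.wav")` (a placeholder path) would return one vector per file, and those vectors are what get passed to the clustering algorithm. The model is rebuilt on every call here for brevity; with many files it would be loaded once and reused.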
So:
- Are there any audio/voice pre-trained weights for VGG?
- If not, are there other pre-trained audio/voice models?
Topic vgg16 transfer-learning inception feature-engineering deep-learning
Category Data Science