Am I on the right track using InceptionV3 as a pre-trained feature extractor for shorthand image captioning?
I'm new to AI and trying to understand image captioning for a project I'm working on: a translator for Gregg shorthand writing. I'd feed pictures of individual Gregg words to the model and have it predict which letters are present in the image.
So far I understand that I need a feature extractor, and InceptionV3 looks like a good option, but I want to make sure I'm doing it right.
Would it matter if I used ImageNet weights for the InceptionV3 feature extractor, or is there a feature extractor specifically meant for printed and handwritten text? I'd appreciate any help, thanks!
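To make the question concrete, here's roughly what I have in mind in TensorFlow/Keras (just my guess at a standard setup, the settings and the `extract_features` helper are my own sketch, not something I've validated):

```python
import tensorflow as tf

# InceptionV3 expects 299x299 RGB inputs by default
base = tf.keras.applications.InceptionV3(
    weights="imagenet",      # pre-trained ImageNet weights
    include_top=False,       # drop the 1000-class classification head
    pooling="avg",           # global average pooling -> 2048-d feature vector
    input_shape=(299, 299, 3),
)
base.trainable = False       # freeze the extractor, train only the caption decoder


def extract_features(images):
    # images: float tensor of shape (batch, 299, 299, 3), values in [0, 255]
    x = tf.keras.applications.inception_v3.preprocess_input(images)
    return base(x, training=False)  # shape (batch, 2048)
```

The idea is that the frozen InceptionV3 turns each shorthand image into a fixed-length feature vector, and a separate decoder (e.g. an RNN) would be trained on top of that to output the letter sequence. Is that the right way to use it for this kind of data?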
Tags: inception, machine-learning
Category: Data Science