Hello, I’m trying to learn CNNs and I’ve hit a dead end with an image captioning project I was working on for fun.
Dataset: 10k images from Google Conceptual Captions
Tutorial I’m mostly following: Automatic Image Captioning
One difference between my dataset and the Flickr8k dataset used in the tutorial is that my dataset has only one caption per image, whereas Flickr8k has five captions per image.
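To make that difference concrete, here is a minimal sketch (not the tutorial's code) of how (image, caption) training pairs are usually built for an encoder-decoder captioner. With five reference captions per image you get five pairs per image, so 10k Conceptual Captions images yield roughly a fifth of the training pairs that 10k Flickr8k-style images would. The dict layout and token names below are assumptions for illustration only.

```python
def build_training_pairs(captions_by_image):
    """captions_by_image: dict mapping image_id -> list of caption strings."""
    pairs = []
    for image_id, captions in captions_by_image.items():
        for caption in captions:
            # Wrap every caption with start/end tokens, as most tutorials do.
            pairs.append((image_id, "<start> " + caption + " <end>"))
    return pairs

# Flickr8k-style: five captions per image -> five training pairs per image.
flickr_style = {"img_001": ["a dog runs", "a brown dog running", "dog on grass",
                            "a dog plays outside", "dog sprinting on a lawn"]}
# Conceptual-Captions-style: one caption per image -> one training pair.
cc_style = {"img_001": ["a dog runs across a lawn"]}

print(len(build_training_pairs(flickr_style)))  # 5
print(len(build_training_pairs(cc_style)))      # 1
```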
The problem is that I am getting the same caption for nearly all images. I have tried:

- LSTM instead of GRU cells
- 50- and 200-dimensional GloVe word embeddings; I even tried training my own embeddings on all the captions in the dataset
- beam search and greedy search for generating the prediction
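One quick sanity check (my own sketch, not from the tutorial) for this symptom is to measure how many distinct captions the model actually produces on a sample of validation images. Here `generate_caption(model, image)` stands in for whatever greedy or beam decode function you already have; the name and signature are assumptions.

```python
from collections import Counter

def caption_diversity(model, images, generate_caption, sample_size=200):
    """Return the most common generated captions and the distinct-caption ratio."""
    sample = images[:sample_size]
    captions = [generate_caption(model, img) for img in sample]
    counts = Counter(captions)
    distinct_ratio = len(counts) / max(len(captions), 1)
    return counts.most_common(5), distinct_ratio
```

If the distinct-caption ratio is close to zero (one or two captions dominate the sample), the decoder has collapsed onto the most frequent training phrases, which usually points at the data or training setup rather than the GRU-vs-LSTM or embedding choices.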
What do I do?