Stacked Attention Network for Visual Question Answering

Mentor: Dr. Vinay P. Namboodiri, IITK

Duration: Aug’17 - Nov’17

Complete Project: Stacked Attention Network

Modified Stacked Attention Networks (SANs) for Visual Question Answering (VQA) using visual grounding of phrases. A SAN uses the semantic representation of a question as a query to search for the image regions related to the answer. Implemented a multi-layer SAN that queries the image multiple times to infer the answer progressively.
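
The multi-layer querying idea can be illustrated with a minimal NumPy sketch. This is not the project's code. It assumes the commonly described SAN formulation: each layer scores image regions against the current query, takes a softmax-weighted sum of region features, and adds it to the query to form the next query. The dimensions, random weights, and helper names (`attention_layer`, `stacked_attention`, `init_layer`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_layer(regions, query, params):
    """One attention hop (illustrative sketch).

    regions: (m, d) array of image region features
    query:   (d,) current query vector
    params:  tuple of assumed weights (W_i, W_q, b_q, w_p, b_p)
    """
    W_i, W_q, b_q, w_p, b_p = params
    # Combine each region with the query in a shared hidden space.
    h = np.tanh(regions @ W_i + (query @ W_q + b_q))   # (m, k)
    # Attention distribution over the m regions.
    p = softmax(h @ w_p + b_p)                         # (m,)
    # Attention-weighted sum of region features.
    attended = p @ regions                             # (d,)
    # Refined query: attended visual vector plus previous query.
    return attended + query, p

def stacked_attention(regions, question, layers):
    # Query the image once per layer, refining the query each time.
    query = question
    weights = []
    for params in layers:
        query, p = attention_layer(regions, query, params)
        weights.append(p)
    return query, weights

def init_layer(rng, d, k):
    # Small random weights; a trained model would learn these.
    return (rng.normal(scale=0.1, size=(d, k)),
            rng.normal(scale=0.1, size=(d, k)),
            np.zeros(k),
            rng.normal(scale=0.1, size=k),
            0.0)

rng = np.random.default_rng(0)
m, d, k = 196, 64, 32                      # assumed: 14x14 regions, toy feature sizes
regions = rng.normal(size=(m, d))          # stand-in for CNN region features
question = rng.normal(size=d)              # stand-in for the question embedding
layers = [init_layer(rng, d, k) for _ in range(2)]
final_query, weights = stacked_attention(regions, question, layers)
```

In a full model, `final_query` would feed a classifier over candidate answers, and each entry of `weights` could be reshaped to the region grid to visualize where that layer attends.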

machine-learning