mcsvi, oh yeah, I meant this work by Li Fei-Fei and Andrej Karpathy: "DenseCap: Fully Convolutional Localization Networks for Dense Captioning". Thank you for the great link!