Tags

image captioning
image retrieval
multimodal dataset
real-world events
Contextual Captioning
Event Extraction
Real-World Semantics
VisChronos framework