Dense captioning task both localizes and describes salient(most outstanding and important) regions in images in natural language. The dense captioning task generalizes object detection when the descriptions consist of a single word, and Image Captioning when one predicted region covers the full image.
Above is an example of dense captioning where each bounding box will correspond to a description (not just a class in classical object detection).
Không có nhận xét nào:
Đăng nhận xét