Dear list,

[Sorry for cross-posting]

We have created and made publicly available a dense, person-oriented, audio-visual ground-truth annotation of a feature-length movie (100 minutes long): “Hannah and Her Sisters” by Woody Allen.

The annotation includes:
• Face tracks in video (densely annotated, i.e., in every frame, and person-labeled)
• Speech segments in audio (person-labeled)
• Shot boundaries in video

The annotation can be useful for evaluating:
• Person-oriented video-based tasks (e.g., face tracking, automatic character naming)
• Person-oriented audio-based tasks (e.g., speaker diarization or recognition)
• Person-oriented multimodal tasks (e.g., audio-visual character naming)

Details on the Hannah dataset and access to it are available here:
https://research.technicolor.com/rennes/hannah-home/
https://research.technicolor.com/rennes/hannah-download/

Acknowledgments: This work is supported by the AXES EU project: http://www.axes-project.eu/

Best regards,
Alexey Ozerov, Jean-Ronan Vigouroux, Louis Chevallier and Patrick Pérez