ActivityNet is a new large-scale video benchmark for human activity understanding. It aims to cover a wide range of complex human activities that are of interest to people in their daily lives. In its current version, ActivityNet provides samples from 203 activity classes, with an average of 137 untrimmed videos per class and 1.41 activity instances per video, for a total of 849 video hours.
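To give a rough sense of scale, the averages above can be combined into approximate totals. The sketch below is only a back-of-the-envelope estimate: it assumes each video is counted under exactly one class, and the function and variable names are illustrative, not part of any ActivityNet tooling. The exact counts in the released dataset may differ from these derived figures.

```python
# Figures quoted in the description above.
NUM_CLASSES = 203
AVG_VIDEOS_PER_CLASS = 137
AVG_INSTANCES_PER_VIDEO = 1.41
TOTAL_VIDEO_HOURS = 849


def estimate_dataset_size(num_classes, videos_per_class,
                          instances_per_video, total_hours):
    """Derive approximate totals from the quoted per-class averages.

    Assumption (not stated in the source): every video belongs to
    exactly one class, so classes * videos_per_class approximates
    the total number of videos.
    """
    total_videos = num_classes * videos_per_class
    total_instances = total_videos * instances_per_video
    avg_minutes_per_video = total_hours * 60 / total_videos
    return {
        "total_videos": total_videos,
        "total_instances": total_instances,
        "avg_minutes_per_video": avg_minutes_per_video,
    }


estimate = estimate_dataset_size(
    NUM_CLASSES,
    AVG_VIDEOS_PER_CLASS,
    AVG_INSTANCES_PER_VIDEO,
    TOTAL_VIDEO_HOURS,
)

for name, value in estimate.items():
    print(f"{name}: {value:,.2f}")
```

Under that assumption, the estimate comes to roughly 28 thousand videos and 39 thousand activity instances, with an average video length of a little under two minutes.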
We rely heavily on crowdsourcing, specifically Amazon Mechanical Turk, to help acquire and annotate ActivityNet. Our acquisition pipeline has three main steps: Collection, Filtering, and Temporal Localization.
See our ICMR 2014 paper for detailed information. Fabian Caba Heilbron and Juan Carlos Niebles. Collecting and Annotating Human Activities in Web Videos. In ICMR, 2014.