This newly collected dataset contains over 8,000 hours of video from YouTube and Flickr, annotated into 500 categories. The categories cover a wide range of popular topics, including social events (e.g., “tailgate party”), procedural events (e.g., “making cake”), objects (e.g., “panda”), and scenes (e.g., “beach”). Compared with FCVID, new categories are added to enrich the original hierarchy. For example, 76 new categories are added to “cooking”, bringing it to 93 classes in total, and 75 new classes are added to “sports”. During annotation, each video was assigned multiple labels wherever applicable. When a particular category is being labeled, categories that are unlikely to co-occur with it are first filtered out manually, and only the remaining labels are considered for annotation.
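
The filtering step in the annotation protocol can be illustrated with a minimal sketch. It assumes the manually curated "unlikely to co-occur" judgments are stored as a lookup table. The table `UNLIKELY_WITH`, the helper `candidate_labels`, and the five-category list are hypothetical and for illustration only; they are not taken from the dataset or its annotation tooling.

```python
# Hypothetical illustration: the category list and exclusion table are
# invented examples, not the dataset's actual hierarchy or annotation rules.
ALL_CATEGORIES = ["making cake", "tailgate party", "panda", "beach", "surfing"]

# Manually curated judgments: for each category, the categories considered
# unlikely to co-occur with it (assumed values).
UNLIKELY_WITH = {
    "making cake": {"panda", "surfing"},
    "panda": {"making cake", "tailgate party", "surfing"},
}


def candidate_labels(target, all_categories=ALL_CATEGORIES,
                     unlikely_with=UNLIKELY_WITH):
    """Return the additional labels still worth checking for a video of `target`."""
    excluded = unlikely_with.get(target, set())
    return [c for c in all_categories if c != target and c not in excluded]


print(candidate_labels("making cake"))  # ['tailgate party', 'beach']
```

Under these assumptions, an annotator labeling a “making cake” video would only be asked about the two remaining categories rather than the full label set, which is what makes multi-label annotation over 500 categories tractable.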