This demo presents the steps involved in indexing large video databases. First, very compact visual DNAs (vDNAs) are extracted from video frames. These serve as input to various downstream analysis tasks, including video shot detection, frame- or shot-based tagging, and (customised) face feature analysis. The demo places particular emphasis on the customisability of the individual features: with only a handful of sample images depicting a certain concept, it is possible to train a custom model that recognises the concept in unseen videos. Another key feature is custom facial expression analysis, which can recognise a facial expression from as little as a single sample face image depicting it. A key element of the product is that the vDNAs only need to be extracted once and do not need to be recomputed when new (custom) concepts or faces are added. This allows the system to process as many as 30,000 vDNAs per second, which makes maintenance and model updates easy.
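The extract-once idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not describe the vDNA format or the training method, so the sketch uses toy three-dimensional float vectors, a hypothetical `toy_extract` function, and a simple nearest-prototype classifier with cosine similarity as a stand-in. It only shows that adding a new concept from one or two samples reuses the cached feature vectors rather than re-extracting them.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class FeatureStore:
    """Caches one feature vector per frame so extraction runs only once."""

    def __init__(self, extract):
        self._extract = extract
        self._cache = {}
        self.extract_calls = 0

    def get(self, frame_id, frame):
        if frame_id not in self._cache:
            self._cache[frame_id] = self._extract(frame)
            self.extract_calls += 1
        return self._cache[frame_id]


def train_concept(sample_vectors):
    """Stand-in 'training': average the sample vectors into a prototype."""
    n = len(sample_vectors)
    return [sum(column) / n for column in zip(*sample_vectors)]


def tag(vector, concepts, threshold=0.8):
    """Return the sorted names of all concepts whose prototype is close enough."""
    return sorted(
        name for name, proto in concepts.items()
        if cosine(vector, proto) >= threshold
    )


def toy_extract(frame):
    """Hypothetical extractor: the toy 'frames' already are feature vectors."""
    return list(frame)


# Toy data (assumed, not from the demo).
frames = {
    "f1": [0.9, 0.1, 0.0],
    "f2": [0.1, 0.9, 0.1],
    "f3": [0.0, 0.1, 0.95],
}
store = FeatureStore(toy_extract)

# First concept, trained from two sample vectors.
concepts = {"beach": train_concept([[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]])}
tags_before = {fid: tag(store.get(fid, f), concepts) for fid, f in frames.items()}

# Add a second concept from a single sample; cached vectors are reused.
concepts["forest"] = train_concept([[0.0, 1.0, 0.0]])
tags_after = {fid: tag(store.get(fid, f), concepts) for fid, f in frames.items()}

print(tags_before)
print(tags_after)
print("extractions:", store.extract_calls)
```

In this sketch the extractor runs three times in total, once per frame, even though the frames are tagged twice. The second tagging pass only compares stored vectors against the new prototype, which is the property the abstract attributes to vDNAs.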
Speaker:
Dominic Rüfenacht, Science Team Lead - Mobius Labs