PhD in Informatics Seminar #7 2020/2021 | DI Ciências ULisboa

Title: Dynamic Ensemble of Content-based Image Retrieval Systems using Online Learning with Expert Advice

Speaker: Soraia M. Alarcão, LASIGE – DI/FCUL

When: May 20 (Thursday) at 12:00


Content-Based Image Retrieval (CBIR) systems are useful to store and efficiently retrieve similar images from large collections. The main challenge in CBIR systems is to choose features that are sufficiently discriminative to represent the images under comparison while keeping them compact to ensure that the system is timely and computationally efficient. Since human perception of image similarity is subjective, semantic and task-dependent, unveiling the perfect combination of discriminative features is highly domain-specific and dependent on the type of image.
The most common approach is early fusion of multiple descriptors, where all descriptors are assumed to have the same importance, even though they may not yield the same results for different types of images. An alternative is to weight the descriptors during early fusion, using genetic algorithms or weighted functions. With both approaches, designing a new CBIR system for a new dataset or domain involves a huge experimentation overhead, leading to multiple fine-tuned CBIR systems.
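To make the two fusion variants concrete, here is a minimal sketch, assuming that each descriptor is a plain list of floats and that images are compared by Euclidean distance on the fused vector. The function names, the toy descriptor values, and the weights are all hypothetical and chosen only for illustration; they are not taken from the work presented.

```python
from math import sqrt


def early_fuse(descriptors, weights=None):
    """Concatenate per-descriptor feature vectors into a single vector.

    With weights=None every descriptor has the same importance
    (plain early fusion); otherwise each descriptor is scaled by its weight.
    """
    if weights is None:
        weights = [1.0] * len(descriptors)
    if len(weights) != len(descriptors):
        raise ValueError("one weight per descriptor is required")
    fused = []
    for w, vec in zip(weights, descriptors):
        fused.extend(w * x for x in vec)
    return fused


def euclidean(a, b):
    """Euclidean distance between two equally sized vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy data: two descriptors per image
# (say, a 2-bin colour histogram and a 3-bin shape code).
query = [[0.2, 0.8], [1.0, 0.0, 0.0]]
candidate = [[0.2, 0.8], [0.0, 1.0, 0.0]]

# Equal importance for both descriptors.
equal_dist = euclidean(early_fuse(query), early_fuse(candidate))

# Assumed weights that halve the influence of the second descriptor.
shape_weights = [1.0, 0.5]
weighted_dist = euclidean(
    early_fuse(query, shape_weights), early_fuse(candidate, shape_weights)
)
```

In this toy case the two images differ only in the second descriptor, so lowering its weight shrinks the distance between them. Finding weights that work for a given dataset is the experimentation overhead described above.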
To overcome this experimentation effort, we propose metaCBIR, a novel application of online learning with expert advice that creates an ensemble of CBIR systems and dynamically converges to the best combination of systems by taking advantage of user feedback. The resulting ensemble is less dataset- and domain-dependent, while still benefiting from the experiments already performed to create the individual CBIR systems. Our solution is designed to be model-agnostic, modular, and scalable. Each CBIR system in the ensemble returns its set of most similar images, and these sets are then combined dynamically, using weights, to produce the final set of images similar to the given query image. The weight of each CBIR system is updated based on the quality of its results, as assessed by one or more human evaluators.
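The weighting scheme described above can be sketched as follows. The abstract does not state which expert-advice algorithm metaCBIR uses, so this sketch assumes the classic Hedge (exponential weights) update, weighted voting to merge the result lists, and a loss equal to the fraction of returned images that the evaluator marks as not relevant. The class and function names, the learning rate `eta`, and the toy rankings are hypothetical.

```python
from collections import defaultdict
from math import exp


class HedgeEnsemble:
    """Ensemble of retrieval systems ("experts") with exponential weights."""

    def __init__(self, n_experts, eta=0.5):
        self.eta = eta  # assumed learning rate
        self.weights = [1.0 / n_experts] * n_experts

    def combine(self, rankings, k):
        """Merge expert result lists: each image scores the summed weight
        of the experts that returned it; ties are broken by image id."""
        scores = defaultdict(float)
        for w, ranking in zip(self.weights, rankings):
            for image_id in ranking:
                scores[image_id] += w
        ordered = sorted(scores, key=lambda i: (-scores[i], str(i)))
        return ordered[:k]

    def update(self, losses):
        """Hedge update: scale each weight by exp(-eta * loss), renormalise."""
        scaled = [w * exp(-self.eta * l) for w, l in zip(self.weights, losses)]
        total = sum(scaled)
        self.weights = [s / total for s in scaled]


def loss_from_feedback(ranking, relevant):
    """Loss in [0, 1]: share of returned images judged not relevant."""
    if not ranking:
        return 1.0
    hits = sum(1 for image_id in ranking if image_id in relevant)
    return 1.0 - hits / len(ranking)


# Hypothetical run: three systems answer the same kind of query ten times,
# and a human evaluator marks images "a", "b" and "c" as relevant.
ensemble = HedgeEnsemble(n_experts=3)
rankings = [["a", "b", "c"], ["a", "d", "e"], ["x", "y", "z"]]
relevant = {"a", "b", "c"}
for _ in range(10):
    final = ensemble.combine(rankings, k=3)
    ensemble.update([loss_from_feedback(r, relevant) for r in rankings])
```

After these rounds most of the weight sits on the first system, whose results the evaluator always accepted, so the combined list follows its output. The individual systems are treated as black boxes that only need to return a list of image ids, which is one way to read the model-agnostic and modular design mentioned above.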
We conducted experiments on 13 benchmark datasets from the Biomedical, Real, and Sketch domains. metaCBIR was able to select the best CBIR sets across domains quickly (usually, fewer than 25 queries need to receive human feedback).