“The research project focuses on personalised, multimodal and interactive information access. Current developments in fields such as natural language processing, computer vision, information retrieval, and mobile communications offer a rich toolbox for multimodal querying of information. In our case, a query is formed by jointly processing language utterances and visual input, e.g., taking a picture of an object of interest and asking questions in natural language about the object and its attributes. The focus is on building suitable multimodal query representations that can be used to search a document collection.
The goal of the PhD is to study and design models that build multimodal representations which effectively capture the content of the visual input and the accompanying natural language query, relying on current advanced machine learning techniques for representation learning.”
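To make the project description concrete, the sketch below illustrates one possible shape of such a multimodal retrieval pipeline: an image and a natural language question are each mapped to a vector, fused into a single query representation, and matched against a document index by cosine similarity. This is purely an illustrative toy, not the project's method; the encoders here are stand-in random projections (a real system would use learned networks, e.g. CLIP-style encoders), and all names (`text_embed`, `image_embed`, `fuse`, `search`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64

# Hypothetical text encoder: sums fixed random token vectors.
# A real system would use a learned language encoder instead.
VOCAB = {}
def text_embed(text: str) -> np.ndarray:
    vec = np.zeros(DIM)
    for tok in text.lower().split():
        if tok not in VOCAB:
            VOCAB[tok] = rng.normal(size=DIM)
        vec += VOCAB[tok]
    n = np.linalg.norm(vec)
    return vec / n if n else vec

# Hypothetical image encoder: deterministic pseudo-features
# derived from the image identifier, standing in for a vision net.
def image_embed(image_id: str) -> np.ndarray:
    local = np.random.default_rng(sum(ord(c) for c in image_id))
    v = local.normal(size=DIM)
    return v / np.linalg.norm(v)

# Simple late fusion: weighted sum of the two modality embeddings.
def fuse(img_vec: np.ndarray, txt_vec: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    v = alpha * img_vec + (1 - alpha) * txt_vec
    return v / np.linalg.norm(v)

# Toy document collection, indexed by text embedding only.
docs = {
    "d1": "red sports car with leather seats",
    "d2": "mountain bike frame carbon",
    "d3": "antique wooden chair restoration",
}
index = {k: text_embed(v) for k, v in docs.items()}

# Rank documents by cosine similarity to the fused query vector.
def search(image_id: str, question: str, k: int = 2, alpha: float = 0.5):
    q = fuse(image_embed(image_id), text_embed(question), alpha)
    scores = {d: float(q @ v) for d, v in index.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

The PhD topic concerns precisely the part this sketch trivialises: learning the encoders and the fusion so that the joint representation captures what the picture and the question are jointly about, rather than mixing fixed per-modality vectors.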
The ideal candidate has completed, or is about to complete, a master's degree in computer science or a similar discipline, and is acquainted with information retrieval, multimedia processing and statistical machine learning. The candidate has a strong interest in natural language processing, computer vision, multimodal representation learning and the integration of multimodal representations into retrieval models. Outstanding results in prior studies are required.
The candidate is fluent in spoken and written English.