As part of the ERA-NET Atlantis project, LATTICE offers a one-year post-doctoral position aimed at developing original experiments on grounded language learning.
A major challenge for natural language processing is connecting language with the outside world. What children do naturally and intuitively remains a challenge for artificial systems. For example, autonomous agents must be able to learn how to refer to objects, and even to execute specific tasks such as taking or moving objects in a specific environment (Yu and Siskind, 2013), but this is still difficult to do in practice. Experiments are carried out in simulated artificial worlds of reduced complexity, and it is not always clear how these experiments can scale up, or what the agents have to know beforehand (initial, hand-crafted knowledge) to be able to learn something from or about the environment. This domain has recently given rise to interesting experiments, some of them reproducing to some extent phenomena observed in infant language learning (Hill et al., 2017).
The post-doctoral candidate is expected to develop original experiments in this direction. We are especially interested in experiments mimicking the acquisition of an initial vocabulary (vocabulary growth and lexical burst) as well as simple constructions (combinations of two or more words). Applicants are free to suggest the kinds of experiments and environments they would like to use or develop, but it is advisable to use existing open-source datasets as far as possible, so as to allow comparison with other work. Please note that the goal is to learn from an artificial setting, not to learn directly from child language data.
– H. Yu and J.M. Siskind, 2013. Grounded Language Learning from Video Described with Sentences, ACL, pp. 56–63.
– F. Hill, K.M. Hermann, P. Blunsom & S. Clark, 2017. Understanding grounded language learning agents. https://arxiv.org/pdf/1710.09867.pdf
Strong programming experience
A good knowledge of machine learning and natural language processing
A PhD on a related topic would, of course, be a strong plus