The Barcelona Supercomputing Center – Centro Nacional de Supercomputación (BSC-CNS) is the leading supercomputing center in Spain. It houses MareNostrum, one of the most powerful supercomputers in Europe, and is a hosting member of the PRACE European distributed supercomputing infrastructure. The mission of BSC is to research, develop and manage information technologies in order to facilitate scientific progress. BSC combines HPC service provision and R&D into both computer and computational science (life, earth and engineering sciences) under one roof, and currently has over 650 staff from 49 countries.
Context And Mission
The Text Mining Unit of the BSC-CNS (TeMU-BSC) is funded through the Plan de Impulso de las Tecnologías del Lenguaje de la Agenda Digital, by the Secretary of State of Telecommunications and the Information. It is the first publicly funded text mining unit in Spain and aims to promote the development of natural language processing and machine translation resources for Spanish and other co-official languages in the area of biomedicine.
We are looking for candidates with a background in computational linguistics who will participate in the creation of linguistic resources for the Catalan language and in related activities within the Text Mining Unit.
Clean, preprocess and prepare data.
Build language corpora usable by the Unit’s tools, specifically neural architectures.
Review and correct automatic NLP analyses, such as dependency parsing and POS annotation.
Automatically annotate data using state-of-the-art language processing tools.
Manage corpora and language data according to the requirements specified in the Unit’s data management plan.
Degree in Applied Linguistics, Computer Science or related disciplines with a very strong linguistic background.
Essential Knowledge and Professional Experience
Experience/knowledge in the NLP and/or MT fields.
Native speaker of Catalan.
Experience/knowledge in corpus annotation and generation of linguistic resources.
Understanding of data administration and management functions (transfer, storage, analysis, distribution, exploration, etc.).
Experience in working with large datasets and distributed file systems.
Ability to work independently and in a team to complete tasks on schedule.
Ability to work under set deadlines.