Overview

WIPO (World Intellectual Property Organization) is looking for a research and development fellow to work on improving our WIPO Translate tool, with a focus on Neural Machine Translation (NMT).

We are looking for a skilled computer scientist to assist our work in data-driven machine translation at the World Intellectual Property Organization in Geneva (Switzerland). The goal is to improve the quality of machine translation produced by our software “WIPO Translate”, which is built on open source tools: previously the phrase-based statistical machine translation toolkit “Moses”, and more recently the NMT tools Marian/Amunmt/Nematus.

Neural networks being an emerging field, we are looking for a highly motivated person and welcome various profiles (from a very good post-master's student up to a professor who wants to take a sabbatical year). We focus on applied research: bringing the latest research developments into production.
Project:
The WIPO Global Databases Division works on the automatic translation of intellectual property documents, mainly patents (see http://patentscope.wipo.int/translate) but also other types of documents (WIPO Translate has been used in other international organizations). WIPO is especially looking for candidates who could help maintain WIPO’s “advantage” in machine translation.

NMT:
– Knowledge of neural network technologies (deep learning) to experiment with the next generation of MT,
– Ability to experiment with various techniques for setting the best parameters for specific language pairs and specific domains, for combining various input source corpora (mixing a small in-domain corpus with a larger out-of-domain corpus), for mixing multiple languages in a single NMT model, etc.
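
As an illustration of the corpus-mixing task mentioned above, here is a minimal sketch of one common approach: oversampling the small in-domain corpus before merging it with the larger out-of-domain corpus. The function name `mix_corpora` and the parameter `in_domain_share` are hypothetical and chosen for this example only; this is not a description of how WIPO Translate builds its training data.

```python
import random


def mix_corpora(in_domain, out_of_domain, in_domain_share=0.5, seed=0):
    """Oversample the in-domain sentence pairs so that they make up roughly
    `in_domain_share` of the mixed corpus (illustrative assumption only)."""
    if not in_domain or not 0 < in_domain_share < 1:
        raise ValueError("need a non-empty in-domain corpus and a share in (0, 1)")
    # Number of in-domain pairs needed to reach the requested share.
    target = round(len(out_of_domain) * in_domain_share / (1 - in_domain_share))
    # Never drop in-domain data: keep at least one full copy.
    repeats, remainder = divmod(max(target, len(in_domain)), len(in_domain))
    rng = random.Random(seed)
    sampled = list(in_domain) * repeats + rng.sample(list(in_domain), remainder)
    mixed = sampled + list(out_of_domain)
    rng.shuffle(mixed)
    return mixed


# Usage: two in-domain pairs are repeated until they balance eight generic pairs.
patent_pairs = [("ein Verfahren", "a method"), ("eine Vorrichtung", "a device")]
generic_pairs = [(f"src {i}", f"tgt {i}") for i in range(8)]
training_set = mix_corpora(patent_pairs, generic_pairs, in_domain_share=0.5)
```

The right share is an empirical question per language pair and domain, which is exactly the kind of experiment this position involves.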

MT integration / Natural language processing:
– Work on specific tools allowing a better integration of NMT in the user environment (batch translation of texts, documents and HTML pages),
– Design graphical user interfaces (good knowledge of Java, jQuery and/or Web/JSF 2.0 required) to improve access to the output of machine translation,
– Pre- and post-process texts in different languages to improve translation quality (especially for Chinese, Korean, Japanese and German), e.g. replacing named entities by placeholders, normalizing parallel texts, etc.,
– Develop methods to collect clean parallel patent sentences (e.g.: filter/clean titles and abstracts, align full texts using patent priority data, etc.),
– Work on automatic post-editing tasks: learn recurrent errors from human post-edits (or from user feedback) to correct MT output,
– Define a workflow for updating NMT models using newly published documents (e.g. incremental training, explore online learning algorithms…)
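
To give a flavour of the pre- and post-processing work listed above, here is a minimal sketch of placeholder substitution. It masks numbers and publication-number-like strings (a simple stand-in for named entities) before translation and restores them afterwards. The regular expression, the `__NUMn__` placeholder format and the function names are assumptions made for this example, not part of WIPO Translate.

```python
import re

# Hypothetical pattern: publication numbers such as "WO2017/123456",
# or plain integers and decimals such as "12" and "3.5".
TOKEN_RE = re.compile(r"\b[A-Z]{2}\d{4}/\d{6}\b|\b\d+(?:\.\d+)?\b")


def mask(text):
    """Replace each match with an indexed placeholder.

    Returns the masked text and a placeholder -> original string mapping."""
    mapping = {}

    def replace(match):
        key = f"__NUM{len(mapping)}__"
        mapping[key] = match.group(0)
        return key

    return TOKEN_RE.sub(replace, text), mapping


def unmask(text, mapping):
    """Put the original strings back in place of their placeholders."""
    for key, value in mapping.items():
        text = text.replace(key, value)
    return text


source = "The device 12 of WO2017/123456 weighs 3.5 kg."
masked, mapping = mask(source)
# `masked` would be sent to the MT engine; here we simply restore it.
restored = unmask(masked, mapping)
```

In practice the MT system must be trained or constrained to copy placeholders unchanged, and a real pipeline would need language-specific handling, for example for Chinese, Japanese and Korean text where word boundaries are not marked by spaces.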

Our context:
The World Intellectual Property Organization (WIPO, see http://www.wipo.int/) is a specialized agency of the United Nations. It is dedicated to developing a balanced and accessible international Intellectual Property (IP) system. As part of its mandate, WIPO translates patent applications and disseminates information about published patent applications using the PATENTSCOPE search engine: https://patentscope.wipo.int/. To make this information available worldwide, WIPO is looking for techniques to help translate patents into various languages. Our “WIPO Translate” tool is publicly available at: https://patentscope.wipo.int/translate.

In WIPO Translate, we give preference to machine learning approaches; we try to create translation models that learn from the data. The final goal is always to provide the best MT in production in terms of quality (quality should be competitive with commercial tools), efficiency (we target quick translations, less than 2 seconds for one sentence) and scalability (we train our translation models on world-class data). Various related publications are listed at the end of this document: http://patentscope.wipo.int/translate/wtapta-user-manual-en.pdf

Company:

WIPO

Qualifications:

Required skills:
– The candidate must have a strong background in computer science (minimum Master's level), ideally with a specialization in machine learning techniques, and preferably be familiar with computational linguistics.
– Strong knowledge of a programming language (Java and/or Python) is required.
– Basic knowledge of Unix is required.

The following skills would be a plus:
– Statistics: automatic document classification approaches (SVM, k-NN, EM, ANNs, Naive Bayes…)
– Web technologies: Java, Tomcat, jQuery, JSF, AngularJS…
– Databases: NoSQL techniques, MySQL, Oracle…
– Search engines: Lucene / Solr / Elasticsearch
– Scripting languages: Perl, Python, bash
– Unix: Ubuntu, Red Hat, configuring Unix remote servers (using command line mode)

Language requirements:

– Applicants should have excellent written and spoken English. A working knowledge of other official languages of WIPO (German, Spanish, French, Portuguese, Russian, Arabic, Chinese, Japanese or Korean) would be an advantage.

Educational level:

Master's degree

 


