Are you a Data Scientist who is keen on using Natural Language Processing (NLP) in products that support thousands of researchers? Are you familiar with large-scale data processing frameworks like Apache Spark, and are you eager to work with large, high-quality textual datasets? We have the right opportunity for you!


Research Products is looking for a Data Scientist to work collaboratively on text processing and classification capabilities. You will have the opportunity to support bibliometric-oriented projects (analytics and classification) and to work on existing and new products throughout Elsevier, building NLP-driven features for customer-facing products such as Pure (indexing and classification), SciVal (discovery), and Funding (classification).


You will be working in the NLP team in Research Products, which is responsible for text processing functionalities such as concept mining (indexing) and text classification for scientific article text. Many of these functionalities are delivered through the Elsevier Fingerprint Engine (FPE), a text mining toolkit with capabilities such as tagging, classification, discovery and reporting. The new position involves developing NLP-based functionality, supporting other teams and products that rely on the text analysis capabilities, and carrying out data science activities in which the FPE, combined with the vast Elsevier data assets, plays an integral role. These activities include the improvement and development of semantic-based matching algorithms and text classifiers.

The data scientist is expected to have the methodological skills and experience to employ various NLP techniques and to organize and wrangle large datasets from different sources for the purpose of information extraction, classification, matching/ranking, and linking. The data scientist is a member of the NLP team and works in cooperation with other teams in Research Products, such as the SciVal Content, Funding, Data, and Analytics teams.




What you should bring

MSc degree in Computer Science, Data Science, NLP or Computational Linguistics
2+ years of industry experience, involving NLP-focused (software) development
A proactive team player who can tackle most tasks independently
Solid data science skills, demonstrated by experience and delivered projects
Experience with common NLP techniques and tooling, and machine learning
Experience with handling large textual datasets and big data platforms such as Apache Spark
Familiarity with relational databases (SQL)
Demonstrable experience in programming, ideally with Java, Scala, or C#/C++
Familiarity with .NET and/or ASP.NET MVC would be welcome
Experience working in a software development team

About Elsevier

A leading provider of science and health information, Elsevier partners with experts around the globe to develop world-class content, delivering it in ways that fuel discovery, drive innovation and improve health care. Our global community comprises over 7,000 journal editors, 70,000 editorial board members, 300,000 reviewers and 600,000 authors. They are scientists and clinicians, authors and editors, professors and students, information professionals and decision makers.

We are a global company headquartered in Amsterdam, employing more than 7,000 people in 24 countries. Elsevier's roots are in journal and book publishing, where we have fostered the peer-review process for more than 130 years. Today we are driving innovation by delivering authoritative content with cutting-edge technology, allowing our customers to find the answers they need quickly.