Overview

Our customer is working on an advanced knowledge management system designed specifically for organizations with large and diverse knowledge bases spread across multiple sources of information. The system can handle structured, unstructured, and semi-structured data in different forms (text, graphics, video, and voice). This approach raises specific challenges, one of which is text segmentation: the system must be able to deal with many kinds of textual data, such as regulatory, scientific, and technical documentation.
As a researcher, you will collaborate with the development team, composed of backend and frontend developers and UI/UX experts, and take charge of the text segmentation domain.

YOUR ROLE
Working directly with lead developers and key stakeholders of the company, you will be in charge of the text segmentation part of the LLM. This means you will:

Collaborate with stakeholders (co-founders, software developers, power users) to understand the specific needs for text segmentation.
Choose appropriate algorithms for the different purposes of text segmentation (readability, data analysis, or feeding into further NLP processes). This might involve rule-based methods, machine learning models, or a combination of both.
Develop or adapt existing algorithms for the specific nature of the text being processed (e.g., technical documents, legal documents, etc.).

Company:

Science me up

Qualifications:

You hold a PhD in Computer Science, Mathematics, or a related field, with thesis or post-doc experience and the skills below:
A strong foundation in machine learning, natural language processing (NLP) and text analytics is essential.
Experience using and fine-tuning large language models (LLMs).
Proficiency in Python and data science/machine learning frameworks (such as PyTorch/TensorFlow, Transformers) is crucial.
Hands-on experience with NLP techniques and text analytics, including text normalization, tokenization, and embedding generation.
In-depth knowledge of data mining techniques, especially those relevant to pattern recognition, anomaly detection, and clustering in large datasets.

Educational level:

Master's Degree

Level of experience (years):

Mid Career (2+ years of experience)
