Lakatu is a technology start-up that develops augmented-intelligence SaaS for the communication and policy sectors. We use machine learning algorithms to combine data from various sources and cut through the noise of media coverage, policy publications, and research papers. With the help of this technology, we enable clients to identify relevant discussions and opinion leaders, track specific legislation, and assess the potential impact on industries and businesses.

We are currently building our Data Science team, which will be responsible for developing and maintaining our data ingestion and processing pipelines to ensure the flow of high-quality content into our product. Your main responsibility in this position will be to build scalable solutions that crawl, ingest, process, and enrich our content. Your team will work alongside our Engineering team to build our SaaS product.

You will join a small, friendly, yet ambitious team of software engineers, data scientists, and communications and public-policy experts who want to use technology to push the current and future boundaries of the media and policy sectors. We work in a fast-paced environment that requires a positive and solution-oriented attitude. We enjoy the nerdiness and geekiness of our practice but are also obsessed with adding value to our customers’ work.

#1 Essential duties and responsibilities

Working in the areas of Natural Language Processing (text mining, information extraction, information retrieval/search) and Machine Learning in media (print, digital), social, and policy data

Identifying and implementing the tools and algorithms appropriate for Natural Language Processing assignments to enhance the Natural Language Processing system currently in place

Conceptually designing, modeling, and implementing data-driven services for information retrieval and extraction, data enrichment, and linking of data

Carrying out evaluation experiments and training the developed models




#2 Essential requirements (all)

Advanced understanding of NLP techniques and practical experience with at least one of the following: information extraction, information retrieval, topic modeling, text summarization

Strong experience with Python, Machine Learning (PyTorch, TensorFlow, Keras, scikit-learn), and NLP (spaCy, Hugging Face, NLTK, gensim, etc.) frameworks/libraries

Master’s degree in a relevant field or demonstrable equivalent experience

#3 Nice to have (one or more)

Experience with search engine software libraries (Lucene, Solr, Elasticsearch, Jina AI, Haystack, etc.)

Experience with language generation and/or knowledge graphs in NLP applications

Basic understanding of data structures, algorithms, and their complexities

Publications in respected NLP/ML venues

Ability to code and design software architectures

Experience with developing web services (Tornado, Flask, FastAPI, etc.)

Experience with Git

Language requirements:

Strong written and verbal skills in English

Educational level:

Master’s degree
