At Dialpad, we’re a team of doers: a team that thinks outside the box and, when that doesn’t work, reinvents it. We don’t settle for the status quo, and neither do the things we build. Led by the same minds behind Google Voice, we build products that get businesses talking, whether it’s across the hall, street, or country.
With $70 million in funding from Google Ventures, Andreessen Horowitz, and other top VCs, along with engineers from companies like Microsoft and Google, every member of our team plays an essential role in creating a voice product that doesn’t just combine design and mobility but works with you wherever productivity may strike.
Design and own strategies and pipelines for acquiring high-quality training data. Optimize the quality, latency, and cost of data acquired through crowdsourced labelling or internal labellers.
Manage large quantities of text and audio data. Typical tasks include extracting samples from databases, writing scripts to trim and clean data, and making datasets available on cloud services.
Develop standards for text data. Typical tasks include creating processes to infer pronunciations for words, ensuring that spellings and capitalizations are consistent across data, and standardizing incoming data from human transcribers.
Manage human labellers. Typical tasks include writing instructions for labellers, directing data to the interface that labellers will use, and creating tests to ensure quality.
Interact with world-class speech recognition and NLP specialists to help them meet their models’ needs for labelled data.
Master’s or Ph.D. degree in a technical or linguistic field required
5+ years’ experience in data management
5+ years’ experience in text processing
5+ years’ experience using labelled data, for example in a machine learning context
3+ years’ experience labelling data using crowdsourcing
Excellent attention to detail
Creative, resourceful problem solver
Excellent data management skills with various platforms and languages
Comfortable using Python for data cleaning and management
Shell scripting skills
Proven ability to handle big data
Fluency in English and excellent understanding of the English language from a phonetic, grammatical, and linguistic perspective
Some experience with machine learning
Bonus: Multiple spoken languages (particularly Spanish and Japanese)
Bonus: Advanced programming skills in other programming languages
Bonus: Data presentation and analysis skills
Level of experience (years):
Senior (5+ years of experience)
How to apply:
Please mention NLP People as a source when applying