Overview

Machine Learning Evaluation Specialist (Remote)

Important for US applicants: This is a 1099 independent contractor role and is not compatible with F-1 OPT, STEM OPT, or other visa statuses that require W-2 employment, guaranteed hours, or employer sponsorship. We are unable to provide offer letters or employment verification for this role.

Help design the hardest ML problems state-of-the-art AI hasn’t solved yet.

We’re hiring domain experts to build evaluation tasks that challenge the frontier of AI. This is not an ML engineering role – it’s a research role. You’ll use deep expertise in your field to create problems that general ML knowledge can’t touch.

What you’ll do

Propose and frame original, research-grade ML problems rooted in your domain

Design evaluation tasks that require specialized knowledge well beyond standard pipelines

Assess AI-generated solutions for correctness, creativity, and methodological rigor – and explain exactly where and why they fall short

Document problem difficulty, required domain knowledge, and expected failure modes

What you need

Graduate-level expertise (MS or PhD preferred) in a scientific or technical domain that intersects with ML

Strong working knowledge of ML methods – model selection, feature engineering, evaluation metrics

Deep familiarity with active research problems in your field – you know where general ML knowledge runs out

Excellent written communication – you can articulate complex problems clearly and precisely. This cannot be overstated.

Self-motivated and comfortable working independently on intellectually demanding tasks

What you don’t need

No prior AI training or RLHF experience required

No software engineering background needed – domain expertise and research instincts are what matter

Domains we’re especially looking for

Computational Biology / Bioinformatics

Genomics / Molecular Biology

Physics / Astrophysics / Signal Processing

Climate / Environmental Modeling

Healthcare / Medical Imaging

Neuroscience / Brain-Computer Interfaces

Materials Science / Chemistry

Finance / Quantitative Modeling

Robotics / Control Systems / Reinforcement Learning

Advanced NLP (specialized domains)

Mathematics / Statistics (applied)

Logistics

Fully remote – work from anywhere

$200-$400/hr depending on domain and seniority

10-40 hrs/week, hourly contract

Assessment required – paid if approved

Independent contractor (1099) – not compatible with F-1 OPT, STEM OPT, or visa statuses requiring W-2 employment or employer sponsorship

This is a project-based freelance opportunity with no guaranteed hours. We recommend keeping other work options open while you wait for a project assignment.

Company:

G2i Inc.

Qualifications:

Level of experience (years):

Senior (5+ years of experience)
