Magi AI Content Quality Evaluator – En-US (Freelance, Temporary)

Posted 2025-08-15
Remote, USA; Full Time; Immediate Start
Job Description:

We are hiring 20 Freelance Evaluators to support a high-priority Magi AI Evaluation Pilot Project for Uber’s AI team. This role involves reviewing and rating AI-generated responses for quality, clarity, accuracy, and factual correctness across varying levels of complexity.

All candidates must meet Expertise Level 3 (L3) requirements—the highest evaluator standard—to qualify for this project. Your evaluations will directly shape how AI systems learn, improve, and interact with users.

What is Expertise Level 3 (L3)?

This role requires Expertise Level 3, meaning the evaluator must have:

Advanced analytical and research skills

The ability to handle highly complex content, such as technical or domain-specific materials (e.g., science, legal, medical, or data-heavy text)

Experience evaluating multi-modal data (charts, PDFs, screenshots, etc.)

Capability to conduct line-by-line fact-checking and justify responses with critical reasoning

Comfort working with structured rubrics and independently verifying factual accuracy

Compensation:

Pay Rate: $25 – $31 per hour (USD)

The assessment is paid if the evaluator passes and completes at least 4 hours of work on the project

Key Responsibilities:

Evaluate AI-generated responses across varying task types, using structured guidelines

Identify tone, style, factual, and product-specific issues in output

Perform detailed comparisons, fact-checks, and accuracy reviews

Submit ratings using tools such as dropdowns, screencasts, and feedback forms

Meet tight deadlines (all assigned work must be completed within 4 business days)

Task Complexity & Time Commitment:

Tasks will range from simple to complex, but all evaluators must qualify at the L3 expertise level

Average Handling Time: 75–145 minutes per task

Minimum expectation: 3+ hours per task cycle, with the option to take on more work

Work is asynchronous, though task assignment may align with the IST time zone

Requirements:

Professional and expert-level English (US) speaker (can reside in or outside of the U.S.)

5+ years of experience in linguistics, research, writing, content evaluation, or technical review

Master’s degree or PhD strongly preferred

High attention to detail, accuracy, and critical thinking

Secure internet connection and workspace

Must complete and pass a qualifying assessment at the Expertise Level 3 standard

Assessment Details:

Approx. 30 minutes to complete

Paid if the evaluator passes and completes at least 4 hours of project work

Project Details:

Project Name: Magi AI Evaluation (Uber Special Project)

Work Type: Freelance, Temporary

Start Date: Within 48 hours of onboarding completion (target: April 28, 2025)

Schedule: Flexible, asynchronous

Location: Global (must be a native En-US speaker)

Duration: Initial pilot cycle, with possible future work based on performance and need

Job Types: Part-time, Temporary

Pay: $25.00 - $31.00 per hour