Machine Learning Engineers: Get Paid $80-$220/hr to Review AI Code

Open application · 71 spots this round

$80-$220/hr machine learning engineering work, on your schedule

Review AI training code and pipelines for the label leak, the eval that overstates results, and the serving path that silently degrades. The work draws on the judgment that comes from running real models in production. Paid hourly, remote, a few hours a week.

Fully remote · Your schedule · Weekly pay

Apply now

Apply once, get matched on a rolling basis. No prior AI experience needed.

Trusted by top research companies

Rated by experts worldwide

4.4 / 5

Based on 18 verified reviews

4.9 / 5

Based on 36 verified reviews

Hi, we're Zac and Jack, the founders of Terac. We want to talk to you directly, because you are the most important part of what we're building.

Terac is a community of experts. People who have spent years getting good at something specific and hard. The world is about to need more of you, not less. As AI takes on more of the world's work, the bottleneck shifts to the people who actually know what they're talking about.

Expert labor is the rarest resource in the world right now, and it is shockingly hard to find. The companies that need an ML engineer's eye on a training pipeline that leaks labels spend weeks chasing people, paying placement fees, and settling for whoever is available. Meanwhile thousands of qualified people are sitting with knowledge that no one ever asks for.

That gap is what we're here to close. Every project that lands on Terac is routed to the people who actually know the answer, on their schedule, paid fairly, and only when the work is verified. No middleman taking a cut of your time. No vague gigs. No chasing checks.

We care about every single person in this community. If you join Terac, you're not a row in a database to us. We read the feedback. We answer the emails. We will fight for you when a customer is being unreasonable, and we will be honest with you when something on our side is broken. The quality of this panel is our entire company, and we owe you a serious bar.

If you've made it this far, here is what we're asking: claim your profile. Put your expertise on the record. Let the world's most ambitious teams come find you for the work only you can do.

Zac & Jack

Founders

ML Engineering questions

Still curious? Write to us at support@terac.com.

Niche depth is often the most valuable. Labs already have plenty of generalist feedback but real gaps in post-training alignment, efficient inference, and novel architectures, so your depth is an asset. You are matched to tasks in your sub-domain, not a generic queue.

Tasks typically involve reviewing PyTorch or JAX training scripts, evaluating loss function correctness, critiquing explanations of gradient flow or backprop, and judging whether a model card accurately describes training data and eval methodology. Some projects ask you to write worked examples, like debugging a vanishing-gradient issue or designing an eval suite for a RAG system.

Industry certs like the Google Professional ML Engineer and AWS ML Specialty are accepted as supporting evidence, weighted alongside your work history and any published research or open-source work. A PhD is not required, and engineers with strong industry credentials and production experience are routinely onboarded.

Some tasks do involve sensitive topics: fairness audits, toxicity classifier outputs, and whether a model correctly refuses harmful requests. You will not be asked to generate harmful content. If a task conflicts with your ethics or employer policy, decline it without penalty and get routed elsewhere.

MLOps and inference-infrastructure depth is genuinely in demand: reviewing latency benchmarks, checking ONNX or TensorRT export workflows, and assessing serving advice on Triton or vLLM. No research background needed, and production engineers often catch what research-oriented reviewers miss.

Why your expertise matters

A model writes a training script with a misapplied loss function, a data leak in the feature pipeline, or a confident but wrong claim about calibration. A general reviewer accepts it; a working ML engineer catches it. Knowing when an architecture fits a given scale, latency, and data constraint is exactly what training corpora lack. Your judgment fills that gap.

How pay works

The top rate of $220/hr goes to engineers with depth in areas that are thin in training data: production MLOps, RLHF, on-device optimization. Work is remote, billed by the verified hour, and paid once Terac confirms the deliverable meets scope, so there are no invoicing delays tied to client approval.

What the work looks like

A sample of the machine learning engineering work you would pick up. Every project is scoped, remote, and paid on verified completion.

Review a model's PyTorch training script and flag gradient accumulation logic that goes silently wrong at large batch sizes.
Evaluate a machine-drafted model card and correct errors about dataset composition, eval splits, and metric definitions.
Write a worked example of detecting and fixing target leakage in a tabular pipeline, with the pandas and scikit-learn patterns an experienced engineer uses.
Assess an AI explanation of distributed training and flag where ZeRO optimizer stages get conflated with gradient checkpointing.
Build a reasoning trace for diagnosing a training instability, covering loss curves, gradient norms, learning rate schedules, and the rationale for each decision.
Judge whether a model's answer on fine-tuning versus RAG accounts for latency, data freshness, and the deployment environment.
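
To give a flavor of the gradient accumulation task in the list above, here is a minimal sketch of one common way accumulation goes wrong: summing per-micro-batch mean gradients without dividing by the number of accumulation steps. It uses plain Python and a toy one-parameter model rather than PyTorch. The function names (`mse_grad`, `accumulated_grad`) and the data are hypothetical, chosen only for illustration, and a real review task may involve a different failure mode.

```python
def mse_grad(w, xs, ys):
    """Gradient of mean squared error w.r.t. w for the model y_hat = w * x."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, micro_batch_size, scale_by_steps):
    """Accumulate gradients over equal-sized micro-batches.

    scale_by_steps=False reproduces the bug: each micro-batch contributes
    its full mean gradient, so the total grows with the number of steps.
    scale_by_steps=True divides each contribution by the step count.
    """
    steps = len(xs) // micro_batch_size
    total = 0.0
    for i in range(steps):
        lo = i * micro_batch_size
        hi = lo + micro_batch_size
        g = mse_grad(w, xs[lo:hi], ys[lo:hi])
        total += g / steps if scale_by_steps else g
    return total


# Toy data (assumed for illustration): the true relationship is y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 0.5

full = mse_grad(w, xs, ys)                       # reference: one full batch
buggy = accumulated_grad(w, xs, ys, 2, False)    # unscaled accumulation
fixed = accumulated_grad(w, xs, ys, 2, True)     # scaled accumulation

print(f"full-batch: {full}, unscaled: {buggy}, scaled: {fixed}")
```

With two accumulation steps, the unscaled version is exactly twice the full-batch gradient, so the effective learning rate doubles without any error being raised. The scaled version matches the full-batch gradient here only because the micro-batches are equal in size; uneven micro-batches would need weighting by sample count.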