Staff AI Engineer

We're building an AI-native platform focused on helping professionals make complex, high-stakes decisions with greater clarity and confidence.

This is not an AI "feature." AI is the product.

As a Staff AI Engineer, you will serve as a technical leader responsible for designing, building, and hardening the core intelligence systems behind the platform: systems that directly support real-world decision-making in environments where accuracy and trust are critical.

This is a hands-on role for someone who wants to operate at the edge of what's reliable in applied AI and push those boundaries into production. You will own systems end-to-end: from architecture and modeling decisions through deployment, evaluation, and iteration. You'll also help define technical standards and influence how AI systems are built across the organization.

What You'll Work On

Architecting LLM-powered, agentic systems for research, analysis, and decision support
Designing hybrid reasoning pipelines that combine language models with retrieval systems, structured data, deterministic logic, and external tools
Building robust RAG pipelines over unstructured, noisy, and proprietary datasets
Developing evaluation frameworks to measure reasoning quality, faithfulness, latency, and cost
Implementing observability, debugging, and failure handling for multi-step AI workflows
Translating ambiguous user needs into reliable, production-grade intelligent behavior in collaboration with product and design
Raising the bar for AI engineering practices through technical leadership and mentorship

Example Problem
Design and build an AI system capable of synthesizing diverse data sources (documents, structured datasets, and external signals) into actionable, well-supported insights, while explicitly surfacing uncertainty and tradeoffs.

Why This Is Challenging

Product complexity: The goal is to deliver a system users rely on daily, not a demo or internal prototype
High-stakes environment: Outputs must be accurate, explainable, and calibrated; "mostly correct" is insufficient
Data ambiguity: Inputs are often incomplete, inconsistent, or contradictory, with no single source of truth
Reasoning over generation: The focus is on systems that evaluate, compare, and justify, not just generate fluent responses
Agent reliability: Multi-step, tool-using workflows must behave consistently in production environments
Evaluation is evolving: You will help define how to measure quality when traditional ML metrics fall short
Trust as a requirement: Explainability, traceability, and failure handling are core system properties, not afterthoughts

What We're Looking For

6+ years of software engineering experience with significant hands-on work in applied AI/ML systems
Strong foundation in Python and backend system design
Experience working with LLMs, including areas like prompting, fine-tuning, RAG, agentic workflows, or evaluation tooling
Track record of owning ambiguous, high-impact systems from concept through production
Ability to make thoughtful architectural tradeoffs in real-world environments
Systems-level thinking combined with a bias toward shipping high-quality implementations
Strong product intuition and a sense of responsibility for end-user outcomes

Bonus Experience

Background in data-intensive products or regulated environments
Exposure to domains where correctness, traceability, and trust are critical

Oscar Associates Limited (US) is acting as an Employment Agency in relation to this vacancy.

Apply today.
