Staff AI Engineer

San Francisco, California

US$250,000 - US$300,000 per year

Full time

Ref: 00567_1774369684

We're building an AI-native platform focused on helping professionals make complex, high-stakes decisions with greater clarity and confidence.

This is not an AI "feature." AI is the product.

As a Staff AI Engineer, you will serve as a technical leader responsible for designing, building, and hardening the core intelligence systems behind the platform: systems that directly support real-world decision-making in environments where accuracy and trust are critical.

This is a hands-on role for someone who wants to operate at the edge of what's reliable in applied AI and push those boundaries into production. You will own systems end-to-end: from architecture and modeling decisions through deployment, evaluation, and iteration. You'll also help define technical standards and influence how AI systems are built across the organization.


What You'll Work On

  • Architecting LLM-powered, agentic systems for research, analysis, and decision support
  • Designing hybrid reasoning pipelines that combine language models with retrieval systems, structured data, deterministic logic, and external tools
  • Building robust RAG pipelines over unstructured, noisy, and proprietary datasets
  • Developing evaluation frameworks to measure reasoning quality, faithfulness, latency, and cost
  • Implementing observability, debugging, and failure handling for multi-step AI workflows
  • Translating ambiguous user needs into reliable, production-grade intelligent behavior in collaboration with product and design
  • Raising the bar for AI engineering practices through technical leadership and mentorship

Example Problem
Design and build an AI system capable of synthesizing diverse data sources (documents, structured datasets, and external signals) into actionable, well-supported insights, while explicitly surfacing uncertainty and tradeoffs.


Why This Is Challenging

  • Product complexity: The goal is to deliver a system users rely on daily, not a demo or internal prototype
  • High-stakes environment: Outputs must be accurate, explainable, and calibrated; "mostly correct" is insufficient
  • Data ambiguity: Inputs are often incomplete, inconsistent, or contradictory, with no single source of truth
  • Reasoning over generation: The focus is on systems that evaluate, compare, and justify, not just generate fluent responses
  • Agent reliability: Multi-step, tool-using workflows must behave consistently in production environments
  • Evaluation is evolving: You will help define how to measure quality when traditional ML metrics fall short
  • Trust as a requirement: Explainability, traceability, and failure handling are core system properties, not afterthoughts

What We're Looking For

  • 6+ years of software engineering experience with significant hands-on work in applied AI/ML systems
  • Strong foundation in Python and backend system design
  • Experience working with LLMs, including areas like prompting, fine-tuning, RAG, agentic workflows, or evaluation tooling
  • Track record of owning ambiguous, high-impact systems from concept through production
  • Ability to make thoughtful architectural tradeoffs in real-world environments
  • Systems-level thinking combined with a bias toward shipping high-quality implementations
  • Strong product intuition and a sense of responsibility for end-user outcomes

Bonus Experience

  • Background in data-intensive products or regulated environments
  • Exposure to domains where correctness, traceability, and trust are critical

Oscar Associates Limited (US) is acting as an Employment Agency in relation to this vacancy.

Apply today.
