About the Company
A fast-growing, venture-backed startup is building a next-generation AI compute platform focused on decentralized, high-performance infrastructure. The company is rethinking how organizations access and scale compute by integrating global data centers into a unified, serverless platform.
Their mission is to democratize access to AI compute and provide an end-to-end lifecycle solution, from raw data to deployed models, through a combination of platform infrastructure and forward-deployed engineering.
With a global footprint and early traction, the team is tackling challenges across multi-cloud orchestration, GPU scheduling, and enterprise-grade infrastructure, with a strong focus on security and compliance.
The Role
This is a high-impact infrastructure role focused on designing and scaling the distributed systems that power large AI/ML workloads.
You'll work across:
- Core platform architecture
- Multi-cloud compute orchestration
- Managed services development
- Customer-facing deployments
This role requires a strong mix of systems engineering and product thinking, with exposure to both backend infrastructure and the end-user experience.
What You'll Work On
Compute Platform & Multi-Cloud Architecture
- Design abstraction layers across cloud providers (AWS, GCP, Azure, bare-metal)
- Build systems that unify compute, storage, and networking across environments
- Expand global compute capacity by integrating with cloud and data center providers
- Architect reusable, composable infrastructure components
Managed Services & Platform Development
- Own services end-to-end (design → deployment → monitoring)
- Build orchestration systems for GPU workloads and container scheduling
- Develop APIs and control planes for provisioning, scaling, and lifecycle management
- Drive improvements in performance, reliability, and cost efficiency
Infrastructure & Platform Services
- Build systems for billing, usage tracking, and cost attribution
- Develop observability tooling (metrics, logging, tracing)
- Establish engineering standards and best practices
- Mentor engineers and contribute to system design decisions
What They're Looking For
Core Requirements
- 4+ years building distributed systems, backend infrastructure, or cloud platforms
- Strong experience with AWS, GCP, or Azure
- Deep understanding of:
  - Compute (VMs, instances)
  - Storage (object, block, file systems)
  - Networking (VPCs, load balancers, security groups)
- Experience with Kubernetes and container orchestration
- Strong programming skills (Golang preferred; Python/Rust a plus)
- Experience building APIs, control planes, or platform services
- Familiarity with databases (Postgres, Redis, etc.) and messaging systems (Kafka, RabbitMQ)
Nice to Have
- GPU orchestration or AI/ML infrastructure experience
- HPC or cluster management (Kubernetes, Slurm)
- Data engineering or large-scale ETL systems
- Systems-level programming (low-level infra, operators, daemons)
- ML platform engineering (training/inference pipelines)
- Experience deploying into enterprise or on-prem environments
Oscar Associates Limited (US) is acting as an Employment Agency in relation to this vacancy.