Senior ML Inference Engineer – GPU Systems / Compiler Optimisation
San Francisco, California, USA
$200,000 – $350,000 DOE + equity
Our client is a fast-growing deep-tech company building next-generation software tools for a highly specialised, precision-critical engineering domain. Their platform leverages advanced machine learning and high-performance computing to dramatically accelerate complex, computationally intensive workflows, delivering order-of-magnitude runtime improvements where accuracy is non-negotiable.
The role
This is a critical role focused on reducing latency, improving throughput, and ensuring ML models and high-performance systems run efficiently at scale. You will work closely with a small, elite engineering team and own the performance layer of a production system handling some of the most demanding computational workloads in the industry.
What you will do
- Analyse model architectures and high-performance pipelines to identify and remove runtime and inference bottlenecks
- Optimise end-to-end GPU pipelines, including custom ML model execution, kernel tuning, and data I/O workflows
- Deploy and scale optimised systems across multi-GPU infrastructure
- Collaborate closely with ML engineers to improve model efficiency using PyTorch, CUDA, and low-level GPU tooling
- Support production readiness with a focus on correctness, reliability, and continuous performance improvement
What you will need
- 5 to 10 years of experience in ML engineering or high-performance systems engineering
- Strong experience optimising production inference or compute-intensive software systems
- Deep knowledge of GPU programming and performance tuning
- Hands-on experience with PyTorch and CUDA
- Experience deploying and scaling systems across cloud and on-premise GPU infrastructure
- Strong Python and systems programming skills (Rust or C++ a plus)
- Background in high-performance or large-scale ML systems
For more information please reach out to [email protected]