Senior ML Inference Engineer – Computer Vision / Medical Imaging
San Francisco, California, USA
$200,000 – $350,000 DOE + equity
Our client is a fast-growing AI company building technology to transform image-based clinical and research workflows. Their platform applies advanced machine learning to complex visual data, helping modernise legacy processes and deliver faster, more scalable outcomes in a highly regulated environment.
The role:
This is a critical role focused on reducing latency, improving throughput, and ensuring models run efficiently across both cloud and edge environments. You will work closely with ML engineers and own the performance layer of a production system handling extremely large image data.
What you will do:
Analyse model architectures to identify and remove inference bottlenecks
Optimise end-to-end pipelines for extremely large image data, including tiling, processing, and stitching workflows
Deploy and scale optimised models across multi-GPU infrastructure and edge hardware
Work closely with ML engineers to improve model efficiency using PyTorch, torch.compile, and TensorRT
Support production readiness in a regulated environment with a focus on compliance and reliability
What you will need:
5 to 10 years of experience in ML engineering or ML inference engineering
Strong experience optimising production inference systems
Deep knowledge of GPU programming and performance tuning
Hands-on experience with PyTorch and TensorRT
Experience deploying models in cloud and edge environments
Strong Python skills
Background in high-performance or large-scale ML systems
For more information please reach out to [email protected]
