
Deep Signal Quarterly – Q3 2025

16th October 2025
By Steve Kilpatrick
Founder & Director
ML Systems and Infrastructure

474 Tracked Repos  |  111,083 Commits  |  9,457 Contributors

THE SIGNAL


Our Q2 predictions: the call that the kernel-compiler hybrid project would cross 50 contributors missed badly (it reached 24); the talent pool is concentrating in fewer hands than expected. The cloud-provider raid on AMD’s kernel team hasn’t materialised yet, but AMD’s contributor base grew 26%, making it increasingly plausible. Agent framework maturation is tracking: commits grew while contributors declined, and agent-adjacent infrastructure (evaluation, guardrails) is absorbing the serious engineering work. One miss, one punt, one on track.

The quarter’s headline: LLM serving pulled 1,896 unique contributors (up 27% QoQ) while training frameworks shed contributors for the first time in our tracking history. That divergence tells you something compensation benchmarks can’t: the engineers who matter most are migrating downstream. Churn in serving dropped 90% QoQ. Contributors who arrived are staying. Mean active weeks sits at 8.3, meaning the average contributor was engaged for more than half the quarter. These aren’t drive-by PRs. The people building serving infrastructure are embedding deeply, and extracting them will require more than a compelling job description.
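The “mean active weeks” figure is straightforward to reproduce from commit timestamps. A minimal sketch under assumed inputs (the 13-week quarter length and the per-contributor date lists are illustrative, not D33P S1GNL’s actual schema):

```python
from datetime import date

QUARTER_WEEKS = 13  # assumed length of the tracking window

def active_weeks(commit_dates: list[date]) -> int:
    """Count distinct ISO weeks in which a contributor committed."""
    return len({d.isocalendar()[:2] for d in commit_dates})

def mean_active_weeks(per_contributor: dict[str, list[date]]) -> float:
    """Average active weeks across all contributors in the quarter."""
    weeks = [active_weeks(dates) for dates in per_contributor.values()]
    return sum(weeks) / len(weeks)

# Hypothetical sample: one contributor active three weeks, one active one week
sample = {
    "alice": [date(2025, 7, 7), date(2025, 7, 14), date(2025, 7, 21)],
    "bob": [date(2025, 8, 4), date(2025, 8, 5)],
}
print(mean_active_weeks(sample))                  # 2.0
print(mean_active_weeks(sample) / QUARTER_WEEKS)  # share of the quarter engaged
```

By this arithmetic, 8.3 mean active weeks over a 13-week quarter is roughly 64% of the quarter, which is where the “more than half” framing comes from.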

This edition tracks 9,457 contributors across 474 tracked repositories and 111,083 commits.

Q-over-Q Snapshot


Serving and kernel categories absorbed the quarter’s growth; training frameworks and compilers held steady while the contributor base quietly redistributed around them.

| Category | Commits | Contributors | Active Repos | Commits QoQ | Contribs QoQ |
| --- | ---: | ---: | ---: | ---: | ---: |
| GPU Kernels & Performance | 11,945 | 1,239 | 62 | +22% | +26% |
| ML Compilers & Graph Optimization | 14,239 | 764 | 34 | +0% | -1% |
| Distributed Training & Parallelism | 5,661 | 547 | 26 | +56% | +11% |
| Inference Runtimes & Engines | 2,658 | 375 | 18 | +16% | -3% |
| LLM Serving & Inference | 14,503 | 1,896 | 59 | +34% | +27% |
| Training Frameworks & Model Architecture | 23,303 | 2,344 | 68 | +7% | -3% |
| ML Platform & Orchestration | 7,972 | 858 | 27 | +12% | +4% |
| Edge & On-Device ML | 10,823 | 579 | 32 | +11% | -4% |
| Model Optimization & Compression | 3,815 | 309 | 22 | +32% | +15% |
| Hardware-Software Co-Design | 11,000 | 1,241 | 24 | +12% | +15% |
| ML Debugging & Tooling | 2,934 | 343 | 17 | +38% | +11% |
| Agent Framework | 2,230 | 359 | 6 | +26% | -4% |
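For readers reconstructing these figures: the QoQ columns are plain period-over-period ratios. A sketch (the Q2 baseline below is back-derived from the published +34% figure, so treat it as approximate):

```python
def qoq_pct(current: int, previous: int) -> int:
    """Quarter-over-quarter change, rounded to the nearest whole percent."""
    return round(100 * (current - previous) / previous)

# Back-derived example: LLM Serving commits grew +34% to 14,503,
# implying a Q2 baseline of roughly 10,823 (hypothetical reconstruction).
print(qoq_pct(14_503, 10_823))  # 34
```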
Top Projects by Contributor Count

What’s Moving


🚀 LLM Serving & Inference

Production serving has become the primary arena where ML systems talent concentrates. The engineering across top projects tilted toward runtime internals, scheduler logic, and kernel-level integration rather than model-loading glue. vllm focused on async scheduling, cache reuse, and batch shaping for bursty workloads. SGLang drew 316 contributors (216 new) into structured generation scheduling and speculative decoding. Both show coordination patterns that correlate with complex, multi-system changes.

Hardware-specific serving forks accumulated substantial commits from dedicated contributors concentrated in model implementation and test infrastructure. Non-NVIDIA serving is no longer experimental. A major cloud provider’s orchestration project pulled 120 contributors into systems-level serving work. Three distinct serving stacks, three hardware targets, three separate talent pools forming. The project-level contributor flow data underneath this number is something we reserve for teams we’re working with on active searches.

โš™๏ธ GPU Kernels & Performance

AMD’s kernel investment crossed a threshold that reshapes GPU kernel hiring. Multiple AMD projects onboarded over 140 new contributors, with engineering concentrated in operator implementations and custom attention kernels. That’s a talent factory. On the NVIDIA side, the engineering shifted toward library-level primitives and Python bindings rather than raw kernel authoring, suggesting a maturing abstraction layer.

One project drew 95 contributors (82 new) into attention kernel work bridging kernel engineering and serving infrastructure. The overlap between kernel and serving contributors hit 104 this quarter; that cross-pollination is where the scarcest talent lives. An emerging project with all-new contributors is building what appears to be LLM-assisted kernel generation tooling. Seven contributors, early days. But the convergence of agent frameworks and kernel engineering is a theme we expect to accelerate.

🧪 Training Frameworks & Model Architecture

The largest category by volume is quietly hollowing out. Commits grew 7% while contributors contracted 3%. Documentation now dominates the file mix across major repos, outpacing core code changes. PyTorch’s engineering focused on compiler integration. TensorFlow’s commits concentrated on compiler backend consolidation. The frameworks are stabilising; the frontier engineering is happening in categories they feed into.

Hugging Face transformers pulled 299 contributors (175 new), but the work skewed toward model integration and template maintenance. The project functions as an onboarding surface, not a site of deep systems engineering. A dedicated RL repository attracted 84 contributors doing genuine distributed training work with high test coverage. That separation validates our prediction: RLHF infrastructure is forking into its own discipline.

🔌 Hardware-Software Co-Design

One major hardware startup expanded faster than any other vendor this quarter. Their runtime project drew 215 contributors (70 new) spanning runtime internals, operator libraries, and test infrastructure. Their compiler project added 78 contributors with tightly coordinated development. Combined, they represent a growing pool of engineers who understand both custom silicon and ML compiler pipelines.

Intel’s presence remained broad, with over 340 contributors across compiler and backend projects. Mean active weeks of 9.4 across the category tells you the engineers who stick around are committed. Contributor growth hit 15% with a 37% churn increase: new people are arriving but the revolving door also spins. Qualifying depth of engagement matters more than ever when sourcing from this pool.
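Churn comes up throughout this report without a published formula. One plausible definition (our assumption, not the engine’s documented metric) is the share of the prior quarter’s contributors who did not return:

```python
def churn_rate(prev: set[str], curr: set[str]) -> float:
    """Share of last quarter's contributors absent this quarter.
    One plausible definition; the report does not specify its formula."""
    return len(prev - curr) / len(prev)

# Hypothetical contributor handles across two quarters
prev_q = {"a", "b", "c", "d"}
curr_q = {"c", "d", "e", "f"}
print(churn_rate(prev_q, curr_q))  # 0.5: half of Q2's contributors left
```

Under this definition, “contributor growth with rising churn” means the incoming cohort is more than replacing the outgoing one, which matches the revolving-door reading above.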

⚡ Distributed Training & Parallelism

The commit surge reflects a build sprint, not a talent influx. A major actor-mesh project led with engineering spanning runtime abstractions, kernel integration, and debugging infrastructure. The breadth of that work makes these contributors unusually versatile hires. The 87% churn increase means engagement windows are compressing; teams hiring distributed systems engineers should move quickly when they identify active contributors.

Quiet Corners


ML Compilers held flat: zero change in commits, zero in contributors. Mean active weeks of 9.5, the highest of any category. Compiler talent isn’t growing; it’s entrenching. Edge & On-Device ML grew commits 11% but shed contributors; backend integration and quantisation-aware inference remain the dominant themes. Model Optimization jumped 32% in commits driven by quantisation project expansion. ML Debugging grew 38% in commits; profiling tools and LLM benchmarking absorbed most energy.

ML Platform grew steadily with the deepest engagement in the dataset. Inference Runtimes ticked up on commits but lost contributors. Agent Framework grew commits 26% while shedding contributors; the experienced base contracted while new arrivals focus on integration work.

Where Talent Is Moving


The strongest overlap runs between compilers and training frameworks: 202 contributors worked across both, reflecting the ongoing compiler-ification of training infrastructure. Kernel engineers are splitting time between serving (104 shared contributors) and hardware co-design (126 shared). The kernel-to-serving pipeline is particularly actionable: the open-source record shows fewer than 110 people this quarter who understand both attention kernel performance and production scheduling constraints.

A new overlap pair emerged between agent frameworks and ML compilers, likely reflecting compiler engineers experimenting with LLM-driven code generation rather than genuine talent migration. Worth watching, not yet hiring-relevant.
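The pairwise overlaps quoted above are straightforward set intersections over contributor identities. A toy sketch with hypothetical handles and category names:

```python
from itertools import combinations

# Hypothetical contributor handles per category
categories = {
    "kernels":   {"a", "b", "c", "d"},
    "serving":   {"c", "d", "e"},
    "compilers": {"a", "e", "f"},
}

# Pairwise overlap: contributors appearing in both categories
overlap = {
    (x, y): len(categories[x] & categories[y])
    for x, y in combinations(categories, 2)
}
print(overlap[("kernels", "serving")])  # 2 (handles "c" and "d")
```

In practice the hard part is identity resolution (one engineer, several emails), which simple set intersection glosses over.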

Talent Migration: Contributor Overlap Between Categories

What This Means If You’re Hiring


Serving engineers with hardware-specific depth are the quarter’s most constrained hire. The 90% churn reduction means the people building serving infrastructure are locked in. They’re averaging over eight active weeks, shipping collaborative changes to scheduler internals. Prying them loose requires understanding what they’re working on at a granular level. Generic “inference at scale” pitches will not land. Staff-level serving specialists command $700K to $1.3M total comp.

Kernel engineers are productive and expanding (26% more contributors), but the growth concentrates in AMDโ€™s projects. If your stack is NVIDIA-only, the addressable pool didnโ€™t grow for you. The kernel-to-serving overlap represents the most versatile systems engineers in ML; hiring from this intersection requires speed and specificity.

Compiler engineers are the steadiest cohort in the dataset: flat counts, highest engagement, minimal churn. The 764 unique contributors represent a mature, stable, and extremely difficult-to-recruit population. ML compiler roles command $250K to $450K+ total comp, with a 30-50% premium that shows no sign of compressing.

If any of these signals are showing up in your own sourcing data, we should compare notes before Q4.

Predictions


  • Q4 2025: SGLang will surpass 400 unique contributors, forcing serving teams to maintain pipelines across two competing stacks rather than treating vllm as the default talent source.
  • Q4 2025: At least one hardware startup’s compiler contributor will surface in a top serving project’s commit history, marking custom-silicon compiler talent flowing into mainstream inference work.
  • By mid-2026: The distributed training pool will bifurcate into “scale-out parallelism” engineers and “actor-mesh runtime” engineers, with less than 15% overlap between the two groups.

Inference is eating the talent pool that training built. The hiring teams that understand the current will outperform those still fishing in last yearโ€™s waters.


This report is powered by D33P S1GNL: a proprietary contributor intelligence engine. For access to the full contributor-level dataset or to discuss ML Systems hiring, contact [email protected]
