Annapurna Labs was a startup acquired by AWS in 2015 and is now fully integrated. If AWS is an infrastructure company, think of Annapurna Labs as the infrastructure provider of AWS. Our org spans silicon engineering, hardware design and verification, software, and operations. We've delivered AWS Nitro, ENA, EFA, Graviton, F1 EC2 instances, AWS Neuron, Inferentia and Trainium ML accelerators, and scalable NVMe storage.
AWS Neuron is the complete software stack for AWS Inferentia and Trainium cloud-scale machine learning accelerators and the Trn1 and Inf1 servers that use them.
We're looking for a Software Development Engineer to help build and evolve machine learning tools that run, optimize, and analyze ML workloads on custom AI accelerators. You'll work across the stack, from infrastructure orchestration to developer-facing tooling, alongside hardware engineers, system architects, and ML researchers both within and outside Amazon.
Key job responsibilities
- Design and implement tooling for profiling, optimization, and resource management of ML workloads on custom accelerators.
- Build high-impact solutions that ship to a large and growing customer base.
- Participate in design discussions, code reviews, and cross-functional collaboration with hardware, software, and customer-facing teams.
- Create metrics, implement automation, and resolve root causes of software defects.
- Work in a startup-like environment where you're always focused on the most important problems.
About the team
This is a high-impact, high-visibility team where your work directly accelerates every Neuron team's ability to ship, effectively multiplying the output of 100+ engineers. We're a small, senior group actively building greenfield capabilities, which means significant design ownership for SDEs and the opportunity to own major components and drive architectural decisions. You'll work at the cutting edge of AI infrastructure, at the intersection of Kubernetes, custom silicon, and large-scale ML workloads.