Lab Infrastructure and Operations Manager, Annapurna Labs Silicon

Amazon • Austin, TX, US

Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform — and at the heart of it is custom silicon. Annapurna Labs, an AWS organization with development centers in the U.S. and Israel, designs the custom chips (Graviton, Trainium, Inferentia, Nitro) that power millions of customers worldwide. Our team combines cloud-scale innovation with world-class expertise across silicon engineering, hardware design, verification, software, and operations to solve technical challenges no one has tackled before.

We are seeking a Sr. Technical Manager to lead a team of networking development engineers, data center operations technicians, and facility engineers responsible for designing, building, operating, and scaling the critical lab and data center infrastructure that accelerates silicon development and validation. You will own the strategy, design, construction, and ongoing operations of lab environments spanning hundreds of rack positions, megawatts of power capacity, and specialized testing environments including thermal chambers, liquid cooling systems, and high-density compute clusters.

This is a hands-on technical leadership role. You'll drive the physical infrastructure buildout (power distribution, cooling architecture, structured cabling, network fabric design, and environmental monitoring) while managing the operational excellence of a live, production-class lab environment. You'll operate in ambiguous spaces where the problems aren't pre-defined: you'll define goals, build teams, establish processes, and influence stakeholders across hardware, software, facilities, and infrastructure organizations to deliver results that directly impact AWS's ability to ship custom silicon faster.

Key job responsibilities
Data Center & Lab Infrastructure Design and Buildout

Lead end-to-end design and construction of lab and data center environments — power (MW-scale), cooling (air/liquid), structured cabling, and network fabric. Define technical requirements including electrical capacity planning, thermal modeling, rack density, and redundancy architectures. Drive infrastructure decisions across UPS, PDUs, generators, switchgear, chillers, and emerging technologies (liquid cooling, DC distribution). Own full lifecycle from design through commissioning and decommissioning.

Data Center Operations & Network Infrastructure

Lead networking engineers in designing and operating high-performance lab network fabrics (spine-leaf, 400G+). Own operational excellence — availability, capacity management, change management, and incident response. Establish monitoring/alerting across all systems, define SLAs, and drive automation of operational workflows including infrastructure-as-code and predictive maintenance.

Strategic Leadership & Execution

Lead a multi-disciplinary, multi-location team of networking engineers, DC operations, and facility engineers. Define vision and goals working backwards from silicon engineering needs. Attract and develop exceptional talent; build future leaders through coaching and delegation. Manage trade-offs between tactical operations and long-term strategic buildouts.

Process Improvement & Cross-Functional Influence

Establish scalable, repeatable processes for facility operations and lab provisioning. Drive standardization to reduce mean-time-to-provision. Influence senior leaders through written narratives and partner cross-functionally with silicon engineering, hardware design, security, and real estate teams.

A day in the life
No two days look the same. You might start the morning reviewing power and cooling dashboards, analyzing utilization trends across your facilities, and triaging a thermal alert in a high-density rack zone. Mid-morning, you're in a design review with your networking engineers evaluating a new topology for an upcoming lab expansion. After lunch, you're walking the data center floor with your facility engineers, inspecting a new liquid cooling loop installation and reviewing commissioning test results. Later, you lead a cross-functional planning session with silicon and hardware teams on capacity requirements for the next chip program — translating their compute and power needs into concrete infrastructure builds. You close the day with 1:1s focused on career development, a quick sync on a vendor negotiation for critical power equipment, and approving a change request for a weekend network maintenance window.
