Do you enjoy collaborating with teams to solve complex challenges?
Do you have a passion for cutting-edge technologies and tackling system problems?
Join our highly skilled Network SRE team
We build and operate the Network infrastructure powering Akamai's global cloud platform. Our mission is to deliver reliable, scalable, and performant systems that enable customers to run critical workloads with confidence. As part of this team, you'll help ensure reliability at scale, maintaining the availability and resilience of services that billions of people rely on every day.
Partner with the best
In this role, you will lead a global team responsible for the reliability and performance of Akamai's next-generation cloud overlay network. We're looking for a thoughtful leader who blends technical depth with strategic vision and thrives in fast-moving, high-growth environments. You'll shape platform resilience by partnering across functions and championing SRE values by driving operational excellence, automation, and continuous improvement.
As a Director of Site Reliability Engineering - Network, you will be responsible for:
- Leading and mentoring a globally distributed team of Network SREs responsible for operating Akamai's cloud network at scale
- Defining, measuring, and evolving SLOs to align service reliability with customer experience, ensuring we understand and meet expectations before issues become incidents
- Promoting an automation-first culture rooted in SRE values, scaling efforts to reduce toil and improve change safety and speed
- Partnering with Product, Software Engineering, and Infrastructure to influence design, align priorities, clarify roles, and drive efficient delivery with a focus on reliability at scale
- Leading SRE's role in incident management: declaring incidents, acting as executive stakeholder, driving postmortem analysis, and ensuring remediation work is completed to prevent recurrence
Do what you love
To be successful in this role you will:
- Have 10+ years of experience in engineering leadership roles within cloud providers, hyperscalers, or fast-growing technology organizations
- Have 5+ years of experience operating and scaling complex network infrastructure, including routing protocols, overlay networks, configuration management, and large-scale multi-tenant environments
- Have experience leading SWE or SRE teams that support infrastructure, operations, and platform reliability
- Have experience in designing and implementing incident management practices, including on-call rotations, escalation paths, and postmortem processes
- Possess a foundation in systems engineering, with a focus on Linux, open-source infrastructure, and distributed systems
- Possess experience with observability tools like Prometheus and Grafana, and configuration management tools like Salt and Ansible
- Have experience with cross-functional collaboration, including influencing product management, software engineering, hardware, and operations teams to align on priorities and deliver on reliability initiatives
- Have experience with workflow management practices and tools to support planning, tracking, and team collaboration
Work in a way that works for you
FlexBase, Akamai's Global Flexible Working Program, is based on the principles that are helping us create the best workplace in the world. When our colleagues said that flexible working was important to them, we listened. We also know flexible working is important to many of the incredible people considering joining Akamai. FlexBase gives 95% of employees the choice to work from their home, their office, or both (in the country advertised). This permanent workplace flexibility program is consistent and fair globally, to help us find incredible talent, virtually anywhere. We are happy to discuss working options for this role and encourage you to speak with your recruiter in more detail when you apply.
We power and protect life online, by solving the toughest challenges, together.
At Akamai, we're curious, innovative, collaborative and tenacious. We celebrate diversity of thought and we hold an unwavering belief that we can make a meaningful difference. Our teams use their global perspectives to put customers at the forefront of everything they do, so if you are people-centric, you'll thrive here.
About us
Akamai powers and protects life online. Leading companies worldwide choose Akamai to build, deliver, and secure their digital experiences, helping billions of people live, work, and play every day. With the world's most distributed compute platform, from cloud to edge, we make it easy for customers to develop and run applications, while we keep experiences closer to users and threats farther away.