AWS Infrastructure Services (AIS) owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain — and we’re looking for talented people who want to help.
About DC Bridge:
Join the DC Bridge team in pioneering next-generation AI/ML solutions that power AWS's global data center operations. We're building cutting-edge systems that orchestrate physical work processes across AWS's worldwide data centers, directly impacting millions of customers who rely on AWS services.
Position Impact:
You'll be at the forefront of transforming data center operations through AI/ML innovations, developing intelligent systems that optimize technician workflows, automate decision-making processes, and enhance operational efficiency across AWS's global infrastructure.
Required Qualifications:
• 5+ years of software development experience with proven expertise in Python, Java, or equivalent languages
• Strong background in one or more of:
- Frontend development and UI/UX design
- Platform engineering (SDKs/Frameworks)
- ETL and large-scale data processing
- DC telemetry systems
- Machine learning (specializing in anomaly detection, classification, time series analysis)
- Solution architecture and technical advisory
• Hands-on experience with modern ML frameworks and services (TensorFlow, PyTorch, SageMaker, Bedrock)
• Track record of mentoring and technical leadership
• Excellence in problem-solving and communication
Ideal Candidate Profile:
• Thrives in ambiguous environments and adapts quickly to change
• Demonstrates a scrappy mindset with the ability to deliver results in fast-paced settings
• Maintains deep technical expertise while staying customer-focused
• Shows passionate engagement with AI/ML advancements
• Possesses a strong understanding of how AI/ML technologies are applied (LLMs, agents, RAG, ML models)
• Works autonomously and demonstrates deep problem-solving capabilities
• Balances subtle improvements with disruptive innovation when needed
This role offers the opportunity to shape the future of AWS data center operations through innovative AI/ML solutions while working with cutting-edge technologies at unprecedented scale.
Key job responsibilities
• Lead and architect the development of state-of-the-art AI/ML platforms and solutions, serving as a technical leader for both data center operations and engineering teams
• Own end-to-end delivery of technically challenging projects, including scalable ML frameworks, deployment pipelines, and intuitive interfaces for non-ML experts
• Drive operational excellence by building robust data processing pipelines and ETL systems for DC telemetry data, while identifying and addressing operational challenges early
• Mentor and grow junior engineers, acting as a force multiplier by sharing AI/ML expertise and best practices
• Lead cross-functional collaboration efforts to integrate ML solutions into existing DC workflows, while maintaining the highest standards for system extensibility and scalability
• Contribute to and champion improvements in development processes, particularly in the context of ML development and deployment
• Provide technical consultation and architectural guidance to internal customers while insisting on the highest standards for long-term system sustainability
• Design and implement reusable components and tools that enhance team productivity and system reliability
About the team
About AWS
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
Why AWS?
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.
Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of life at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.