We are looking for an experienced Software Engineer to join our AI data gathering team and support exciting generative AI projects.
As a Software Engineer on our team, you will be responsible for designing, developing, and maintaining the systems that acquire content for information grounding and for training foundation models. You'll work closely with Science partners on the data requirements for training their models. This position requires creativity, passion, and experience building innovative solutions to complex technical problems.
Key job responsibilities
• Design and develop scalable content acquisition and data extraction systems.
• Build automated data pipelines and insights using big data frameworks (e.g., Spark) to acquire petabytes of data and visualize important KPIs that inform technical direction.
• Optimize our data architecture for scale, low latency, resilience and cost efficiency.
• Implement robust systems to process content and extract meaning.
• Develop data pipelines and infrastructure to support petabyte-scale datasets.
• Work closely with scientists and other engineers to rapidly prototype and deploy new algorithms.
• Write high-quality, well-tested production code in languages such as Python, Java, and Scala, and with frameworks such as Spark.
• Develop data pipelines and infrastructure to support petabyte-scale datasets throughout the ML model development lifecycle, from training to production deployment.
• Work closely with scientists and other engineers to rapidly prototype and deploy new ML models.
• Optimize architecture for performance, scale, resilience and cost efficiency.
• Write high-quality, well-tested production code in languages such as Python, Java, and Scala.
A day in the life
As a Software Engineer, you will lead a team of software engineers and collaborate with applied scientists to develop novel processes for constructing and enhancing structured information retrieval systems, and to enable high-precision, high-recall, low-latency access to knowledge in Web content acquisition systems.
About the team
Our team powers the Amazon customer experiences (CXs) that require state-of-the-art web information retrieval. We enable high-quality, large-scale data ingestion, indexing, and retrieval across a range of AI and customer-facing applications.