We are looking for a Data Engineer to push the boundaries of machine learning and simulation at Amazon scale. As a member of People Experience Technology and Finance, you will help drive improvements in ML forecasting and prediction by delivering state-of-the-art system and model optimization techniques in collaboration with multiple Amazon science and engineering teams. The ideal candidate is passionate about technology, innovation, and customer experience, and is ready to make a lasting impact on ML-driven solutions and business intelligence. You'll be working with talented scientists, engineers, and product and finance managers to innovate on behalf of our customers. If you're fired up about being part of a dynamic, driven team, then this is your moment to join us on this exciting journey and change the world of distributed simulations for population dynamics forecasting and expense planning.
This is a high-impact, high-visibility role in which you will lead the development of applications used by planners and decision makers across Amazon.
Key job responsibilities
- Create scalable data solutions using AWS and/or Amazon internal tools.
- Create and maintain business logic for data transformation and ingestion pipelines.
- Create and maintain datasets in the Amazon data lake and internal data management systems using S3/Glue.
- Investigate and analyze data to understand the impact of changes on downstream customers' use cases, and build transformation logic to ingest data for forecasting data analysis.
- Provide quality data for Amazon downstream data consumers.
A day in the life
As a member of the Finance Forecasting team, you'll play a key role in solving one of the world's most complex technical challenges in data engineering. You will use a large-scale compute platform to build big datasets used in distributed systems for machine learning and statistical analysis. Our Data Engineer needs to be able to gather and understand data requirements, build and maintain big data sources that prepare data for machine learning models, data scientists, and business intelligence engineers, and work with software engineers to achieve high-quality data ingestion and transformation solutions.
Successful candidates should come from a strong data engineering background. You need to have experience with structured data and be able to analyze and transform the data using various tools. Your analytical skills and knowledge of schema, metadata, and data structure in the analytical data world will be essential.
As a data engineer, you will need to design and develop highly scalable ETLs with EMR, Spark, and PyTorch-based applications, as well as support them on Glue ETL or Redshift. Knowledge of big data architecture and design is a must.