Building a brand spanking new platform based on AWS that will be Advanced Analytics, Data Science and ML ready.
All data from across the business will be ingested into the AWS platform over the next three years.
This enterprise-wide platform will replace legacy systems with cloud capability, transforming the company and allowing it to focus on data-led decision-making.
The platform will enable Data Science as a Service.
The current state of play
The strategy has been defined and the roadmap built but, from a build and delivery perspective, this is greenfield. The main focus is building data pipelines to get legacy data onto the new platform. Once the data is across, algorithms will need to be deployed, but this is a lot further down the line.
Tools, Languages and Tech of the new platform (Experience in all is not necessary)
- The platform is AWS, so S3, Athena, Redshift, Glacier, EMR, EC2 and RDS
- Pipelines built using Spark
- Streaming in Kafka
- Workflow in Airflow
- Languages used are Python, Java, Scala and SQL
The important bit…experience needed from you
From a data engineering perspective, this is the ETL and pipeline building phase so you must have experience building robust data pipelines.
Obviously, if you have experience moving on-prem systems to AWS that would be ideal, but if you have experience moving an on-premises ecosystem to any distributed cloud platform, I want to speak to you as well. In other words, Hadoop (Cloudera, MapR or Hortonworks) engineering/development experience counts too.
I need experience with at least one of Python, Java or Scala, and you must be solid with SQL.
Ideally, you will have come from a software engineering background (could be education, could be commercial) and moved into Data over the past few years but this is not essential.
This is 100% delivery focused. I’m looking for builders and doers. Hands-on people who want to look back in a couple of years and confidently say, “I delivered an AWS-based platform that is ML-ready for a FTSE 100 business, from scratch.” This is a genuinely exciting project for data engineers, so if you would like to hear more about it, please get in touch and I’ll talk you through everything. I’ve been working with this business for a couple of years now and I even placed the hiring manager, so I can really go into detail.