Data Engineer

London, England, United Kingdom · Technology


Department: Technology Team

Reporting to: Lead Data Engineer

Type of Contract: Permanent

Working Pattern: Full time

Location: London, Camden. This is not a remote working position, and you must have the right to work in the UK.


FutureLearn is transforming access to education and is one of the world’s largest social learning platforms.

Based in London’s Camden Town, we offer online courses, programs and degrees run by over 100 leading universities from around the world, including UCL, University of California Berkeley, University of Edinburgh, University of Melbourne, King’s College London, Purdue University and University of Groningen. We also offer online courses from specialist education providers including CIPD, UNESCO and the Raspberry Pi Foundation.

Our vision is a global community, where everyone learns together and enjoys access to the education they need to transform their lives. Our award-winning platform helps by provoking conversation around the course material, so that learners and educators learn as much from each other as from the material itself.

Since our launch in September 2013, we’ve run hundreds of courses that have attracted over 7.5 million learners from all over the world, and we’ve seen over 19 million enrolments on our open online courses. We are now working with our partners to pioneer a more modular and accessible approach to studying full degrees.

We are continuing to expand as we move from offering short online courses, through microcredentials, to full online degrees, and as we work with employers and governments to reduce skills gaps. The pace of this change is reflected in the extremely rapid growth of our revenues.


Data engineers at FutureLearn work in the Data Platform Team, collaborating with data scientists, software engineers and analysts across the company.

The Data Platform Team builds and maintains tooling and infrastructure that supports decision-making processes across the business and enables product improvements by providing a complete and consistent view of our business data.

We work in short sprints & regularly share, reflect on and iterate on our work. This helps us focus on shipping small, iterative changes and responding quickly to changing business or user needs.

Our tech stack consists of an ETL process written in Ruby and managed by Airflow. It takes data from our production database (MySQL), SendGrid and our S3 log archives and transforms it into a star schema in Postgres. We currently process about 4 GB of production data and over 10 GB of new log data daily, and our data warehouse contains around 1.5 billion rows. We also manage and maintain a course recommendations service written in Python.
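
For readers unfamiliar with the term, a star schema splits flat source records into dimension tables (who, what) and a fact table that links to them by key. The following is only an illustrative sketch in Python of that idea. It is not FutureLearn's pipeline, which is written in Ruby, and every name in it (`RAW_ENROLMENTS`, `surrogate_key`, `to_star_schema` and the field names) is hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical flat rows, standing in for records read from a source table.
RAW_ENROLMENTS: List[Dict[str, str]] = [
    {"learner": "alice", "course": "Intro to Data", "enrolled_on": "2019-01-05"},
    {"learner": "bob", "course": "Intro to Data", "enrolled_on": "2019-01-06"},
    {"learner": "alice", "course": "Python Basics", "enrolled_on": "2019-02-01"},
]


def surrogate_key(dimension: Dict[str, int], natural_key: str) -> int:
    """Return the integer key for natural_key, assigning the next one if unseen."""
    if natural_key not in dimension:
        dimension[natural_key] = len(dimension) + 1
    return dimension[natural_key]


def to_star_schema(
    rows: List[Dict[str, str]],
) -> Tuple[Dict[str, int], Dict[str, int], List[Dict[str, object]]]:
    """Split flat rows into two dimension maps and one fact table."""
    learner_dim: Dict[str, int] = {}
    course_dim: Dict[str, int] = {}
    facts: List[Dict[str, object]] = []
    for row in rows:
        # Each fact row stores only keys into the dimensions plus its own measures.
        facts.append(
            {
                "learner_key": surrogate_key(learner_dim, row["learner"]),
                "course_key": surrogate_key(course_dim, row["course"]),
                "enrolled_on": row["enrolled_on"],
            }
        )
    return learner_dim, course_dim, facts


learner_dim, course_dim, facts = to_star_schema(RAW_ENROLMENTS)
```

In a real warehouse the dimensions and facts would be database tables rather than in-memory dictionaries, and keys would be assigned by the database.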


What you’ll be doing

As a data engineer at FutureLearn you’ll work in a small agile team, collaborating with other software engineers, data scientists and analysts across the company.

You’ll be managing, monitoring and improving our existing ETL process and data warehouse design; implementing, deploying and improving machine learning models as services in collaboration with data scientists; and finding new and better ways to process our growing volumes of data.

You’ll be comfortable writing modular code and thinking about how your work fits into the big picture, and collaborating with other teams to help them make effective use of data to drive decisions.

You’ll have strong communication skills, and be comfortable discussing problems and solutions with your team-mates. You’ll be asked to give your input & ideas to help make decisions and shape features via planning, story mapping and other product development activities.

You’ll enjoy learning, teaching & sharing your experience with your colleagues in various ways; we encourage code review, pairing, mentoring, giving (and watching!) regular lightning talks, and getting & giving regular feedback.


About you

We're looking for software engineers who are comfortable writing clean, performant and readable SQL, and who can write robust, well-factored code in a general-purpose programming language. We primarily use Ruby and Python, but we are happy to consider applicants whose experience is in other programming languages.

Ideally you’ll have previous experience building, supporting and deploying a data warehouse and ETL pipeline, taking into account performance, security and maintainability.

Above all, we are looking for people who are curious, think critically, are eager to learn and keen to use their experience to help and support others. You will need to be able to communicate and explain things clearly and work well in a collaborative environment.



Please use our online form by pressing 'Apply for this job' below, including your CV and a cover letter telling us why you'd like to come work with us.

How we assess candidates

We use a set of competencies to evaluate candidates throughout the interview process: communication, initiative, teamwork, curiosity and technical skill. You can read more about these in our blog post about our hiring framework.

We’ve also written a post about what to expect from your interviews at FutureLearn, if you’d like to find out more about the next stages of the process.


Please contact [email protected] if you require any reasonable adjustments or alterations to support you through the recruitment process.

FutureLearn is an equal opportunities organisation that values diversity, promotes equality and challenges discrimination. We are especially keen to encourage applications from people currently under-represented, including those from the LGBT+ community, neurodiverse people, people with disabilities, and those from a Black, Asian or Minority Ethnic background.


We value diversity at FutureLearn, and we do not discriminate on the basis of race, religion or belief, gender, sexual orientation, gender reassignment, age, pregnancy, maternity and paternity status, disability status, or marital and civil partnership status.

Apply for this job