Research Associates In Artificial Intelligence For Data Analytics 

London (GB)
21 Sep 2017
End of advertisement period: 23 Oct 2017
Contract Type: Full Time

Opportunity for 2 Research Associates to work on advanced methods from artificial intelligence and machine learning to help automate the data analytics process. These posts will be based at the Alan Turing Institute hub at the British Library in London.

JOB TITLE Research Associate
SALARY £35,000 per annum (negotiable dependent on skills & experience)
HOURS Full time
CLOSES 23rd October 2017

Data analytics, the process of transforming a raw dataset into useful knowledge, can be painstaking and expensive. Often the majority of the effort lies not in building statistical models or training machine learning algorithms, but in all the other tasks that go into data preparation, exploration, and interpretation. The Artificial Intelligence for Data Analytics (AIDA) project at the Alan Turing Institute is an ambitious effort to develop an integrated artificial intelligence system that guides the user through every step of the process, magnifying the productivity of working data scientists.
In the first phase of the AIDA project, we will develop new methods for data ‘wrangling’, an often laborious and time-consuming process that accounts for up to 80% of a typical data science project. Data wrangling includes understanding what data is available, integrating data from multiple sources, identifying missing, messy or anomalous data, and extracting features in order to prepare data for computer modelling.

The AIDA research project aims not only at new advances in artificial intelligence and machine learning to address data wrangling issues, but also at developing systems that help to automate each stage of the data analytics process. It is anticipated that the resulting technology will benefit researchers, industry and government, dramatically improve the productivity of working data scientists, and revolutionise the speed and efficiency with which data can be transformed into useful knowledge.

The AIDA team will consist of five investigators (Chris Williams, James Geddes, Zoubin Ghahramani, Ian Horrocks, Charles Sutton), three research assistants, software engineers, data scientists, and aligned PhD students.
We invite applications from talented and qualified researchers with a strong research background in machine learning, data mining, or semantic technologies to become part of the AIDA research project. These posts will be based at the Alan Turing Institute hub in London.

This is a readvertisement. Unsuccessful candidates from the previous round should not re-apply.


The AIDA project will build an intelligent data analytics system that guides an analyst through a semi-automated process of acquiring, preparing, integrating, transforming, cleaning, and understanding data for analysis. AIDA will aid data preparation by combining technologies from logical artificial intelligence and statistical machine learning. Logic-based AI contributes by providing a powerful set of tools for integrating and representing metadata about data whose complexity and heterogeneity make it impossible to represent as a set of relational tables. Statistical machine learning provides a powerful set of techniques for inferring what is “typical” for a data source, which can underlie new techniques for identifying low-quality data, for evaluating the effect of data transformations, and for summarising data by reporting on typical behaviour.

The two Research Associates (RAs) will take on different aspects of the work programme as described below. They will work as part of the AIDA team, and collaborate with the software engineers in order to create a unified system. These roles are initially on a three-year fixed-term contract.
RA1 will work on Data Acquisition and Transformation. The goal will be to develop a data acquisition component (DAC) that uses semantic technology to support analytics tools in the acquisition and transformation of relevant data from large, distributed, dynamic, and heterogeneous data sources. The DAC will provide comprehensive data restructuring functionality, which will require the extension of semantic technologies with features such as aggregation. We will also investigate the use of data analysis techniques to reveal hidden/lost data structure, and the use of axiomatic data structure to support data understanding, cleaning and analysis.

RA2 will work on Data Understanding, Data Quality, and Cleaning. The main tasks in data understanding will be around helping the data analyst to build their intuition about what is in the data, looking both for common patterns and anomalies. This will include work using techniques from statistical machine learning, deep learning, and related areas to infer compact summaries of datasets, automatically infer the types of variables from data, and produce simple reports describing statistically reliable patterns in data. For data quality and cleaning, important tasks include handling missing data, disambiguating entities, identifying anomalies, and passing on possible uncertainty in these components to later stages of the analysis. We will explore ways of developing new data quality methods, exploiting particularly the methods of statistical machine learning, possibly in combination with semantic technologies.


Launched in November 2015, the Alan Turing Institute is the national institute for data science. Our mission is to make great leaps in data science research to change the world for the better.

The Institute is headquartered at The British Library in London, and brings together researchers from a range of disciplines — mathematics, statistics, computing, engineering and social sciences — from five leading universities and industry partners.


This is an exciting opportunity to join a diverse, growing organisation with a strong team of individuals at the helm.

Further information about the role, duties and responsibilities can be found on the Turing website and in the AIDA person specification.

