Data Scientist

London (Central), London (Greater)
£41,386 - £48,414 per annum, including London Weighting Allowance.
01 Jun 2023
End of advertisement period
14 Jun 2023
Academic Discipline
Clinical, Pre-clinical & Health
Contract Type
Fixed Term
Full Time

Job description

The Department of Twin Research & Genetic Epidemiology holds 30 years of data gathered from various sources on over 15,000 participants of its TwinsUK cohort. It is one of the most deeply characterised adult twin cohorts in the world, providing a rich platform for scientists to research health and ageing longitudinally.   

TwinsUK has recently applied for access to the study participants’ health, educational and environmental records so that they can be linked to the vast collection of longitudinal omic and phenotypic data amassed in the past 30 years, leading to a huge and centralised resource of health research data.   

In addition, in response to the COVID-19 pandemic, TwinsUK joined the Government-funded National Core Studies (NCS) programme and became an active member of the UK Longitudinal Linkage Collaboration (UK LLC). As a result, TwinsUK is engaging in a national effort to enable data linkage to study participants’ official health, educational and environmental records.

The postholder will have a pivotal role in receiving, cleaning, harmonising, documenting, storing, and curating these linked records and the data collected by TwinsUK. They will be responsible for creating automated or semi-automated tools to process these linked data and making them available to approved, bona-fide researchers within a Trusted Research Environment (TRE) and in segregated project spaces.   

The postholder will be part of a highly collaborative and inclusive team, working under the supervision of the PI and Head of Data and working closely with the Data Manager. Proficiency in the use of programming tools for data and database manipulation is essential. The postholder will provide high-quality general operations coordination & support to the Data team, the PIs and the wider professional services team in the Department and School.

Applicants will have a strong interest in & knowledge of clinical research data, a proactive attitude and excellent organisational & planning skills.

The postholder will:   

•        Be familiar with ETL techniques in order to receive, extract, manipulate, clean and process raw research data to be harmonised with other datasets and made ready for sharing with researchers  

•        Have high-level expertise in data manipulation tools such as Microsoft Excel, SPSS and Stata, and be able to write macros and scripts

•        Be proficient in the use of programming tools such as Python and R to automate data manipulation tasks  

•        Be able to use the Azure platform and MS SQL Server DBMS to store data and automate data extraction tasks for other data team members  

•        Facilitate the development and implementation of data linkage strategies and processes  

•        Produce descriptions and define metadata for new and existing datasets  

•        Be required to program data extraction from, and data injection into, online study databases designed in REDCap

•        Be familiar with the need for and the usage of Trusted Research Environments (TREs)   

We value your professional growth, and you will have opportunities to attend conferences & training. This post is based at St. Thomas’ Hospital but a hybrid working environment will be offered if required.   

This post will be offered on a fixed-term contract for 18 months.

This is a full-time post (100% full-time equivalent).

Key responsibilities

•              Using ETL techniques, receive and interrogate large amounts of data from different sources, ensuring accuracy and consistency, and store them in the TwinsUK database.

•              Use advanced skills in MS Excel, SPSS and Stata, including writing macros and developing scripting techniques.  

•              Use programming tools such as Python and R to automate data manipulation tasks as required by the wider data team  

•              Use the Azure platform and MS SQL Server DBMS to store data and automate data extraction tasks   

•              Participant and clinic-based data collection – Provide tools to aid data collection, injection and extraction to and from TwinsUK online REDCap databases  

•              Look for data trends and patterns with excellent attention to detail  

•              Help develop procedures to acquire and integrate the official records into the TwinsUK databases as per the guidelines agreed with NHS Digital and the ONS   

•              Help with the administration of data in TREs for researchers. Develop procedures, tools and/or scripts to check export data files and ensure no identifiable data is exported when the researcher requests output of resultant data     

•              Handle personal information, adhering to safe and secure data governance, in line with protocols and current data protection legislation  

•              Identify and gather the metadata for the official health, education and environmental records  

•              Draft & prepare reports and presentations where appropriate  

•              Maintain excellent internal & external working relationships  

•              Promote collaborative work within the data linkage team, the wider department, and other collaborating organisations  

•              Attend wider data linkage meetings and workshops with external collaborators, such as the UK LLC, a collaboration of multiple national cohorts.

•              Travel to other cohorts/research sites for meetings as necessary  

•              Design & deliver presentations and progress reports, including recommendations and conclusions, advising the manager where necessary

•              Be able to adapt communication style according to the given audience, demonstrating comprehension and confidence   

The above list of responsibilities may not be exhaustive, and the post holder will be required to undertake such tasks and responsibilities as may reasonably be expected within the scope and grading of the post.  

Skills, knowledge, and experience 

Essential criteria  

1.      Educated to UG/PG degree level or equivalent relevant experience  

2.      Working knowledge of data manipulation tools, with high-level expertise in the following: MS Excel, SPSS, Stata

3.      Experience in ETL techniques and methods  

4.      Working knowledge of Azure and DBMS such as MS Access, with high-level expertise in MS SQL Server, including writing queries, views and stored procedures

5.      Working knowledge of programming tools such as R and Python for the automation of data processing and analysis tasks  

6.      Experience of using REDCap  

7.      Highly numerate with strong analytical, statistical and problem-solving skills  

8.      Experience of data analysis, with ability to produce, interpret, analyse and present qualitative and quantitative data, using a range of techniques, including visualisation methods that improve understanding of the evidence base  

9.      Understanding of and adherence to data governance and confidentiality legislation and practice  

10.   Excellent written and oral communication skills, including report/protocol writing and presenting, to convey complex information to a non-specialist audience through clear and accessible formats  

11.   Highly personable with experience of working with stakeholders at all levels, using excellent interpersonal skills, providing excellent customer service and building effective networks across complex organisations  

12.   A commitment to equality, diversity and inclusion, actively addressing areas of potential bias  

Desirable criteria

1.      PhD in relevant discipline  

2.      Prior experience in project coordination  

3.      Prior experience in working in Trusted Research Environments (TREs)  

4.      Previous experience in health-related research projects   

5.      Knowledge of data visualisation techniques using software packages, such as Power BI and Tableau  
