OpenSAFELY: The Origin Story

7 May 2021  |  Jessica Morley

On 7th May 2020, the OpenSAFELY Collaborative pre-printed the world's largest study into factors associated with death from COVID-19, based on an analysis running across the full pseudonymised health records of 40% of the English population. This is an unprecedented scale of data. It was only made possible because of a huge collaboration including our group (the DataLab at the University of Oxford), the EHR research group at London School of Hygiene and Tropical Medicine, NHS England, and TPP. Over 42 days during the peak of the first wave of COVID-19, this team worked day and night to produce a fully open-source, privacy-preserving software platform, capable of running open and reproducible analytics across electronic health records, all held securely in situ. Since then the OpenSAFELY platform has expanded to a full-scale analytic environment for secure data analysis, reproducible data curation, federated analysis, and code sharing, with every line of code for the platform, for data management, and for data analysis all shared openly by default, in re-usable forms, automatically, and without exception.


Citing and Crediting Codelists: A discussion for the research community

16 February 2021  |  Caroline Morton, Jessica Morley

This is a draft discussion paper, the first of a series exploring “open team science” approaches to managing health data, and specifically how to create a collaborative computational data science ecosystem where the sharing and re-use of objects such as codelists and code is facilitated, encouraged, recognised, and rewarded. As a microcosm of this we have first explored “codelists”. There are currently no ‘answers’ or preferred solutions given. We will be holding an open discussion with the research community on 2nd March at 3pm - you can book to join us here.


We're Hiring: Platform Lead (Python)

17 December 2020  |  DataLab

Job Title: Platform Lead (Python).
Salary: £48,114 to £55,750 per annum, plus market supplement up to £14,095.
Contract: Fixed-term for 18 months in the first instance.
Hours: Up to full-time, flexible and part-time working acceptable.
Closing Date: noon on Monday 11 January 2021.
Interview Date: 27/28 January 2021
Vacancy ID: 147140.
Apply: > search all jobs > search by "DataLab" or vacancy ID


We're Hiring: Software Developers x 2

17 December 2020  |  DataLab

Job Title: Software Developers (2 posts: Python and Frontend/UI).
Salary: £41,526 - £54,131 pa, with the possibility of a market supplement up to £2,869.
Contract: fixed term 18 months.
Hours: Up to full-time, flexible and part-time working acceptable.
Closing date: noon on Monday 11 January 2021.
Interview date: 27/28th January 2021.
Vacancy ID: 148913.
Apply: > search all jobs > search by "DataLab" or vacancy ID


We're Hiring: Epidemiologist / Health Data Scientist

17 December 2020  |  DataLab

Job Title: Epidemiologist / Health Data Scientist for OpenSAFELY in the DataLab.
Salary Range: £41,526 - £49,553, with discretionary range to £54,131.
Contract: fixed-term for 18 months in the first instance.
Hours: Up to full-time, flexible and part-time working acceptable.
Closing Date: Friday 8th January 2021.
Interview Date: 21 January 2021.
Vacancy ID: 148953
To apply: > search all jobs > search by "DataLab" or vacancy ID


OpenPrescribing Newsletter November 2020

3 December 2020  |  DataLab

We have been very busy since our last newsletter back in July and there are tonnes of exciting updates for you here!

Measure Update: Total Oral Morphine Equivalence
The Faculty of Pain Medicine has recently updated its recommendation on oral morphine equivalence (OME), which we use in our OpenPrescribing measure of OME. We have taken this opportunity to update the measure and develop a new implementation of how we assess OME. Until this work is completed we have taken the decision to “suspend” the measure from dashboards; however, you can still view the old method using this link.


OpenSAFELY Cohort Extractor

7 October 2020  |  DataLab

This is the code for the OpenSAFELY cohort extractor tool which supports the authoring of OpenSAFELY-compliant research, by:

  1. Allowing developers to generate random data based on their study expectations. They can then use this as input data when developing analytic models.

  2. Supporting downloading of codelist CSVs from the OpenSAFELY codelists repository, for incorporation into the study definition
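The first point above, generating random data from study expectations, can be illustrated with a minimal sketch. The expectations format, the column names, and the function `generate_dummy_rows` are all hypothetical and invented for this example; they are not the cohort extractor's own syntax or API.

```python
import random
from datetime import date, timedelta

# Hypothetical expectations format, invented for this sketch; it is not
# the cohort extractor's own study definition syntax.
EXPECTATIONS = {
    "age": {"type": "int", "low": 18, "high": 100},
    "sex": {"type": "category", "ratios": {"F": 0.51, "M": 0.49}},
    "first_positive_test": {
        "type": "date",
        "earliest": date(2020, 2, 1),
        "latest": date(2020, 10, 1),
        "incidence": 0.2,  # share of patients expected to have a value
    },
}


def generate_dummy_rows(expectations, n_rows, seed=0):
    """Return a list of dicts holding random values that match the expectations."""
    rng = random.Random(seed)  # seeded so the dummy data is reproducible
    rows = []
    for patient_id in range(1, n_rows + 1):
        row = {"patient_id": patient_id}
        for name, spec in expectations.items():
            # Columns without an "incidence" key are always populated.
            if rng.random() >= spec.get("incidence", 1.0):
                row[name] = None
            elif spec["type"] == "int":
                row[name] = rng.randint(spec["low"], spec["high"])
            elif spec["type"] == "category":
                labels = list(spec["ratios"])
                weights = list(spec["ratios"].values())
                row[name] = rng.choices(labels, weights=weights, k=1)[0]
            elif spec["type"] == "date":
                span_days = (spec["latest"] - spec["earliest"]).days
                offset = rng.randint(0, span_days)
                row[name] = spec["earliest"] + timedelta(days=offset)
            else:
                raise ValueError(f"unknown expectation type: {spec['type']}")
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in generate_dummy_rows(EXPECTATIONS, 3):
        print(row)
```

Rows like these can stand in for real patient data while an analytic model is being developed, so that no access to the real records is needed at that stage.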


What is OpenSAFELY?

7 October 2020  |  Ben Goldacre, OpenSAFELY Collaborative, Seb Bacon, William Hulme

Working on behalf of NHS England we have now built a full, open source, highly secure analytics platform running across the full pseudonymised primary care records of 24 million people, rising soon to 55 million, 95% of the population of England. We have pursued a new model: for privacy, security, low cost, and near-real-time data access, we have built the analytics platform inside the EHR data centre of the major EHR providers, where the data already resides; in addition we have built software that uses tiered increasingly non-disclosive tables to prevent researchers ever needing direct access to the disclosive underlying data to run analyses; code is developed against simulated data using open platforms before moving to the live data environment. Everything has run smoothly. We are fully live inside TPP; we are signed off with full data access and end-stage tech development for the computational platform with EMIS.


OpenSAFELY Job Runner

7 October 2020  |  DataLab

This is the repository for the OpenSAFELY job runner. A job runner is a service that encapsulates three tasks: checking out an OpenSAFELY study repo; executing actions defined in its project.yaml configuration file when requested via a jobs queue; and storing the results in a particular location.

The documentation is aimed at developers looking for an overview of how the system works. It also has some parts relevant for end users, particularly the project.yaml documentation.
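As a rough illustration of the three tasks described above (check out, execute, store), here is a minimal sketch of a job runner loop. It is not the OpenSAFELY job runner's code: the `PROJECT` dictionary is a hypothetical, already-parsed stand-in for a project.yaml file, and the function names, job format, and output layout are assumptions made for this example.

```python
import subprocess
import sys
from pathlib import Path
from queue import Queue

# Hypothetical, already-parsed stand-in for a study's project.yaml.
# The real file format is described in the project.yaml documentation.
PROJECT = {
    "actions": {
        "say_hello": {
            "run": [sys.executable, "-c", "print('hello from the study')"],
        },
    }
}


def checkout(repo_url, dest):
    """Task 1: check out the study repo (a shallow git clone in this sketch)."""
    subprocess.run(["git", "clone", "--depth", "1", repo_url, str(dest)], check=True)


def run_job(job, project, workdir, output_root):
    """Tasks 2 and 3: execute the requested action, then store its output."""
    action = project["actions"][job["action"]]
    result = subprocess.run(
        action["run"], cwd=workdir, capture_output=True, text=True, check=True
    )
    out_dir = Path(output_root) / job["action"]
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / "stdout.txt"
    out_file.write_text(result.stdout)
    return out_file


def drain_queue(jobs, project, workdir, output_root):
    """Run every queued job in order and return the stored output paths."""
    stored = []
    while not jobs.empty():
        stored.append(run_job(jobs.get(), project, workdir, output_root))
    return stored
```

A caller would check out the repo once, put dictionaries such as `{"action": "say_hello"}` on a `Queue`, and pass that queue to `drain_queue` along with the checkout directory and an output directory.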

OpenSAFELY Job Server

7 October 2020  |  DataLab

This is the code for the OpenSAFELY job server, designed to mediate jobs that can be run in an OpenSAFELY secure environment. The Django app exposes a simple REST API that acts as a channel for communication between low-security environments (which can request that jobs be run) and high-security environments (where jobs are run).
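The mediating role described above can be sketched as a small in-memory model. This is not the job server's Django code or its REST API: the class `JobChannel`, its method names, and the status values are hypothetical, and the assumption that the high-security side pulls work rather than having it pushed is ours, made for illustration.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Job:
    id: int
    repo: str
    action: str
    status: str = "pending"


class JobChannel:
    """Hypothetical in-memory stand-in for the channel between environments."""

    def __init__(self):
        self._jobs = {}  # insertion-ordered, so jobs are claimed first in, first out
        self._ids = count(1)

    def request_job(self, repo, action):
        """Called from the low-security side: ask for an action to be run."""
        job = Job(id=next(self._ids), repo=repo, action=action)
        self._jobs[job.id] = job
        return job.id

    def claim_next(self):
        """Called from the high-security side: take the oldest pending job."""
        for job in self._jobs.values():
            if job.status == "pending":
                job.status = "running"
                return job
        return None

    def report(self, job_id, status):
        """Called from the high-security side once the job has finished."""
        if status not in {"completed", "failed"}:
            raise ValueError(f"unexpected final status: {status}")
        self._jobs[job_id].status = status

    def status(self, job_id):
        """Called from either side: only the status is exposed, never any data."""
        return self._jobs[job_id].status
```

In this model the requester only ever sees a job id and a status string, which mirrors the idea that low-security environments can request work without touching the data held where the jobs run.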

OpenPrescribing July Newsletter

4 August 2020  |  DataLab

OpenPrescribing and DataLab Papers

It has been a busy month for paper publication at The DataLab. We have written a brief description of the most recent papers below. Please share with colleagues and get in touch if you have any relevant observations! Remember you can read all our academic papers related to OpenPrescribing on our research page.


OpenPrescribing June Newsletter

23 June 2020  |  DataLab

Methotrexate Prescribing Safety – New paper in BJGP

This week the British Journal of General Practice published our latest paper on unsafe prescribing of methotrexate. We found that the prevalence of unsafe methotrexate prescribing (10mg tablets) has reduced but remains common, with substantial variation between practices and CCGs. In the paper we also discuss recommendations for better strategies around implementation.


OpenPrescribing Newsletter May 2020

28 May 2020  |  DataLab

OpenSAFELY is a new secure analytics platform for electronic health records in the NHS, created to deliver urgent results during the global COVID-19 emergency. OpenSAFELY is a collaboration between the DataLab, the EHR group at London School of Hygiene and Tropical Medicine, and TPP, who produce SystmOne. OpenSAFELY is now successfully delivering analyses across more than 24 million patients’ full pseudonymised primary care NHS records. The first analysis from OpenSAFELY is “Factors associated with COVID-19-related hospital death in the linked electronic health records of 17 million adult NHS patients”, with more answers to important questions expected shortly.


Impact of COVID-19 on prescribing in English general practice: March 2020

22 May 2020  |  Brian MacKenna

OpenPrescribing has been updated this week with the latest release of prescribing data covering March 2020. In-depth analysis will be needed over the coming months, but this release gives us the first glimpse into the impact that COVID-19 has had on prescribing. At the DataLab we have been quite busy with the new secure analytics platform OpenSAFELY, but the following blog is a rapid analysis of the March prescribing data which others may find helpful to focus their own investigations. As always, all our analytical code is openly available on our GitHub for inspection and reuse by anyone.