We believe that all researchers should share their analytic code. Open sharing provides an unambiguous record of the analytical methods used, aiding reproducibility and error spotting. It also permits efficient re-use by other researchers. This is why the source code for all our projects, including OpenPrescribing, OpenPathology, OpenSAFELY and the FDAAA TrialsTracker, is available on our main GitHub account. In addition, every paper we publish has its code shared and linked to from the page, usually in a Jupyter notebook, where you can see exactly how the code runs against the underlying data. This is all part of our open working ethos, and our other core value: combining best practice from the academic and software development communities.

We want to help others work in this way. We have a range of resources we use to onboard external researchers, and we are currently developing a course on Open Analytic Methods for Health Data Analysis. If you are interested in learning how to use Jupyter notebooks, GitHub and other tools, please get in touch and we will add you to our contact list for the course.

Check out some of our main GitHub repositories below!

OpenPathology

This is the code related to our OpenPathology project. Specifically, this repo stores ad-hoc analyses, papers, and related research. The code for the website (and online tool, when developed) is in its own repository.

Read more about OpenPathology

OpenPrescribing

This is the website code for https://openprescribing.net - a Django application that provides a REST API and dashboards for the English Prescribing Dataset published by the NHS Business Services Authority. Information about data sources used on OpenPrescribing can be found here.

Read more about OpenPrescribing

OpenSAFELY Cohort Extractor

This is the code for the OpenSAFELY cohort extractor tool, which supports the authoring of OpenSAFELY-compliant research by: allowing developers to generate random data based on their study expectations, which they can then use as input data when developing analytic models; supporting downloading of codelist CSVs from the OpenSAFELY codelists repository, for incorporation into the study definition; and providing tools to understand and visualise the properties of real data, without having direct access to it. It is also the mechanism by which cohorts are extracted from live database backends within the OpenSAFELY framework.
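As a rough illustration of generating random data from study expectations, the sketch below draws a dummy column from a stated incidence and set of category ratios. It is not the cohort extractor's own API: the function name `generate_dummy_column` and its parameters are hypothetical, chosen only to show the idea.

```python
import random


def generate_dummy_column(n_patients, incidence, category_ratios, seed=0):
    """Return n_patients random values matching the stated expectations.

    Hypothetical helper, not part of the cohort extractor.
    incidence       -- expected fraction of patients with any value at all
    category_ratios -- dict of category -> relative weight among those patients
    """
    rng = random.Random(seed)  # seeded so the dummy data is reproducible
    categories = list(category_ratios)
    weights = list(category_ratios.values())
    column = []
    for _ in range(n_patients):
        if rng.random() < incidence:
            # Patient has a value: pick a category according to the ratios.
            column.append(rng.choices(categories, weights=weights)[0])
        else:
            column.append(None)  # patient has no recorded value
    return column


dummy = generate_dummy_column(
    1000, incidence=0.8, category_ratios={"M": 0.49, "F": 0.51}
)
print(dummy[:10])
```

Because the generator is seeded, the same expectations always give the same dummy column, which makes analytic code developed against it easier to debug.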

Read more about OpenSAFELY Cohort Extractor

OpenSAFELY Job Runner

This is the repository for the OpenSAFELY job runner. A job runner is a service that encapsulates: the task of checking out an OpenSAFELY study repo; executing actions defined in its project.yaml configuration file when requested via a jobs queue; and storing its results in a particular location. The documentation is aimed at developers looking for an overview of how the system works. It also has some parts relevant for end users, particularly the project.
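The three responsibilities described above can be sketched as a toy loop. This is a minimal sketch under stated assumptions, not the job runner's actual implementation: the function `run_pending_jobs`, the in-memory `repos` dictionary standing in for checked-out study repos, and the `results` dictionary standing in for the results location are all hypothetical.

```python
from collections import deque


def run_pending_jobs(queue, repos, results):
    """Process queued job requests in order and record each action's output.

    All names here are hypothetical stand-ins for illustration.
    queue   -- deque of (repo_name, action_name) requests
    repos   -- dict standing in for checked-out study repos:
               repo_name -> {action_name: callable}
    results -- dict standing in for the results location
    """
    while queue:
        repo_name, action_name = queue.popleft()
        actions = repos[repo_name]        # stand-in for checking out the repo
        output = actions[action_name]()   # execute the requested action
        results[(repo_name, action_name)] = output  # store the result
    return results


# Hypothetical study with two actions, in place of a real project.yaml.
repos = {
    "example-study": {
        "extract": lambda: "cohort.csv",
        "analyse": lambda: "model.txt",
    }
}
queue = deque([("example-study", "extract"), ("example-study", "analyse")])
results = run_pending_jobs(queue, repos, {})
print(results)
```

A real service would check out code from version control, run actions in isolation, and write outputs to disk; the sketch only shows the order of the three steps.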

Read more about OpenSAFELY Job Runner

OpenSAFELY Job Server

This is the code for the OpenSAFELY job server, which mediates jobs that can be run in an OpenSAFELY secure environment. It is a Django app with a simple REST API that provides a channel for communication between low-security environments (which can request that jobs be run) and high-security environments (where jobs are run).

Read more about OpenSAFELY Job Server

COVID-19 TrialsTracker

This repository contains the data cleaning notebook, all necessary datasets, and the code for running the COVID-19 TrialsTracker website at covid19.trialstracker.net. Docker files are included to ensure a consistent environment for reproducibility.

Read more about COVID-19 TrialsTracker

Data Processing and Analysis Code for Results Reporting Trends on ClinicalTrials.gov under FDAAA 2007

This repository contains everything you need to recreate our analysis, published in The Lancet, assessing compliance with the Final Rule of the Food and Drug Administration Amendments Act (FDAAA) 2007. The code can also be easily adapted for future analyses using ClinicalTrials.gov data.

Read more about Data Processing and Analysis Code for Results Reporting Trends on ClinicalTrials.gov under FDAAA 2007

EUCTR Trials Tracker

This repository contains all the data extraction and front-end code for EU Trials Tracker.

Read more about EUCTR Trials Tracker

TrialsTracker

This repository contains all the analysis and front-end code for trialstracker.ebmdatalab.net, a simple application that tracks major trial sponsors with unreported trials on ClinicalTrials.gov.

Read more about TrialsTracker